Merge two tables (CSV) if table1 column A == table2 column A
I have two CSVs, which I can open in Numbers or Excel, structured as:
| word | num1 |
and
| word | num2 |
If the two words are equal (for example, both are "hi"), I would like the row to become:
| word | num1 | num2 |
For example, in the first row both words are the same, "TRUE", so I would like it to become something like:
| TRUE | 5.371748 | 4.48957 |
This could be done either with a small script or with some built-in feature or function that I have overlooked.
Thanks!
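Use a dictionary:

    import csv

    # In Python 3, open CSV files in text mode with newline=''
    with open('file1.csv', newline='') as file_a, open('file2.csv', newline='') as file_b:
        data_a = csv.reader(file_a)
        # Map each word in file2 to its number (assumes exactly two columns per row)
        data_b = dict(csv.reader(file_b))
        with open('out.csv', 'w', newline='') as file_out:
            csv_out = csv.writer(file_out)
            for word, num_a in data_a:
                # Leave num2 blank when the word is missing from file2
                csv_out.writerow([word, num_a, data_b.get(word, '')])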
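If only the words that appear in both files should be kept, as the question seems to ask, the same dictionary lookup can simply skip the unmatched rows. A minimal sketch, assuming every row has exactly two columns and there is no header row; the function name `merge_matching` and the in-memory sample data are made up for illustration:

```python
import csv
import io


def merge_matching(file_a, file_b):
    """Return [word, num1, num2] rows for words found in both CSV inputs."""
    # Build a word -> num2 lookup from the second file
    lookup = dict(csv.reader(file_b))
    merged = []
    for word, num_a in csv.reader(file_a):
        if word in lookup:  # skip words that only appear in the first file
            merged.append([word, num_a, lookup[word]])
    return merged


# Hypothetical sample data standing in for file1.csv and file2.csv
sample_a = io.StringIO("TRUE,5.371748\nhi,1.5\n")
sample_b = io.StringIO("bye,2.0\nTRUE,4.48957\n")
rows = merge_matching(sample_a, sample_b)
print(rows)
```

Because the lookup is keyed on the word, the two files do not need to be in the same order.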
Alternatively, for CSV files I always reach for the data analysis library pandas (http://pandas.pydata.org/):
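
    import pandas as pd

    # names= assumes the files have no header row
    df1 = pd.read_csv('file1.csv', names=['word', 'num1'])
    df2 = pd.read_csv('file2.csv', names=['word', 'num2'])
    # Default inner join: keep only words present in both files
    df3 = pd.merge(df1, df2, on='word')
    # index=False avoids writing the row index as an extra column
    df3.to_csv('merged_data.csv', index=False)
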
I think what you are looking for is zip, which lets you iterate over the two CSVs in lockstep:
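
    import csv

    # In Python 3, open CSV files in text mode with newline=''
    with open('file1.csv', newline='') as f1, open('file2.csv', newline='') as f2:
        r1, r2 = csv.reader(f1), csv.reader(f2)
        with open('out.csv', 'w', newline='') as fout:
            w = csv.writer(fout)
            # zip pairs row N of file1 with row N of file2
            for row1, row2 in zip(r1, r2):
                if row1[0] == row2[0]:
                    w.writerow([row1[0], row1[1], row2[1]])
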
If they are not equal, I am not sure what you want to happen. Maybe write out both rows, like this?

                else:
                    w.writerow([row1[0], row1[1], ''])
                    w.writerow([row2[0], '', row2[1]])
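
One thing to keep in mind: zip pairs rows by position, so this approach only finds matches when both files list the words in the same order, and it stops at the end of the shorter file. A small sketch of that behaviour, using made-up in-memory data and a hypothetical helper named `merge_lockstep`:

```python
import csv
import io


def merge_lockstep(file_a, file_b):
    """Merge rows pairwise by position, as in the zip-based loop."""
    merged = []
    for row1, row2 in zip(csv.reader(file_a), csv.reader(file_b)):
        if row1[0] == row2[0]:
            merged.append([row1[0], row1[1], row2[1]])
        else:
            # Keep both rows, leaving the missing number blank
            merged.append([row1[0], row1[1], ''])
            merged.append([row2[0], '', row2[1]])
    return merged


# Same order in both inputs: the rows line up and merge
aligned = merge_lockstep(io.StringIO("hi,1\nbye,2\n"),
                         io.StringIO("hi,3\nbye,4\n"))

# Different order: no positions match, so nothing is merged
shuffled = merge_lockstep(io.StringIO("hi,1\nbye,2\n"),
                          io.StringIO("bye,4\nhi,3\n"))
print(aligned)
print(shuffled)
```

If the files may be ordered differently, a lookup keyed on the word is probably a safer fit than lockstep iteration.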