Python – Ignoring empty cells in len() with csv.DictReader

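I ran into a problem while comparing rows in CSV files.

I can use len() with csv.reader and it works fine, but then I have to sort the files by key.

My keys are unique, so I would like to use DictReader instead, but len() seems to count every value in the dictionary, including empty cells: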

    with open(baseline, 'r') as baselineF:
        readBaseline = csv.DictReader(baselineF, delimiter=',', quotechar='"')
        for rowb in readBaseline:
            print('rowb: ', len(rowb))
            with open(tested, 'r') as testedF:
                readTested = csv.DictReader(testedF, delimiter=',', quotechar='"')
                for rowt in readTested:
                    print('rowt: ', len(rowt))
                    # Rows are the same len
                    if len(rowb) == len(rowt):
                        writerSameOracle.writerow(rowb)
                        writerSameHPCC.writerow(rowt)
                        print('Rows are the same')
                        break

With this code, even when the rows have the same number of filled cells, len() returns the number of headers in each file.
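A small sketch of why this happens, using made-up in-memory data rather than the files from the question: DictReader builds one dictionary entry per header, so an empty cell arrives as an empty string and a missing trailing field is filled with the default restval, None. In both cases the key is still present, so len() does not change.

```python
import csv
import io

# Hypothetical sample: row 2 has an empty middle cell,
# row 3 is missing its last field entirely.
sample = "key,a,b\n1,x,y\n2,,y\n3,x\n"

rows = list(csv.DictReader(io.StringIO(sample)))

for row in rows:
    # len(row) counts keys (headers), not filled cells.
    print(len(row), dict(row))
```

Each row reports a length of 3 here, even though only the first row has three filled cells.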

It is a little unclear what you are trying to do, but filtering out anything falsy is trivial:

    >>> rowb = [1, 2, 0, 3]
    >>> # using a list comprehension
    >>> len([x for x in rowb if x])
    3
    >>> # alternatively using filter in Python 2
    >>> # (in Python 3, filter returns an iterator, so wrap it in list() first)
    >>> len(filter(None, rowb))
    3
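Applied to a DictReader row in Python 3, the same idea might look like the sketch below. The helper name count_filled and the sample row are illustrative assumptions, not part of the original code. Since csv values are strings, a cell containing "0" is truthy and is still counted; only empty strings and None are dropped.

```python
def count_filled(row):
    """Count the values in a dict row that are truthy (not '' and not None)."""
    return len([value for value in row.values() if value])


def count_filled_with_filter(row):
    """Same count using filter; list() is needed because filter is lazy in Python 3."""
    return len(list(filter(None, row.values())))


# Hypothetical row as DictReader would produce it.
example_row = {"key": "1", "a": "", "b": None, "c": "0"}

print(count_filled(example_row))
print(count_filled_with_filter(example_row))
```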

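So I decided to load the dictionary's values into a list and then take len() of that list. Combined with the appropriate if statement, this does the job.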

    with open(baseline, 'r') as baselineF:
        readBaseline = csv.DictReader(baselineF, delimiter=',', quotechar='"')
        for rowb in readBaseline:
            with open(tested, 'r') as testedF:
                readTested = csv.DictReader(testedF, delimiter=',', quotechar='"')
                for rowt in readTested:
                    if rowt['key'] == rowb['key']:
                        # Keep only the values that are not None.
                        # Note: a cell that is present but empty is '' rather
                        # than None; use "if x" to drop those as well.
                        cleaned1 = [x for x in rowb.values() if x is not None]  # rowb: baseline
                        cleaned2 = [x for x in rowt.values() if x is not None]  # rowt: tested
                        # Rows are the same len
                        if len(cleaned1) == len(cleaned2):
                            writerSameOracle.writerow(rowb)
                            writerSameHPCC.writerow(rowt)
                            print('Rows are the same')
                            break
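Because the keys are unique, one possible variation is to read the tested file once and index it by key, instead of reopening it for every baseline row. This is only a sketch under assumptions: the function names, the in-memory sample data, and the choice to treat both '' and None as empty are illustrative, and the 'key' column name is taken from the code above.

```python
import csv
import io


def count_filled(row):
    """Count cells that are neither None nor an empty string."""
    return sum(1 for value in row.values() if value not in (None, ""))


def matching_rows(baseline_file, tested_file, key="key"):
    """Yield (rowb, rowt) pairs that share a key and a filled-cell count."""
    # Read the tested file once and index its rows by the unique key.
    tested_by_key = {row[key]: row for row in csv.DictReader(tested_file)}
    for rowb in csv.DictReader(baseline_file):
        rowt = tested_by_key.get(rowb[key])
        if rowt is not None and count_filled(rowb) == count_filled(rowt):
            yield rowb, rowt


# Hypothetical in-memory data standing in for the two CSV files.
baseline_csv = io.StringIO("key,a,b\n1,x,y\n2,,y\n")
tested_csv = io.StringIO("key,a,b\n2,z,\n1,x,\n")

pairs = list(matching_rows(baseline_csv, tested_csv))
for rowb, rowt in pairs:
    print(dict(rowb), dict(rowt))
```

With real files, the two open file objects would be passed in place of the StringIO objects, and each yielded pair could be written out with the existing writers.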