Python csv: format all rows into one line

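I have a CSV file and I would like to get all the rows into one column. I have tried importing it into MS Excel and formatting it with Notepad++. However, every attempt treats part of a record as a new row. How can I use Python's csv module to format the file so that it removes the string "BRAS" and corrects the format? Each row is enclosed in double quotes and the delimiter is a pipe (|). Update:

    "aa|bb|cc|dd|
    ee|ff"
    "ba|bc|bd|be|
    bf"
    "ca|cb|cd|
    ce|cf"

The above is supposed to be 3 rows, but my editors see it as 5 or 6 rows, and so on.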

    import csv
    import fileinput

    with open('ventoya.csv') as f, open('ventoya2.csv', 'w') as w:
        for line in f:
            if 'BRAS' not in line:
                w.write(line)

NB: I get a Unicode error when I try to run this in Python.

        return codecs.charmap_decode(input,self.errors,decoding_table)[0]
    UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 18: character maps to <undefined>
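A note on the error above: the 'charmap' codec in the message suggests that Python 3 opened the file with a platform default code page (cp1252 is a common default on Windows, and byte 0x8f is undefined in it). A minimal sketch of one way around it is to pass an explicit encoding to `open()`. The function name `filter_lines` is hypothetical, and `utf-8` is only an assumption here, since the real encoding of ventoya.csv is not known from the post.

```python
def filter_lines(src_path, dst_path, needle="BRAS", encoding="utf-8"):
    """Copy src_path to dst_path, dropping every line that contains needle.

    encoding="utf-8" is an assumption; use the file's real encoding if known.
    errors="replace" turns undecodable bytes into U+FFFD instead of raising
    UnicodeDecodeError.
    """
    with open(src_path, encoding=encoding, errors="replace", newline="") as src, \
         open(dst_path, "w", encoding=encoding, newline="") as dst:
        for line in src:
            if needle not in line:
                dst.write(line)
```

Usage would be `filter_lines('ventoya.csv', 'ventoya2.csv')`. Replaced characters are lost, so this is a way to get past the exception rather than a substitute for finding the correct encoding.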

Here is a quick hack for a small input file (the whole content is read into memory).

    #!python2
    fnameIn = 'ventoya.csv'
    fnameOut = 'ventoya2.csv'

    with open(fnameIn) as fin, open(fnameOut, 'w') as fout:
        data = fin.read()                # content of the input file
        data = data.replace('\n', '')    # make it one line
        data = data.replace('""', '|')   # split char instead of doubled ""
        data = data.replace('"', '')     # remove the first and last "
        print data
        for x in data.split('|'):        # split by bar
            fout.write(x + '\n')         # write to separate lines

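Or, if the goal is only to fix the extra (unwanted) line breaks so that the result is a single-column CSV file, you can fix the file first and then read it through the csv module: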

    #!python2
    import csv

    fnameIn = 'ventoya.csv'
    fnameFixed = 'ventoyaFixed.csv'
    fnameOut = 'ventoya2.csv'

    # Fix the input file.
    with open(fnameIn) as fin, open(fnameFixed, 'w') as fout:
        data = fin.read()                   # content of the file
        data = data.replace('\n', '')       # remove the newlines
        data = data.replace('""', '"\n"')   # add the newlines back between the cells
        fout.write(data)

    # It is an overkill, but now the fixed file can be read using
    # the csv module.
    with open(fnameFixed, 'rb') as fin, open(fnameOut, 'wb') as fout:
        reader = csv.reader(fin)
        writer = csv.writer(fout)
        for row in reader:
            writer.writerow(row)

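To solve this you don't even need to write code.
1: Open the file in Notepad++.
2: In the first line, select from the | symbol to the start of the next line.
3: Go to Replace and replace the selected pattern with |.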

The search mode can be either Normal or Extended :)
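For comparison, the same find/replace can be sketched in Python with a regular expression. This assumes, as in the sample data, that every unwanted line break comes directly after a pipe; the function name `join_broken_records` is made up for this illustration.

```python
import re


def join_broken_records(text):
    """Remove line breaks that directly follow a pipe, plus any indentation
    on the continuation line, so that each quoted record sits on one line."""
    return re.sub(r"\|\r?\n[ \t]*", "|", text)


sample = '"aa|bb|cc|dd|\nee|ff"\n"ba|bc|bd|be|\nbf"\n'
print(join_broken_records(sample))
```

Line breaks between records are kept, because they follow a closing quote rather than a pipe.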

Well, since the line breaks are consistent, you could do a find/replace as suggested, but you can also do a quick conversion with your Python script:

    import csv
    import fileinput

    linecount = 0
    with open('ventoya.csv') as f, open('ventoya2.csv', 'w') as w:
        for line in f:
            line = line.rstrip()
            # remove unwanted breaks by concatenating pairs of rows
            if linecount%2 == 0:
                line1 = line
            else:
                full_line = line1 + line
                full_line = full_line.replace(' ','')
                # remove spaces from front of 2nd half of line
                # if you want comma delimiters, uncomment next line:
                # full_line = full_line.replace('|',',')
                if 'BRAS' not in full_line:
                    w.write(full_line + '\n')
            linecount += 1

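This works on my test data, and you can change the delimiter when writing the file if you want. The benefits of using code are: 1. you get to use code (always fun), and 2. you can remove the line breaks and filter the content written to the file at the same time.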
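Another option, sketched here for Python 3, is to let the csv module deal with the quoting itself: `csv.reader` returns a quoted field whole, embedded line breaks included. The sketch assumes that each record is exactly one double-quoted field, and that "removing BRAS" means dropping every record that contains it, as the script in the question does. The names `clean_records` and `write_records` are hypothetical.

```python
import csv
import io


def clean_records(src, needle="BRAS"):
    """Yield one list of cells per quoted record, skipping records with needle.

    src is any iterable of lines, for example a file opened with newline="".
    """
    for row in csv.reader(src):
        if not row:
            continue  # blank line between records
        # Assumption: the whole record is the first (and only) quoted field.
        record = row[0].replace("\r", "").replace("\n", "")
        if needle in record:
            continue
        yield [cell.strip() for cell in record.split("|")]


def write_records(records, dst, delimiter="|"):
    """Write the cleaned records, one per line, with the chosen delimiter."""
    writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
    writer.writerows(records)


# Small in-memory demonstration built from the sample data in the question.
sample = io.StringIO('"aa|bb|cc|dd|\nee|ff"\n"ba|BRAS|bd|be|\nbf"\n"ca|cb|cd|\nce|cf"\n')
out = io.StringIO()
write_records(clean_records(sample), out)
print(out.getvalue())
```

With real files, both would be opened with `newline=""` as the csv documentation recommends, and passing `delimiter=","` to `write_records` gives a comma-separated file that Excel can open directly.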
