Optimizing or speeding up reading from .xy files into Excel

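I have a few .xy files (2 columns with x and y values). I have been trying to read all of them and paste the "y" values into a single Excel file (the "x" values are the same in all of these files). The code I have so far reads the files one by one, but it is extremely slow (about 20 seconds per file). I have quite a few .xy files, so the total time adds up considerably. The code I have so far is: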

    import os,fnmatch,linecache,csv
    from openpyxl import Workbook

    wb = Workbook()
    ws = wb.worksheets[0]
    ws.title = "Sheet1"

    def batch_processing(file_name):
        row_count = sum(1 for row in csv.reader(open(file_name)))
        try:
            for row in xrange(1,row_count):
                data = linecache.getline(file_name, row)
                print data.strip().split()[1]
                print data
                ws.cell("A"+str(row)).value = float(data.strip().split()[0])
                ws.cell("B"+str(row)).value = float(data.strip().split()[1])
                print file_name
                wb.save(filename = os.path.splitext(file_name)[0]+".xlsx")
        except IndexError:
            pass

    workingdir = "C:\Users\Mine\Desktop\P22_PC"
    os.chdir(workingdir)
    for root, dirnames, filenames in os.walk(workingdir):
        for file_name in fnmatch.filter(filenames, "*_Cs.xy"):
            batch_processing(file_name)

Any help is appreciated. Thanks.

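I think your main issue is that you are writing to Excel and saving on every line of the file, for every file in the directory. I am not sure how long it takes to actually write a value into Excel, but simply moving the save out of the loop and saving only once, after everything has been written, should cut the time noticeably. Also, how large are these files? If they are very large, linecache may be a good idea, but assuming they are not too big, you can likely do without it.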

    def batch_processing(file_name):
        # Using 'with' is a better way to open files - it ensures they are
        # properly closed, etc. when you leave the code block
        with open(file_name, 'rb') as f:
            reader = csv.reader(f)
            # row_count = sum(1 for row in csv.reader(open(file_name)))
            # ^^^You actually don't need to do this at all (though it is clever :)
            # You are using it now to govern the loop, but the more Pythonic way is
            # to do it as follows
            for line_no, line in enumerate(reader):
                # csv.reader has already split the line into fields (on commas by
                # default; pass delimiter=... if your columns are separated
                # differently). Unpack the two fields into val1 and val2
                val1, val2 = line
                print val1, val2 # You can also remove this - printing takes time too
                ws.cell("A"+str(line_no+1)).value = float(val1)
                ws.cell("B"+str(line_no+1)).value = float(val2)

        # Doing this here will save the file after you process an entire file.
        # You could save a bit more time and move this to after your walk statement -
        # that way, you are only saving once after everything has completed
        wb.save(filename = os.path.splitext(file_name)[0]+".xlsx")
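Since the stated goal is a single Excel file holding the "y" column of every .xy file, the "save only once" idea can be taken one step further. What follows is only a minimal sketch, written for Python 3 and a current openpyxl rather than the Python 2 code above. It assumes the columns are separated by whitespace (as the `split()` calls in the question suggest) and that every file has the same x values in the same order. The helper names `read_xy`, `build_rows` and `write_xlsx` are made up for this illustration.

```python
import os


def read_xy(path):
    """Read a whitespace-separated two-column file in a single pass."""
    xs, ys = [], []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            xs.append(float(parts[0]))
            ys.append(float(parts[1]))
    return xs, ys


def build_rows(paths):
    """Return a header row followed by rows of [x, y_file1, y_file2, ...]."""
    header = ["x"]
    xs = []
    columns = []
    for path in paths:
        file_xs, ys = read_xy(path)
        if not columns:
            # Assumption: x values are identical in every file,
            # so they are taken from the first file only.
            xs = file_xs
        header.append(os.path.basename(path))
        columns.append(ys)
    rows = [header]
    # zip stops at the shortest column, so files of unequal length are truncated.
    for values in zip(xs, *columns):
        rows.append(list(values))
    return rows


def write_xlsx(rows, out_path):
    """Write all rows to one worksheet and save the workbook exactly once."""
    # Imported here so the parsing helpers above work without openpyxl installed.
    from openpyxl import Workbook

    wb = Workbook()
    ws = wb.active
    for row in rows:
        ws.append(row)  # appends one full row at a time
    wb.save(out_path)
```

With the matching file names collected into a list `paths` (for example by the `os.walk` and `fnmatch.filter` loop from the question), a call such as `write_xlsx(build_rows(paths), "combined.xlsx")` would write everything in one go; the output name here is just a placeholder. Each input file is read once and the workbook is saved once, which is where most of the time in the original code appears to go.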
