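How do I concatenate three Excel (xlsx) files using Python?

Hello, I would like to concatenate three Excel xlsx files using Python.

I have tried using openpyxl, but I don't know which function could help me append three worksheets into one.

Do you have any ideas how to do that?

Thanks a lot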

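I'd use xlrd and xlwt. Assuming you literally just need to append these files (rather than doing any real work on them), I'd do something like this: open a file to write to with xlwt, and then, for each of your three input files, loop over the data and add each row to the output file. Note that xlrd 2.0 and later only read .xls, so the .xlsx inputs here need an older xlrd, and xlwt always writes the legacy .xls format. To get you started: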

    import xlwt
    import xlrd

    wkbk = xlwt.Workbook()
    outsheet = wkbk.add_sheet('Sheet1')

    xlsfiles = [r'C:\foo.xlsx', r'C:\bar.xlsx', r'C:\baz.xlsx']

    outrow_idx = 0  # next free row in the output sheet
    for f in xlsfiles:
        # This is all untested; essentially just pseudocode for concept!
        insheet = xlrd.open_workbook(f).sheets()[0]  # first sheet only
        for row_idx in range(insheet.nrows):
            for col_idx in range(insheet.ncols):
                outsheet.write(outrow_idx, col_idx,
                               insheet.cell_value(row_idx, col_idx))
            outrow_idx += 1

    wkbk.save(r'C:\combined.xls')

If your files all have a header row, you probably don't want to repeat it, so you could modify the code above to look more like this:

    firstfile = True  # Is this the first sheet?
    for f in xlsfiles:
        insheet = xlrd.open_workbook(f).sheets()[0]
        # Start at row 1 (skipping the header) for every file but the first.
        for row_idx in range(0 if firstfile else 1, insheet.nrows):
            pass  # processing; etc
        firstfile = False  # We're done with the first sheet.
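
Since the question mentions openpyxl, here is a rough sketch of the same append-and-skip-header idea using it. The function names `merge_rows` and `merge_xlsx` are made up for illustration, and the sketch assumes that only the first worksheet of each file matters and that the header, if any, is the first row. Only cell values are copied, not formatting or formulas.

```python
def merge_rows(sheets, has_header=True):
    """Concatenate lists of rows, keeping the header row of the first sheet only.

    `sheets` is a list of sheets, each given as a list of rows.
    """
    merged = []
    for i, rows in enumerate(sheets):
        # Skip row 0 of every sheet after the first when a header is present.
        start = 1 if (has_header and i > 0) else 0
        merged.extend(rows[start:])
    return merged


def merge_xlsx(paths, out_path, has_header=True):
    """Read the first worksheet of each file and write one combined sheet."""
    # Third-party dependency, imported here so merge_rows works without it.
    from openpyxl import Workbook, load_workbook

    sheets = []
    for path in paths:
        wb = load_workbook(path, read_only=True)
        ws = wb.worksheets[0]
        sheets.append([list(row) for row in ws.iter_rows(values_only=True)])
        wb.close()

    out = Workbook()
    out_ws = out.active
    for row in merge_rows(sheets, has_header):
        out_ws.append(row)  # append() writes to the next empty row
    out.save(out_path)
```

With the file names from the code above, a call would look like `merge_xlsx([r'C:\foo.xlsx', r'C:\bar.xlsx', r'C:\baz.xlsx'], r'C:\combined.xlsx')`. Keeping the row logic in `merge_rows` separate from the file handling makes it easy to check without any Excel files at hand.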

Here's a pandas-based approach. (It uses openpyxl behind the scenes.)

    import pandas as pd

    # filenames
    excel_names = ["xlsx1.xlsx", "xlsx2.xlsx", "xlsx3.xlsx"]

    # read them in
    excels = [pd.ExcelFile(name) for name in excel_names]

    # turn them into dataframes
    frames = [x.parse(x.sheet_names[0], header=None, index_col=None) for x in excels]

    # delete the first row for all frames except the first
    # i.e. remove the header row -- assumes it's the first
    frames[1:] = [df[1:] for df in frames[1:]]

    # concatenate them..
    combined = pd.concat(frames)

    # write it out
    combined.to_excel("c.xlsx", header=False, index=False)