How do I extract data from one Excel workbook and output it to another using Python xlrd/xlwt?

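I am trying to write a script that copies employee timesheets from multiple files into a single compiled file. Because these are timesheets with project codes, some cells are left blank on days when an employee worked on a different project. Also, the files have been converted from xlsx (2007) to .csv.xls, which xlrd seems to open just fine.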

I know how to open a workbook and create a book object, but my knowledge of this module is very limited, so I thought a general algorithm might help:

    import xlrd, xlwt
    # put all of the following in a for or while loop to iterate through files:
        book = xlrd.open_workbook('mybook.csv.xls')
        # extract data; store data for output
        # use a for loop to iterate over the data, output to the final sheet
    # open the next file, repeat the process, storing each output below the previous

I am looking for anything that helps me find the answer, not just code. Any help would be appreciated. Thanks.

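This may help. It copies your data as faithfully as possible: dates stay dates, empty cells do not become zero-length text cells, and boolean and error cells do not become number cells.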

    from xlrd import (
        XL_CELL_EMPTY, XL_CELL_TEXT, XL_CELL_NUMBER, XL_CELL_DATE,
        XL_CELL_BOOLEAN, XL_CELL_ERROR, open_workbook,
    )
    from xlwt import Row, easyxf, Workbook

    # Map each xlrd cell type to the xlwt Row method that writes it.
    method_for_type = {
        XL_CELL_TEXT: Row.set_cell_text,
        XL_CELL_NUMBER: Row.set_cell_number,
        XL_CELL_DATE: Row.set_cell_number,
        XL_CELL_ERROR: Row.set_cell_error,
        XL_CELL_BOOLEAN: Row.set_cell_boolean,
    }

    date_style = easyxf(num_format_str='yyyy-mm-dd')
    other_style = easyxf(num_format_str='General')

    def append_sheet(rsheet, wsheet, wrowx=0):
        # Copy every row of rsheet into wsheet, starting at row wrowx.
        for rrowx in range(rsheet.nrows):
            rrowvalues = rsheet.row_values(rrowx)
            wrow = wsheet.row(wrowx)
            for rcolx, rtype in enumerate(rsheet.row_types(rrowx)):
                if rtype == XL_CELL_EMPTY:
                    continue  # leave empty cells empty
                wcolx = rcolx
                wmethod = method_for_type[rtype]
                wstyle = date_style if rtype == XL_CELL_DATE else other_style
                wmethod(wrow, wcolx, rrowvalues[rcolx], wstyle)
            wrowx += 1
        # Return the next free output row so the caller can keep appending.
        return wrowx

    if __name__ == '__main__':
        import sys, glob
        rdpattern, wtfname = sys.argv[1:3]
        wtbook = Workbook()
        wtsheet = wtbook.add_sheet('guff')
        outrowx = 0
        for rdfname in glob.glob(rdpattern):
            rdbook = open_workbook(rdfname)
            rdsheet = rdbook.sheet_by_index(0)
            outrowx = append_sheet(rdsheet, wtsheet, outrowx)
            print(outrowx)
        wtbook.save(wtfname)
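A note on why date cells are written with set_cell_number plus a date format: xlrd hands back a date cell as a plain float (an Excel serial day count), and only the cell type marks it as a date. xlrd's own xldate_as_tuple(value, book.datemode) converts such a value for you. The sketch below only illustrates the underlying arithmetic with the standard library, assuming the workbook uses the 1900 date system; the helper name serial_to_datetime is made up for this example.

```python
from datetime import datetime, timedelta

# Assumption: the workbook uses Excel's 1900 date system (datemode 0 in xlrd).
# Day 0 of that system, valid for serial numbers from 1 March 1900 onward.
EXCEL_1900_EPOCH = datetime(1899, 12, 30)


def serial_to_datetime(serial):
    """Convert an Excel serial date (float) to a datetime (hypothetical helper)."""
    return EXCEL_1900_EPOCH + timedelta(days=serial)


# The fractional part is the time of day: .5 is noon.
converted = serial_to_datetime(43831.5)
print(converted)
```

Because the float is copied unchanged, the date survives the round trip as long as the output cell carries a date number format, which is what date_style provides.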

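I am building a class called excelFunctions for xlutils, xlrd, and xlwt, which I may eventually turn into a library. If you are interested in helping, I am currently trying to write a function that deletes a worksheet.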

You may want to move to openpyxl and/or pyexcel instead, as they are simpler and already have this functionality.

Here is how to copy with openpyxl: "Copy whole worksheet with openpyxl"
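As a rough illustration of the openpyxl route (not taken from the linked answer): openpyxl's Workbook.copy_worksheet only copies a sheet within the same workbook, so compiling several workbooks into one means reading the values and appending them. The sketch below assumes openpyxl is installed and uses small in-memory workbooks as stand-ins for real files; the function name append_rows and the sample data are invented for the example. Note that openpyxl reads .xlsx files, so this would apply to the original xlsx timesheets rather than the converted .xls ones.

```python
from openpyxl import Workbook


def append_rows(source_ws, target_ws):
    """Append every row of source_ws below the existing rows of target_ws."""
    for row in source_ws.iter_rows(values_only=True):
        # Blank cells come back as None and stay blank when appended.
        target_ws.append(row)


# Hypothetical in-memory timesheets; real files would be opened with
# openpyxl.load_workbook("some_file.xlsx") instead.
first = Workbook()
first.active.append(["Alice", "P100", 8])
first.active.append(["Alice", "P200", None])

second = Workbook()
second.active.append(["Bob", "P100", 7.5])

# Stack each source sheet below the previous one in the compiled sheet.
compiled = Workbook()
for book in (first, second):
    append_rows(book.active, compiled.active)

rows = list(compiled.active.iter_rows(values_only=True))
print(rows)
```

Worksheet.append always writes below the last used row, so no manual row counter is needed; compiled.save("compiled.xlsx") would then write the result to disk.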

Here is the documentation for pyexcel, which is a wrapper around xlwt, xlrd, and xlutils: https://pyexcel.readthedocs.io/en/latest/

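If you want to extract data from one Excel workbook and output it to another, use createCopy(original workbook file, other workbook file, original sheet name, new sheet name):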

    import xlrd
    from xlutils.copy import copy as xl_copy

    class excelFunctions():

        def getSheetNumber(self, fileName, sheetName):
            # open the existing workbook
            workbook = xlrd.open_workbook(fileName, on_demand=True)
            # turn the sheet name into a sheet index (None if it is not found)
            for index, sheet in enumerate(workbook.sheet_names()):
                if sheet == sheetName:
                    return index

        def createSheet(self, fileName, sheetName):
            # open the existing workbook and make a writable copy of it
            rb = xlrd.open_workbook(fileName, formatting_info=True, on_demand=True)
            wb = xl_copy(rb)
            # add the sheet only if it does not exist yet, then save under the same name
            if sheetName not in rb.sheet_names():
                wb.add_sheet(sheetName)
                wb.save(fileName)

        def createCopy(self, fileName, fileName2, sheetName, sheetName2):
            # make sure the destination sheet exists before the workbook is copied
            self.createSheet(fileName, sheetName2)
            # open the existing workbook and get the source sheet by name
            rb = xlrd.open_workbook(fileName, formatting_info=True)
            sheet = rb.sheet_by_name(sheetName)
            # make a writable copy of the workbook
            wb = xl_copy(rb)
            # collect the values of every column of the source sheet
            Stuff = []
            for i in range(sheet.ncols):
                Stuff.append([sheet.cell_value(row, i) for row in range(sheet.nrows)])
            # get the destination sheet from the writable copy
            sheet = wb.get_sheet(self.getSheetNumber(fileName, sheetName2))
            # write the collected values to the destination sheet
            for colidx, col in enumerate(Stuff):
                for rowidx, row in enumerate(col):
                    sheet.write(rowidx, colidx, row)
            # save the result under the new file name
            wb.save(fileName2)