XLRD / Python: Reading an Excel file into a dict with for-loops

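I am reading an Excel workbook with 15 fields and about 2000 rows, and converting each row into a dictionary in Python. I then want to append each dictionary to a list. I would like each field in the top row of the workbook to be a key in each dictionary, with the corresponding cell value as the dictionary's value. I have already looked at the examples here and here, but I would like to do something a bit different. The second example would work, but I feel it would be more efficient to loop over the top row once to populate the dictionary keys and then iterate over each row to get the values. My Excel file contains data from a discussion forum and looks something like this (obviously with more columns):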

    id  thread_id  forum_id  post_time   votes  post_text
    4   100        3         1377000566  1      'here is some text'
    5   100        4         1289003444  0      'even more text here'

So I would like the fields id, thread_id, and so on to be the dictionary keys. I would like my dictionaries to look like:

 {id: 4, thread_id: 100, forum_id: 3, post_time: 1377000566, votes: 1, post_text: 'here is some text'}

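Initially, I had code like this to iterate over the file, but the ranges in some of my for-loops are wrong and I am generating far too many dictionaries. Here is my initial code: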

    import xlrd
    from xlrd import open_workbook, cellname

    book = open_workbook('forum.xlsx')
    sheet = book.sheet_by_index(3)

    dict_list = []

    for row_index in range(sheet.nrows):
        for col_index in range(sheet.ncols):
            d = {}

            # My intuition for the below for-loop is to take each cell in the top row of the
            # Excel sheet and add it as a key to the dictionary, and then pass the value of
            # current index in the above loops as the value to the dictionary. This isn't
            # working.

            for i in sheet.row(0):
                d[str(i)] = sheet.cell(row_index, col_index).value
                dict_list.append(d)

Any help would be greatly appreciated. Thanks in advance for reading.

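The idea is to first read the header row into a list. Then iterate over the sheet rows (starting from the row after the header), build a new dictionary from the header keys and the corresponding cell values, and append it to the list of dictionaries: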

    from xlrd import open_workbook

    book = open_workbook('forum.xlsx')
    sheet = book.sheet_by_index(3)

    # read header values into the list
    keys = [sheet.cell(0, col_index).value for col_index in range(sheet.ncols)]

    dict_list = []
    for row_index in range(1, sheet.nrows):
        # one dict per data row: header name -> cell value
        d = {keys[col_index]: sheet.cell(row_index, col_index).value
             for col_index in range(sheet.ncols)}
        dict_list.append(d)

    print(dict_list)

For a sheet containing the following:

    A  B  C  D
    1  2  3  4
    5  6  7  8

it prints:

 [{'A': 1.0, 'C': 3.0, 'B': 2.0, 'D': 4.0}, {'A': 5.0, 'C': 7.0, 'B': 6.0, 'D': 8.0}]
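
xlrd returns every numeric cell as a float, which is why the output shows `1.0` rather than `1`. If integer ids like those in the question are wanted, one option is to convert whole-number floats after building each dictionary. A minimal sketch; the helper name `normalize_number` is made up for illustration and is not part of xlrd, and the sample rows are hand-written:

```python
def normalize_number(value):
    """Return an int for whole-number floats; leave everything else unchanged."""
    if isinstance(value, float) and value.is_integer():
        return int(value)
    return value


# Sample rows shaped like the output above (values as xlrd would return them).
rows = [
    {'A': 1.0, 'B': 2.0, 'C': 3.0, 'D': 4.0},
    {'A': 5.0, 'B': 6.5, 'C': 'text', 'D': 8.0},
]

# Rebuild each row dict with whole-number floats turned into ints.
cleaned = [{key: normalize_number(value) for key, value in row.items()} for row in rows]
print(cleaned)
```

Non-integral floats such as `6.5` and text cells pass through untouched, so the conversion is safe to apply to every column.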

UPD (the dictionary comprehension, expanded):

    d = {}
    for col_index in range(sheet.ncols):
        d[keys[col_index]] = sheet.cell(row_index, col_index).value

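Try this. The function below returns a generator that yields one dictionary per row, mapping each column header to that row's value.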

    from xlrd import open_workbook


    def parse_xlsx():
        workbook = open_workbook('excelsheet.xlsx')
        sheets = workbook.sheet_names()
        active_sheet = workbook.sheet_by_name(sheets[0])
        num_rows = active_sheet.nrows
        num_cols = active_sheet.ncols
        # lower-cased header names become the dictionary keys
        header = [active_sheet.cell_value(0, cell).lower() for cell in range(num_cols)]
        for row_idx in range(1, num_rows):
            row_cell = [active_sheet.cell_value(row_idx, col_idx) for col_idx in range(num_cols)]
            yield dict(zip(header, row_cell))


    # the function must be defined before it is called
    for row in parse_xlsx():
        print(row)
        # {id: 4, thread_id: 100, forum_id: 3, post_time: 1377000566, votes: 1, post_text: 'here is some text'}
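
Because the function is a generator, rows are built only as they are requested, so you can stop early without converting the whole sheet. A small sketch of that behaviour; `FakeSheet` is a hypothetical in-memory stand-in that mimics only the `nrows`, `ncols` and `cell_value` members of an xlrd sheet so the example runs without a workbook, and `iter_rows` is an illustrative variant that takes the sheet as a parameter:

```python
from itertools import islice


class FakeSheet:
    """Hypothetical stand-in exposing the few xlrd sheet members used below."""

    def __init__(self, rows):
        self._rows = rows
        self.nrows = len(rows)
        self.ncols = len(rows[0]) if rows else 0

    def cell_value(self, rowx, colx):
        return self._rows[rowx][colx]


def iter_rows(sheet):
    """Yield one dict per data row, keyed by the lower-cased header row."""
    header = [sheet.cell_value(0, col).lower() for col in range(sheet.ncols)]
    for row_idx in range(1, sheet.nrows):
        values = [sheet.cell_value(row_idx, col) for col in range(sheet.ncols)]
        yield dict(zip(header, values))


sheet = FakeSheet([
    ['ID', 'Thread_ID', 'Votes'],
    [4.0, 100.0, 1.0],
    [5.0, 100.0, 0.0],
])

# Take only the first data row; the second row is never converted.
first_only = list(islice(iter_rows(sheet), 1))
print(first_only)
```

With a real workbook, the object returned by `sheet_by_index` or `sheet_by_name` could be passed to `iter_rows` in place of the stand-in.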

This answer helped me a lot! I had been fiddling with an approach for about two hours. Then I found this elegant and short answer. Thanks!

I needed a way to convert an xls file to JSON using the header keys.

So I modified the script above, replacing the print statement with JSON output, like this:

    from xlrd import open_workbook
    import simplejson as json

    # http://stackoverflow.com/questions/23568409/xlrd-python-reading-excel-file-into-dict-with-for-loops?lq=1
    book = open_workbook('makelijk-bomen-herkennen-schors.xls')
    sheet = book.sheet_by_index(0)

    # read header values into the list
    keys = [sheet.cell(0, col_index).value for col_index in range(sheet.ncols)]
    print("keys are", keys)

    dict_list = []
    for row_index in range(1, sheet.nrows):
        d = {keys[col_index]: sheet.cell(row_index, col_index).value
             for col_index in range(sheet.ncols)}
        dict_list.append(d)

    # print(dict_list)
    j = json.dumps(dict_list)

    # Write to file
    with open('data.json', 'w') as f:
        f.write(j)
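
The same output can be produced with the standard-library `json` module, so `simplejson` is optional. A hedged sketch, assuming `dict_list` has already been built as in the script above (a small hard-coded sample stands in for it here) and writing to a temporary directory rather than to `data.json`:

```python
import json
import os
import tempfile

# Sample data standing in for the dict_list built from the sheet.
dict_list = [
    {'id': 4.0, 'post_text': 'here is some text'},
    {'id': 5.0, 'post_text': 'even more text here'},
]

out_path = os.path.join(tempfile.mkdtemp(), 'data.json')

# json.dump writes straight to the file; ensure_ascii=False keeps
# non-ASCII text readable instead of escaping it as \uXXXX.
with open(out_path, 'w', encoding='utf-8') as f:
    json.dump(dict_list, f, ensure_ascii=False, indent=2)

# Read the file back to confirm it round-trips.
with open(out_path, encoding='utf-8') as f:
    loaded = json.load(f)
print(loaded == dict_list)
```

Writing with `json.dump` avoids holding the whole JSON string in memory before it reaches the file, which can matter for larger sheets.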

Try parsing the header row (all columns) in one function and the data in another, then call them in that order so that the keys are set up first.

    all_fields_list = []
    header_dict = {}


    def parse_data_headers(sheet):
        for c in range(sheet.ncols):
            # here 1 is the row number where your header is
            key = sheet.cell(1, c).value
            # store it somewhere, here I have chosen to store in a dict
            header_dict[c] = key


    def parse_data(sheet):
        # data starts on the row after the header
        for r in range(2, sheet.nrows):
            row_dict = {}
            for c in range(sheet.ncols):
                value = sheet.cell(r, c).value
                row_dict[c] = value
            all_fields_list.append(row_dict)
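
The two functions above key both the header and each row by column index. To get dictionaries keyed by column name, as the question asks, the two structures can be merged afterwards. A sketch with hand-written sample data in the shape those functions produce, assuming plain cell values (not `Cell` objects) were stored; `named_rows` is an illustrative name:

```python
# Column index -> header name, as parse_data_headers would fill it.
header_dict = {0: 'id', 1: 'thread_id', 2: 'votes'}

# One dict per data row, keyed by column index, as parse_data would fill it.
all_fields_list = [
    {0: 4.0, 1: 100.0, 2: 1.0},
    {0: 5.0, 1: 100.0, 2: 0.0},
]

# Swap each column index for its header name.
named_rows = [
    {header_dict[col]: value for col, value in row.items()}
    for row in all_fields_list
]
print(named_rows)
```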

This script converts Excel data into a list of dictionaries:

    import xlrd

    workbook = xlrd.open_workbook('forum.xls', on_demand=True)
    worksheet = workbook.sheet_by_index(0)

    # The row where we store the names of the columns
    first_row = []
    for col in range(worksheet.ncols):
        first_row.append(worksheet.cell_value(0, col))

    # transform the worksheet into a list of dictionaries
    data = []
    for row in range(1, worksheet.nrows):
        elm = {}
        for col in range(worksheet.ncols):
            elm[first_row[col]] = worksheet.cell_value(row, col)
        data.append(elm)

    print(data)

    from xlrd import open_workbook

    dict_list = []
    book = open_workbook('forum.xlsx')
    sheet = book.sheet_by_index(3)

    # read first row for keys
    keys = sheet.row_values(0)

    # read the remaining rows for values
    values = [sheet.row_values(i) for i in range(1, sheet.nrows)]

    for value in values:
        dict_list.append(dict(zip(keys, value)))

    print(dict_list)
