Comparing Excel data from two separate sheets with Python / xlrd

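I have two lists that were extracted from two separate Excel workbooks. Each element is itself a list of two elements. The lists represent the data found in the first two columns of each Excel workbook. For example: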

    search_terms = [['term1', 300], ['term2', 400], ['term3', 200], ...]  # words searched on our website with number of hits for each
    item_description = [[900001, 'a string with term1'], [900002, 'a string with term 2'], [900003, 'a string with term 1 and 2'], ...]  # item numbers with matching descriptions

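My goal is to compare the strings in search_terms against the strings in item_description and, for each search term, compile a list of the matching item numbers from item_description. I would then like to rank the top 250 terms, together with their matching item numbers, by the number of hits they generated.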

I generated the two lists with xlrd. I think I want to convert them to tuples and work toward a list similar to the following:

    results = [['term1', 300, 900001, 900003], ['term2', 400, 900002, 900003], ['term3', 200]]  # search term, number of hits, and matching item numbers based on item description

I would then use xlwt to write the item numbers into adjacent columns, so the matches/hits are shown in the parent Excel file.

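I am pretty green when it comes to Python, xlrd, and programming in general. I would appreciate any input and direction, including on how naive my approach is.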

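You are on the right track, but I think what you want here is a dictionary, with the term as the key and a list of values as the value. It would end up looking like this: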

    {
        'term1': [300, 900001, 900003],
        'term2': [400, 900002, 900003],
        'term3': [200],  # are there numbers missing from this one?
    }

Here is what the code for that would look like:

    import re
    from collections import defaultdict

    # words searched on our website with number of hits for each
    search_terms = [['term1', 300], ['term2', 400], ['term3', 200]]
    # item numbers with matching descriptions
    item_description = [[900001, 'a string with term1'],
                        [900002, 'a string with term2'],
                        [900003, 'a string with term1 and term2']]

    d = defaultdict(list)
    for term, hits in search_terms:
        d[term].append(hits)  # the hit count goes first in each list
        rgx = re.compile(re.escape(term))  # escape so the term is matched literally
        for item_number, description in item_description:
            if rgx.search(description):
                d[term].append(item_number)  # record every item whose description contains the term

    print(dict(d))

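A defaultdict checks whether a key already exists in the dictionary and, if it does not, adds it with a default value (here, an empty list). You can then iterate over the dictionary and put the keys into the first column, then iterate over each list and put its values into their own columns. Let me know if this doesn't fit your data structure and I can try to adapt it.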
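For the ranking step in the question, here is a minimal sketch, assuming the dictionary has the shape shown above (hit count first, then item numbers). The names `rank_terms` and `write_rows` are hypothetical helpers I made up for illustration, and `write_rows` only assumes that `sheet` has a `write(row, col, value)` method, which is how an xlwt worksheet is written to.

```python
def rank_terms(matches, limit=250):
    """Return rows [term, hits, item_number, ...] sorted by hits, highest first."""
    rows = [[term] + list(values) for term, values in matches.items()]
    rows.sort(key=lambda row: row[1], reverse=True)  # row[1] is the hit count
    return rows[:limit]


def write_rows(sheet, rows):
    """Write each row across adjacent columns via sheet.write(row, col, value)."""
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            sheet.write(r, c, value)


# Example data in the same shape as the dictionary above
matches = {
    'term1': [300, 900001, 900003],
    'term2': [400, 900002, 900003],
    'term3': [200],
}
ranked = rank_terms(matches)
print(ranked)
```

With xlwt, the sheet would typically come from `xlwt.Workbook()` and its `add_sheet` method, and the workbook's `save` method writes the .xls file afterwards. I have not run this against your workbooks, so treat it as a starting point.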
