Unable to write extracted items to an Excel file correctly?

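I have written some Python code to parse titles and links from a web page. First I parse the links in the left sidebar, then I follow each link and scrape the documents listed on each page. That part works flawlessly. I am trying to save the documents from the different pages in a single Excel file. The script creates several "sheets", using part of the `heading` variable as each sheet's name. The problem is that when the data is saved, only the last record from each page ends up in its sheet instead of all the records. Here is the script I tried: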

    import requests
    from lxml import html
    from pyexcel_ods3 import save_data

    web_link = "http://www.wiseowl.co.uk/videos/"
    main_url = "http://www.wiseowl.co.uk"

    def get_links(page):
        response = requests.Session().get(page)
        tree = html.fromstring(response.text)
        data = {}
        titles = tree.xpath("//ul[@class='woMenuList']//li[@class='woMenuItem']/a/@href")
        for title in titles:
            if "author" not in title and "year" not in title:
                get_docs(data, main_url + title)

    def get_docs(data, url):
        response = requests.Session().get(url)
        tree = html.fromstring(response.text)
        heading = tree.findtext('.//h1[@class="gamma"]')

        for item in tree.xpath("//p[@class='woVideoListDefaultSeriesTitle']"):
            title = item.findtext('.//a')
            link = item.xpath('.//a/@href')[0]
            # print(title, link)
            data.update({heading.split(" ")[-4]: [[(title)]]})
        save_data("mth.ods", data)

    if __name__ == '__main__':
        get_links(web_link)

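When you call `update` on the `data` dictionary with a key that already exists, the previous value is replaced. Every item on a page uses the same key, so only the last title survives for each sheet.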
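To see the overwrite in isolation, here is a minimal sketch that needs neither the network nor `pyexcel_ods3`. The sheet name and titles are made up for illustration.

```python
# Hypothetical sheet name and titles, used only to illustrate the overwrite.
data = {}
for title in ["Video 1", "Video 2", "Video 3"]:
    # Each call replaces the whole value stored under "Excel".
    data.update({"Excel": [[title]]})

print(data)  # {'Excel': [['Video 3']]}
```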

You can fix this by replacing this line:

    data.update({heading.split(" ")[-4]: [[(title)]]})

with this (it's a bit ugly, but it works):

    data[heading.split(" ")[-4]] = data.get(heading.split(" ")[-4], []) + [[(title)]]

Or, if you want it to be more readable:

    def get_docs(data, url):
        response = requests.Session().get(url)
        tree = html.fromstring(response.text)
        heading = tree.findtext('.//h1[@class="gamma"]')

        for item in tree.xpath("//p[@class='woVideoListDefaultSeriesTitle']"):
            title = item.findtext('.//a')
            sheetname = heading.split(" ")[-4]
            if sheetname in data:
                data[sheetname].append([title])
            else:
                data[sheetname] = [[title]]
        save_data("mth.ods", data)
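The same append-or-create logic can probably be written more compactly with `dict.setdefault`. This is only a sketch: `add_row` is a hypothetical helper name, and the sheet names and titles are invented sample data.

```python
def add_row(data, sheetname, row):
    """Append one row to a sheet, creating the sheet's row list on first use."""
    # setdefault returns the existing list, or stores and returns a new empty one.
    data.setdefault(sheetname, []).append(row)

data = {}
add_row(data, "Excel", ["Video 1"])
add_row(data, "Excel", ["Video 2"])
add_row(data, "Word", ["Video 3"])

print(data)  # {'Excel': [['Video 1'], ['Video 2']], 'Word': [['Video 3']]}
```

Inside `get_docs`, the `if`/`else` would then shrink to a single `add_row(data, sheetname, [title])` call.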

Edit: to put `link` (the variable extracted in the question's loop) into the next column, simply add it to the row list, like this:

    if sheetname in data:
        data[sheetname].append([title, str(link)])
    else:
        data[sheetname] = [[title, str(link)]]
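If the `href` values turn out to be site-relative (an assumption, suggested by the question prepending `main_url` to the menu links), the second column could hold an absolute URL instead. A small sketch, where `make_row` is a hypothetical helper and the title and path are invented:

```python
from urllib.parse import urljoin

main_url = "http://www.wiseowl.co.uk"  # same base URL as in the question

def make_row(title, link):
    """Build one spreadsheet row: title in column 1, absolute link in column 2."""
    # urljoin leaves an already absolute URL unchanged.
    return [title, urljoin(main_url, str(link))]

# Hypothetical title and href, for illustration only.
row = make_row("Example video", "/videos/example.htm")
print(row)  # ['Example video', 'http://www.wiseowl.co.uk/videos/example.htm']
```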

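Edit 2: to get them all on the same sheet, you need to append them to the same key, because in the dictionary passed to `save_data` each key represents a sheet and each value holds that sheet's rows and columns. Like this: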

    sheetname = 'You are welcome'
    for item in tree.xpath("//p[@class='woVideoListDefaultSeriesTitle']"):
        title = item.findtext('.//a')
        if sheetname in data:
            data[sheetname].append([title])
        else:
            data[sheetname] = [[title]]
    save_data("mth.ods", data)
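If the per-sheet dictionary has already been built, one possible alternative is to collapse it into a single sheet afterwards and keep the old sheet name as an extra first column. This is a sketch only: `merge_sheets` is a hypothetical helper, and the sample data is invented to mirror the shape that `save_data` receives.

```python
def merge_sheets(data, sheetname):
    """Collapse a {sheet: rows} dict into one sheet, keeping the old sheet name as column 1."""
    rows = []
    for old_name, old_rows in data.items():
        for row in old_rows:
            rows.append([old_name] + list(row))
    return {sheetname: rows}

# Hypothetical per-sheet data, shaped like the dict passed to save_data.
per_sheet = {"Excel": [["Video 1"], ["Video 2"]], "Word": [["Video 3"]]}
merged = merge_sheets(per_sheet, "All videos")
print(merged)
```

The result has a single key, so passing it to `save_data` should produce one sheet containing every row.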
