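CSV writer keeps overwriting itself

I am trying to create a CSV file that contains a list of URLs.

I'm fairly new to programming, so please excuse any sloppy code.

I have a loop that runs through a list of places to get a list of URLs.

Then, inside that loop, I export the data to a CSV file.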

    import urllib, csv, re
    from BeautifulSoup import BeautifulSoup

    list_of_URLs = csv.reader(open("file_location_for_URLs_to_parse"))
    for row in list_of_URLs:
        row_string = "".join(row)
        file = urllib.urlopen(row_string)
        page_HTML = file.read()
        soup = BeautifulSoup(page_HTML)  # parsing HTML
        Thumbnail_image = soup.findAll("div", {"class": "remositorythumbnail"})
        Thumbnail_image_string = str(Thumbnail_image)
        soup_3 = BeautifulSoup(Thumbnail_image_string)
        Thumbnail_image_URL = soup_3.findAll('a', attrs={'href': re.compile("^http://")})

This is the part that isn't working for me:

    out = csv.writer(open("file_location", "wb"), delimiter=";")
    for tag in soup_3.findAll('a', href=True):
        out.writerow(tag['href'])

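Basically, the writer keeps writing over itself. Is there any way to jump down below the last written line of the CSV and continue writing from there?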

Are you closing the file after each write, or opening it before each write? Just checking.
Also, try using "ab" mode instead of "wb". "ab" appends to the file.
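
As a rough illustration of the append suggestion, here is a minimal sketch. It assumes Python 3, where the csv module wants a text-mode file opened with newline="", so the mode is "a" rather than the "ab" used in the Python 2 code above. The helper names append_rows and read_rows and the example.com URLs are made up for the demo.

```python
import csv
import os
import tempfile


def append_rows(path, rows):
    """Append rows to a semicolon-delimited CSV file, creating it if needed."""
    # "a" adds to the end of the file instead of truncating it.
    # newline="" is what the csv module expects for files it writes.
    with open(path, "a", newline="") as fout:
        csv.writer(fout, delimiter=";").writerows(rows)


def read_rows(path):
    """Read the whole CSV file back as a list of rows."""
    with open(path, newline="") as fin:
        return list(csv.reader(fin, delimiter=";"))


# Hypothetical output location in a fresh temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "links.csv")

# Two separate opens, as would happen once per URL inside the loop.
append_rows(demo_path, [["http://example.com/a"]])
append_rows(demo_path, [["http://example.com/b"]])

appended = read_rows(demo_path)
print(appended)
```

One caveat: in append mode, rows from earlier runs of the script also stay in the file, so the output keeps growing unless the file is removed first.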

Don't put this inside any loop:

    out = csv.writer(open("file_location", "wb"), delimiter=";")

Instead, do this:

    with open("file_location", "wb") as fout:
        out = csv.writer(fout, delimiter=";")
        # put for-loop here

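Notes:

open("file_location", "wb") creates a new file, destroying any old file of the same name. That is why it looks as if the writer is overwriting the old lines.
Use with open(...) as ... because it automatically closes the file when the with-block ends. This makes it explicit when the file is closed. Otherwise the file stays open (and possibly not fully flushed) until out is deleted or reassigned to a new value. That isn't really your main problem here, but with is too useful not to mention.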
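
Both notes can be checked with a small experiment. This is only a sketch, assuming Python 3 (text mode "w" with newline="" in place of "wb"), with a temporary file and made-up example.com URLs standing in for the real output.

```python
import csv
import os
import tempfile

# Hypothetical output location in a fresh temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "out.csv")

# Reopening in write mode on every pass truncates the file each time,
# so only the row from the last pass survives.
for href in ["http://example.com/a", "http://example.com/b"]:
    with open(demo_path, "w", newline="") as fout:
        csv.writer(fout, delimiter=";").writerow([href])

with open(demo_path, newline="") as fin:
    surviving_rows = list(csv.reader(fin, delimiter=";"))

# Leaving the with-block has already closed the last output handle.
file_closed = fout.closed

print(surviving_rows, file_closed)
```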

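The open("file_location", "wb") call, which you are executing once for each URL, wipes out whatever you previously wrote to that file. Move it outside your for loop so that the file is opened only once for all the URLs.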
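
A minimal sketch of that restructuring, assuming Python 3 and leaving out the download and parsing steps: the hypothetical pages list stands in for the hrefs extracted from each URL, and write_links is an illustrative name, not part of the original code. The sketch also wraps each href in a list, because writerow treats its argument as a sequence of fields and would split a bare string into one field per character.

```python
import csv
import os
import tempfile


def write_links(path, pages):
    """Write every href from every page to one CSV file, one link per row."""
    # Open the output once, before the loop over pages.
    with open(path, "w", newline="") as fout:
        out = csv.writer(fout, delimiter=";")
        for hrefs in pages:
            for href in hrefs:
                # writerow expects a sequence of fields, so wrap the
                # single href in a list to keep it in one column.
                out.writerow([href])


# Stand-in for the links scraped from each page (made-up URLs).
pages = [
    ["http://example.com/a", "http://example.com/b"],
    ["http://example.com/c"],
]

demo_path = os.path.join(tempfile.mkdtemp(), "links.csv")
write_links(demo_path, pages)

with open(demo_path, newline="") as fin:
    written = [row[0] for row in csv.reader(fin, delimiter=";")]
print(written)
```

Because the file is opened in write mode exactly once per run, each run starts from an empty file and the rows from all pages end up together.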
