Writing Python Selenium output to Excel

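I wrote a script to scrape product information from a website. The goal is to write this information to an Excel file. Because my Python knowledge is limited, the only way I know to export it is with Out-File in PowerShell. The result, however, is that each piece of a product's information is printed on its own line. I would rather have one row per product.

My desired output is shown in the picture. I would prefer the output to look like the second version, but I can live with the first.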



Here is my code:

    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.common.exceptions import NoSuchElementException

    url = "http://www.strem.com/"
    cas = ['16940-92-4','29796-57-4','13569-57-8','15635-87-7']

    for i in cas:
        driver = webdriver.Firefox()
        driver.get(url)
        driver.find_element_by_id("selectbox_input").click()
        driver.find_element_by_id("selectbox_input_cas").click()
        inputElement = driver.find_element_by_name("keyword")
        inputElement.send_keys(i)
        inputElement.submit()

        # Check if a particular element exists; returns True/False
        def check_exists_by_xpath(xpath):
            try:
                driver.find_element_by_xpath(xpath)
            except NoSuchElementException:
                return False
            return True

        xpath1 = ".//div[@class = 'error']"  # element containing error message
        xpath2 = ".//table[@class = 'product_list tiles']"  # element containing table to select product from
        #xpath3 = ".//div[@class = 'catalog_number']"  # when selection is needed, returns the first catalog number

        if check_exists_by_xpath(xpath1):
            print("cas# %s is not found on Strem." % i)
            driver.quit()
        else:
            if check_exists_by_xpath(xpath2):
                catNum = driver.find_element_by_xpath(".//div[@class = 'catalog_number']")
                catNum.click()
                country = driver.find_element_by_name("country")
                for option in country.find_elements_by_tag_name('option'):
                    if option.text == "USA":
                        option.click()
                country.submit()

                name = driver.find_element_by_id("header_description").text
                prodNum = driver.find_element_by_class_name("catalog_number").text
                print(i)
                print(name.encode("utf-8"))
                print(prodNum)

                skus_by_xpath = WebDriverWait(driver, 10).until(
                    lambda driver : driver.find_elements_by_xpath(".//td[@class='size']")
                )
                for output in skus_by_xpath:
                    print(output.text)

                prices_by_xpath = WebDriverWait(driver, 10).until(
                    lambda driver : driver.find_elements_by_xpath(".//td[@class='price']")
                )
                for result in prices_by_xpath:
                    print(result.text[3:])  # To remove last three characters, use :-3

                driver.quit()
            else:
                country = driver.find_element_by_name("country")
                for option in country.find_elements_by_tag_name('option'):
                    if option.text == "USA":
                        option.click()
                country.submit()

                name = driver.find_element_by_id("header_description").text
                prodNum = driver.find_element_by_class_name("catalog_number").text
                print(i)
                print(name.encode("utf-8"))
                print(prodNum)

                skus_by_xpath = WebDriverWait(driver, 10).until(
                    lambda driver : driver.find_elements_by_xpath(".//td[@class='size']")
                )
                for output in skus_by_xpath:
                    print(output.text)

                prices_by_xpath = WebDriverWait(driver, 10).until(
                    lambda driver : driver.find_elements_by_xpath(".//td[@class='price']")
                )
                for result in prices_by_xpath:
                    print(result.text[3:])  # To remove last three characters, use :-3

                driver.quit()

https://pythonhosted.org/openpyxl/tutorial.html

This is a tutorial for openpyxl, a Python library for working with Excel files. There are other libraries, but I like using this one.

    from openpyxl import Workbook
    wb = Workbook()

Then use the methods it provides to write your data, and then

    wb.save(filename)

It's really easy to get started with.
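As a rough sketch of how that could look for this scraper: the snippet below flattens each product into one list and appends it as a worksheet row with `ws.append`. It assumes Python 3 and an installed openpyxl. The helper names `product_row` and `save_rows` and the header labels are made up for illustration; the real values would come from the Selenium calls in the question.

```python
def product_row(cas, name, prod_num, sizes, prices):
    """Flatten one product into a single row: the fixed columns first,
    then each size followed by its price."""
    row = [cas, name, prod_num]
    for size, price in zip(sizes, prices):
        row.extend([size, price])
    return row


def save_rows(rows, filename):
    """Write a header plus one worksheet row per product, then save."""
    # Imported here so product_row can be used without openpyxl installed.
    from openpyxl import Workbook

    wb = Workbook()
    ws = wb.active  # the worksheet created together with the workbook
    ws.append(["CAS", "Name", "Catalog number", "Size / price pairs"])
    for row in rows:
        ws.append(row)  # each call adds one row below the previous one
    wb.save(filename)
```

In the scraping loop, the `print` calls could then be replaced by something like `rows.append(product_row(i, name, prodNum, [o.text for o in skus_by_xpath], [r.text[3:] for r in prices_by_xpath]))`, with a single `save_rows(rows, "products.xlsx")` after the loop (the file name is only an example).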

Here is a PDF tutorial on using xlwt and xlrd, although I haven't really used those modules myself. http://www.simplistix.co.uk/presentations/python-excel.pdf

I usually find that writing to CSV is the safest way to get data into Excel. I use code like the following:

    import time
    import datetime
    from os import fsync

    ts = time.time()  # get the time, to use in a filename
    ds = datetime.datetime.fromtimestamp(ts).strftime('%Y%m%d%H%M')  # format the time for the filename
    f2 = open('OutputLog_' + ds + '.txt', 'w')  # my file is OutputLog_ + the date-time stamp

    # write your text, separating your data with commas and ending each row with a newline
    f2.write(str('Column1DataPoint' + ',' + 'Column2DataPoint') + '\n')

    # if you're running a long loop and want to keep your file up to date
    # with the process, do these two steps in your loop too
    f2.flush()
    fsync(f2.fileno())

    # once the loop is finished and the data is written, close your file
    f2.close()

I think that, for you, the only change needed to the code above is to change the write line to something like this:

    f2.write(str(i+','+name.encode("utf-8")+','+prodNum+','+output.text) + '\n')
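One caveat with joining fields by hand: chemical names often contain commas themselves, which would split a single name across several columns. A possible alternative, sketched here assuming Python 3, is to let the standard-library `csv` module handle the quoting. The function name `write_products`, the file name, the header labels, and the sample row are illustrative only.

```python
import csv


def write_products(rows, filename):
    """Write one CSV line per product; csv.writer quotes any field
    that contains a comma so it stays in a single column."""
    # newline="" lets the csv module control line endings itself.
    with open(filename, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["CAS", "Name", "Catalog number", "Size", "Price"])
        writer.writerows(rows)


# Placeholder data; the name deliberately contains a comma.
sample = [["00-00-0", "Example 1,2-compound", "12-3456", "1g", "10.00"]]
write_products(sample, "products.csv")
```

The `with` block closes the file automatically, so no explicit `close()` call is needed in this version.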