python – How to handle "old" dates when transferring data to Excel
I have a DataFrame in which one column contains date strings. I first convert it to datetime:
mydf['Desk Date'] = pd.to_datetime(mydf['Desk Date'])
Then I write the DataFrame to Excel using xlwings:
Range('A1').value = mydf
I get the following error:
Traceback (most recent call last):
  File "C:\Program Files (x86)\Python271\lib\site-packages\IPython\core\interactiveshell.py", line 3035, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-111-6c6f5ea1ff17>", line 1, in <module>
    Import.ImportFWD(test_path)
  File "C:\Users\jastrzem\Downloads\pyWFP\Import.py", line 42, in ImportFWD
    Range('A1').value = mydf
  File "C:\Program Files (x86)\Python271\lib\site-packages\xlwings\main.py", line 818, in value
    self.row1, self.col1, row2, col2), data)
  File "C:\Program Files (x86)\Python271\lib\site-packages\xlwings\_xlwindows.py", line 151, in set_value
    xl_range.Value = data
  File "C:\Program Files (x86)\Python271\lib\site-packages\win32com\client\dynamic.py", line 560, in __setattr__
    self._oleobj_.Invoke(entry.dispid, 0, invoke_type, 0, value)
com_error: (-2147352567, 'Exception occurred.', (0, None, None, None, 0, -2146827284), None)
One of the dates is Timestamp('1899-01-31 00:00:00'), which I think is the cause of the error.
I tried using np.where to replace all values before the year 2000 with NaN, but with no luck.
f = lambda x: x.year
mydf['Desk Date'] = np.where(pd.DataFrame(mydf['Desk Date']).applymap(f) > 2000, pd.to_datetime(mydf['Desk Date'], format='%D/%M/%Y'), np.nan)
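How can I fix the command above, or what is the best way to handle dates that cannot be transferred to Excel?
Thanks!
[Edit]: I tried using the to_excel method, but with no luck. This is the code I added at the end of the function: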
writer = pd.ExcelWriter('test7.xlsx', engine='xlsxwriter')
mydf.to_excel(writer, sheet_name='Sheet1')
writer.close()
It creates the file, but the file is empty. I get the following error:
Traceback (most recent call last):
  File "C:\Program Files (x86)\Python271\lib\site-packages\IPython\core\interactiveshell.py", line 3035, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-26-6c6f5ea1ff17>", line 1, in <module>
    Import.ImportFWD(test_path)
  File "C:\Users\jastrzem\Downloads\pyWFP\Import.py", line 44, in ImportFWD
    writer.close()
  File "C:\Program Files (x86)\Python271\lib\site-packages\pandas\io\excel.py", line 623, in close
    return self.save()
  File "C:\Program Files (x86)\Python271\lib\site-packages\pandas\io\excel.py", line 1298, in save
    return self.book.close()
  File "C:\Program Files (x86)\Python271\lib\site-packages\xlsxwriter\workbook.py", line 295, in close
    self._store_workbook()
  File "C:\Program Files (x86)\Python271\lib\site-packages\xlsxwriter\workbook.py", line 518, in _store_workbook
    xml_files = packager._create_package()
  File "C:\Program Files (x86)\Python271\lib\site-packages\xlsxwriter\packager.py", line 140, in _create_package
    self._write_shared_strings_file()
  File "C:\Program Files (x86)\Python271\lib\site-packages\xlsxwriter\packager.py", line 280, in _write_shared_strings_file
    sst._assemble_xml_file()
  File "C:\Program Files (x86)\Python271\lib\site-packages\xlsxwriter\sharedstrings.py", line 53, in _assemble_xml_file
    self._write_sst_strings()
  File "C:\Program Files (x86)\Python271\lib\site-packages\xlsxwriter\sharedstrings.py", line 83, in _write_sst_strings
    self._write_si(string)
  File "C:\Program Files (x86)\Python271\lib\site-packages\xlsxwriter\sharedstrings.py", line 110, in _write_si
    self._xml_si_element(string, attributes)
  File "C:\Program Files (x86)\Python271\lib\site-packages\xlsxwriter\xmlwriter.py", line 122, in _xml_si_element
    self.fh.write("""<si><t%s>%s</t></si>""" % (attr, string))
  File "C:\Program Files (x86)\Python271\lib\codecs.py", line 694, in write
    return self.writer.write(data)
  File "C:\Program Files (x86)\Python271\lib\codecs.py", line 357, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x94 in position 26: ordinal not in range(128)
This error is not caused by the old date, but by the fact that you are trying to dump an entire DataFrame into a single cell. Use the to_excel method instead.
Excel will not accept dates before 1900. My workaround was to replace the "old" dates with np.nan, since I know they are data errors anyway.
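# Convert the date strings to datetime
mydf['Desk Date'] = pd.to_datetime(mydf['Desk Date'])
# Keep only dates whose year is after 1900; replace the rest with NaN
dates_list = list(mydf['Desk Date'])
dates_list = [x if x.year > 1900 else np.nan for x in dates_list]
mydf['Desk Date'] = dates_list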
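The same workaround can probably be written without the intermediate list. Below is a minimal sketch using Series.where, which keeps the values that satisfy a condition and replaces the others with NaT. The sample data and the EXCEL_MIN_DATE name are made up for illustration, and the cutoff assumes the workbook uses Excel's default 1900 date system, which starts at 1900-01-01.

```python
import pandas as pd

# Hypothetical sample data standing in for the real mydf.
mydf = pd.DataFrame({"Desk Date": ["1899-01-31", "2015-03-02", "2014-12-15"]})

# errors="coerce" turns strings that cannot be parsed into NaT instead of raising.
mydf["Desk Date"] = pd.to_datetime(mydf["Desk Date"], errors="coerce")

# Assumed cutoff: first day of Excel's default 1900 date system.
EXCEL_MIN_DATE = pd.Timestamp("1900-01-01")

# Keep dates on or after the cutoff; everything else (including NaT) becomes NaT.
mydf["Desk Date"] = mydf["Desk Date"].where(mydf["Desk Date"] >= EXCEL_MIN_DATE)

print(mydf)
```

Unlike the list comprehension, this keeps the column as datetime64 throughout and also keeps dates that fall in the year 1900 itself. Whether those should count as valid depends on your data.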