Python – Incorrect datetimes when reading dates and times from an Excel file

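I have an Excel file with 3 columns that hold datetime, date, or time fields. I am reading it with the xlrd package and I get what I think are times in milliseconds; when I convert them back to datetime, I get wrong results.

I tried converting the file to CSV. That did not help either: I get odd datetime formats that I cannot make sense of.

This is what I tried with xlrd. I would prefer to use files with the .xlsx extension as input; otherwise I have to convert the Excel file to .csv every time I get a new input file.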

    from xlrd import open_workbook
    import os,pickle,datetime

    def main(path, filename, absolute_path_organisation_structure):
        absolute_filepath = os.path.join(path,filename)
        wb = open_workbook(absolute_filepath)
        for sheet in wb.sheets():
            number_of_rows = sheet.nrows
            number_of_columns = sheet.ncols
            for row_index in xrange(1, sheet.nrows):
                row=[]
                for col_index in xrange(4,7): #4th and 6th columns are date fields
                    row.append(sheet.cell(row_index, col_index).value)
                print(row) #Relevant list formed with 4th, 5th and 6th columns
                print(datetime.datetime.fromtimestamp(float(row[0])).strftime('%Y-%m-%d %H:%M:%S'))

    path = "C:\\Users\\***************\\NEW DATA"
    MISfile = "P2P_2015 - Copy.xlsx"
    absolute_path_organisation_structure = "C:\\Users\\******************NEW DATA\\organisation.csv"
    main(path, MISfile, absolute_path_organisation_structure)

Result:

    [42011.46789351852, u'Registered', 42009.0]
    1970-01-01 17:10:11
    [42011.46789351852, u'Sent for CTG1 approval', 42010.0]
    1970-01-01 17:10:11
    [42011.46789351852, u'Sent back', 42010.0]
    1970-01-01 17:10:11
    [42011.46789351852, u'Registered', 42011.0]
    1970-01-01 17:10:11
    [42011.46789351852, u'Sent for CTG1 approval', 42011.0]
    1970-01-01 17:10:11
    [42011.46789351852, u'Sent for CTG2 approval', 42012.0]
    1970-01-01 17:10:11
    [42011.46789351852, u'CTG2 Approved', 42012.0]
    1970-01-01 17:10:11
    [42011.46789351852, u'Sent back', 42013.0]
    1970-01-01 17:10:11
    [42170.61667824074, u'Registered', 42144.0]
    1970-01-01 17:12:50
    [42170.61667824074, u'Registered', 42144.0]
    1970-01-01 17:12:50
    [42170.61667824074, u'Sent back', 42165.0]
    1970-01-01 17:12:50
    [42170.61667824074, u'Sent back', 42165.0]
    1970-01-01 17:12:50
    [42170.61667824074, u'Registered', 42170.0]
    1970-01-01 17:12:50
    [42170.61667824074, u'Registered', 42170.0]
    1970-01-01 17:12:50

Actual input file (copied from Excel):

    1/7/2015 11:13    Registered               1/5/2015 0:00
    1/7/2015 11:13    Sent for CTG1 approval   1/6/2015 0:00
    1/7/2015 11:13    Sent back                1/6/2015 0:00
    1/7/2015 11:13    Registered               1/7/2015 0:00
    1/7/2015 11:13    Sent for CTG1 approval   1/7/2015 0:00
    1/7/2015 11:13    Sent for CTG2 approval   1/8/2015 0:00
    1/7/2015 11:13    CTG2 Approved            1/8/2015 0:00
    1/7/2015 11:13    Sent back                1/9/2015 0:00
    6/15/2015 14:48   Registered               5/20/2015 0:00
    6/15/2015 14:48   Registered               5/20/2015 0:00
    6/15/2015 14:48   Sent back                6/10/2015 0:00
    6/15/2015 14:48   Sent back                6/10/2015 0:00
    6/15/2015 14:48   Registered               6/15/2015 0:00
    6/15/2015 14:48   Registered               6/15/2015 0:00

Why am I unable to read the dates correctly? Why aren't they simply returned as strings so that I could convert them easily?

xldate_as_tuple(xldate, datemode)

Convert an Excel number (presumed to represent a date, a datetime or a time) into a tuple suitable for feeding to datetime or mx.DateTime constructors.

Source: http://www.lexicon.net/sjmachin/xlrd.html#xlrd.xldate_as_tuplefunction

Usage example: How to use "xlrd.xldate_as_tuple()"

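The problem is that an Excel datetime value is not a UNIX timestamp, so interpreting it as one gives the wrong result. Excel stores a datetime as a number of days (with the time of day as the fractional part), whereas a UNIX timestamp counts seconds since 1970-01-01. The warning sign to look for is that the resulting values all fall close to the UNIX epoch (1970-01-01).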
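As a minimal illustration of that diagnosis, the sketch below takes the first serial value from the question's output and reads it both ways using only the standard library. It assumes the workbook uses Excel's 1900 date system, whose day zero corresponds to 1899-12-30; the variable names are made up for this example.

```python
from datetime import datetime, timedelta

# First value printed in the question's output.
serial = 42011.46789351852

# Misreading: treat the number as seconds since the UNIX epoch.
# About 42,000 seconds is under half a day, so the result stays on 1970-01-01.
as_unix = datetime(1970, 1, 1) + timedelta(seconds=serial)

# Intended reading (assuming the 1900 date system): days since 1899-12-30,
# with the fractional part giving the time of day.
as_excel = datetime(1899, 12, 30) + timedelta(days=serial)

print(as_unix.strftime('%Y-%m-%d %H:%M'))   # a time on 1970-01-01
print(as_excel.strftime('%Y-%m-%d %H:%M'))  # 2015-01-07 11:13, matching the sheet
```

The second reading lines up with the "1/7/2015 11:13" cell shown in the input file, which is a quick way to confirm which interpretation a given column needs.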

You can convert an Excel datetime to a UNIX timestamp using the method described in this answer.

Windows / Mac Excel 2011

 Unix Timestamp = (Excel Timestamp - 25569) * 86400 

Mac Excel 2007

 Unix Timestamp = (Excel Timestamp - 24107) * 86400 

If you apply this conversion, you should get the correct output:

    timestamp = (float(row[0]) - 25569) * 86400
    # utcfromtimestamp avoids the local-timezone shift that fromtimestamp would add
    datetime.datetime.utcfromtimestamp(timestamp).strftime('%Y-%m-%d %H:%M:%S')
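The same conversion can also be written without going through a UNIX timestamp at all, which sidesteps any local-timezone adjustment. The following is only a sketch: `excel_serial_to_datetime` and the two epoch constants are hypothetical names introduced here, not part of xlrd or the standard library. The 1900-system branch is only meant for serials that fall on or after 1900-03-01, because Excel counts a nonexistent 1900-02-29.

```python
from datetime import datetime, timedelta

# Day-zero for each Excel date system (hypothetical constant names).
EXCEL_EPOCH_1900 = datetime(1899, 12, 30)  # Windows default; offset 25569 to 1970
EXCEL_EPOCH_1904 = datetime(1904, 1, 1)    # older Mac workbooks; offset 24107 to 1970


def excel_serial_to_datetime(serial, date1904=False):
    """Convert an Excel serial day number to a naive datetime.

    The integer part is a day count from the workbook's epoch and the
    fractional part is the time of day.
    """
    epoch = EXCEL_EPOCH_1904 if date1904 else EXCEL_EPOCH_1900
    return epoch + timedelta(days=float(serial))


# Values taken from the question's printed rows.
print(excel_serial_to_datetime(42009.0))            # date-only cell
print(excel_serial_to_datetime(42011.46789351852))  # date and time cell
```

The two epoch constants are consistent with the offsets quoted above: 1970-01-01 is 25569 days after 1899-12-30 and 24107 days after 1904-01-01.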

If the Excel file you want to read is a single sheet, you can simply use pandas.read_excel directly and then convert the dates with pandas.to_datetime:

    from __future__ import absolute_import, division, print_function
    import os
    import pandas as pd

    def main(path, filename, absolute_path_organisation_structure):
        absolute_filepath = os.path.join(path,filename)
        # Relevant list formed with 4th, 5th and 6th columns
        # (parse_cols was renamed to usecols in later pandas releases)
        df = pd.read_excel(absolute_filepath, header=None, parse_cols=[4,5,6])
        # Transform column 0 and 2 to datetime
        df[0] = pd.to_datetime(df[0])
        df[2] = pd.to_datetime(df[2])
        print(df)

    path = "C:\\Users\\***************\\NEW DATA"
    MISfile = "P2P_2015 - Copy.xlsx"
    main(path, MISfile,None)