Converting class "pandas.tslib.Timedelta" to string when exporting to Excel
Initial dataframe:

              arrivalTime
    0 2016-01-12 06:35:42
    2 2016-01-12 06:54:02
    3 2016-01-12 07:01:43
    4 2016-01-12 07:02:28
    5 2016-01-12 07:12:29
    6 2016-01-12 07:18:41
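I apply this function to the data:

    import pandas as pd

    def function(df):
        # Parse the timestamps, then take the difference between consecutive rows
        df['arrivalTime_cal'] = pd.to_datetime(df['arrivalTime'], format='%Y-%m-%d %H:%M:%S')
        # The first row has no predecessor, so fill it with a zero Timedelta
        df['diff_time'] = df['arrivalTime_cal'].diff().fillna(pd.Timedelta(0))
        del df['arrivalTime_cal']
        return df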
I get these results (displayed correctly in IPython):

      diff_time
    0  00:00:00
    1  00:04:37
    2  00:13:43
    3  00:07:41
    4  00:00:45
Exporting to Excel changes the format:

              arrivalTime    diff_time
    0 2016-01-12 06:35:42            0
    1 2016-01-12 06:40:19  0,003206019
    2 2016-01-12 06:54:02  0,009525463
    3 2016-01-12 07:01:43  0,005335648
    4 2016-01-12 07:02:28  0,000520833
How can I keep the string format in Excel?
Thanks in advance.
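For context, the numbers in the Excel column look like the same durations expressed as a fraction of a 24-hour day, which is how Excel stores times (the comma is a locale-specific decimal separator). The small check below uses only the standard library; the helper name `as_day_fraction` is made up for illustration, and the durations are the ones shown in the IPython output above.

```python
from datetime import timedelta

SECONDS_PER_DAY = 24 * 60 * 60

def as_day_fraction(delta):
    """Return a duration as a fraction of one day (hypothetical helper)."""
    return delta.total_seconds() / SECONDS_PER_DAY

# Durations taken from the diff_time column in the question
diffs = [
    timedelta(minutes=4, seconds=37),
    timedelta(minutes=13, seconds=43),
    timedelta(minutes=7, seconds=41),
    timedelta(seconds=45),
]

# Round to nine decimals, the precision shown in the Excel export
fractions = [round(as_day_fraction(d), 9) for d in diffs]
print(fractions)
```

The rounded values agree with the exported column (0,003206019 and so on), so no information is lost in the export. Only the display format differs.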
IIUC, you can convert the dtype to str and then split the str:

    In [53]: df['diff_time'].astype(str).str.split().str[-1].str.rsplit('.').str[0]
    Out[53]:
    index
    0    00:00:00
    2    00:18:20
    3    00:07:41
    4    00:00:45
    5    00:10:01
    6    00:06:12
    dtype: object
Breaking the above down into steps, first use astype to convert to str:

    In [54]: df['diff_time'].astype(str)
    Out[54]:
    index
    0    0 days 00:00:00.000000000
    2    0 days 00:18:20.000000000
    3    0 days 00:07:41.000000000
    4    0 days 00:00:45.000000000
    5    0 days 00:10:01.000000000
    6    0 days 00:06:12.000000000
    Name: diff_time, dtype: object
Now split (the default separator is whitespace) and take just the last split element, which is the time component:

    In [55]: df['diff_time'].astype(str).str.split().str[-1]
    Out[55]:
    index
    0    00:00:00.000000000
    2    00:18:20.000000000
    3    00:07:41.000000000
    4    00:00:45.000000000
    5    00:10:01.000000000
    6    00:06:12.000000000
    dtype: object
Now rsplit on '.' so that the time can be taken without the fractional seconds:

    In [56]: df['diff_time'].astype(str).str.split().str[-1].str.rsplit('.')
    Out[56]:
    index
    0    [00:00:00, 000000000]
    2    [00:18:20, 000000000]
    3    [00:07:41, 000000000]
    4    [00:00:45, 000000000]
    5    [00:10:01, 000000000]
    6    [00:06:12, 000000000]
    dtype: object
Taking the first element of each list leaves only the time, and you can see that the converted value is indeed a str:

    In [57]: df['diff_time'].astype(str).str.split().str[-1].str.rsplit('.').str[0][0]
    Out[57]: '00:00:00'
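The string-splitting approach keeps only the part after "N days", so a gap of a day or more would lose its day count. As an alternative (not part of the answer above), here is a minimal sketch that builds the HH:MM:SS text from the total number of seconds instead. The helper name `format_timedelta` is hypothetical, the sample timestamps are the ones from the Excel table in the question, and the sketch assumes non-negative differences.

```python
import pandas as pd

def format_timedelta(td):
    """Format a Timedelta as HH:MM:SS, folding whole days into the hours.

    Hypothetical helper; assumes td is non-negative.
    """
    total_seconds = int(td.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

# Sample data copied from the question's Excel export
df = pd.DataFrame({
    "arrivalTime": pd.to_datetime([
        "2016-01-12 06:35:42",
        "2016-01-12 06:40:19",
        "2016-01-12 06:54:02",
        "2016-01-12 07:01:43",
        "2016-01-12 07:02:28",
    ])
})

# Difference between consecutive rows; the first row gets a zero Timedelta
deltas = df["arrivalTime"].diff().fillna(pd.Timedelta(0))

# Store plain strings so a spreadsheet export has nothing to reinterpret
df["diff_time"] = deltas.map(format_timedelta)
print(df)

# df.to_excel("arrivals.xlsx")  # hypothetical file name; needs an Excel writer such as openpyxl
```

Because `diff_time` now holds ordinary Python strings, the column should arrive in the workbook as text. The trade-off is that Excel can no longer do arithmetic on it directly.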