Reading an Excel file with Python and pandas when it has multiple index columns

I'm new to Python, so please forgive the basic question. My .xlsx file looks like this:

    Unnamed: 1    A     Unnamed: 2    B
    2015-01-01    10    2015-01-01    10
    2015-01-02    20    2015-01-01    20
    2015-01-03    30    NaT           NaN

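When I read it in Python with pandas.read_excel(…), pandas automatically uses the first column as the time index.

Is there a one-liner that tells pandas that each date column is the time index of the series next to it?

The desired output would look like this: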

    date          A     B
    2015-01-01    10    10
    2015-01-02    20    20
    2015-01-03    30    NaN

To parse blocks of adjacent columns and align each block on its own datetime index, you can do the following.

Starting with a df like this:

    Int64Index: 3 entries, 0 to 2
    Data columns (total 4 columns):
    Unnamed: 0    3 non-null datetime64[ns]
    A             3 non-null int64
    Unnamed: 1    2 non-null datetime64[ns]
    B             2 non-null float64
    dtypes: datetime64[ns](2), float64(1), int64(1)

You can iterate over the columns in chunks of two and merge the chunks on their index, like this:

    def chunks(l, n):
        """Yield successive n-sized chunks from l."""
        for i in range(0, len(l), n):
            yield l[i:i + n]

    # Start from the first (date, value) pair, indexed by its date column
    merged = df.loc[:, list(df)[:2]].set_index(list(df)[0])

    # Outer-merge every remaining (date, value) pair on the datetime index
    for cols in chunks(list(df)[2:], 2):
        merged = merged.merge(df.loc[:, cols].set_index(cols[0]).dropna(),
                              left_index=True, right_index=True, how='outer')

to get:

                  A     B
    2015-01-01    10    10
    2015-01-01    10    20
    2015-01-02    20    NaN
    2015-01-03    30    NaN
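For anyone who wants to try this without the spreadsheet, here is a minimal end-to-end sketch. It assumes a df built by hand from the values in the question (so the repeated 2015-01-01 in the second date column is kept), and the index name "date" is only added to match the desired output.

```python
import pandas as pd

# Hand-built stand-in for the frame read from the .xlsx file
# (values copied from the question; this is an assumption, not real file I/O).
df = pd.DataFrame({
    "Unnamed: 0": pd.to_datetime(["2015-01-01", "2015-01-02", "2015-01-03"]),
    "A": [10, 20, 30],
    "Unnamed: 1": pd.to_datetime(["2015-01-01", "2015-01-01", None]),
    "B": [10.0, 20.0, None],
})


def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]


cols = list(df)

# First (date, value) pair becomes the starting frame.
merged = df.loc[:, cols[:2]].set_index(cols[0])

# Each further pair is indexed by its own date column, stripped of
# empty rows, and outer-merged on the datetime index.
for pair in chunks(cols[2:], 2):
    block = df.loc[:, pair].set_index(pair[0]).dropna()
    merged = merged.merge(block, left_index=True, right_index=True, how="outer")

merged.index.name = "date"
print(merged)
```

Because 2015-01-01 appears twice in the second block, the outer merge produces two rows for that date, which is why the result has four rows rather than three.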

Unfortunately pd.concat does not work here, because it cannot handle duplicate index entries. Otherwise you could use a list comprehension with pd.concat:

    pd.concat([df.loc[:, cols].set_index(cols[0]) for cols in chunks(list(df), 2)], axis=1)
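If the dates within each block are unique, the pd.concat route should work. The sketch below assumes a hypothetical variant of the sample data where the second date column holds 2015-01-01 and 2015-01-02 (as the desired output suggests), and adds a dropna() so the NaT row does not end up in the index.

```python
import pandas as pd

# Hypothetical variant of the sample data: no repeated dates inside a block.
df = pd.DataFrame({
    "Unnamed: 0": pd.to_datetime(["2015-01-01", "2015-01-02", "2015-01-03"]),
    "A": [10, 20, 30],
    "Unnamed: 1": pd.to_datetime(["2015-01-01", "2015-01-02", None]),
    "B": [10.0, 20.0, None],
})


def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]


# One frame per (date, value) pair, indexed by its own date column.
blocks = [
    df.loc[:, cols].set_index(cols[0]).dropna()
    for cols in chunks(list(df), 2)
]

# Align all blocks on the union of their dates.
result = pd.concat(blocks, axis=1).sort_index()
result.index.name = "date"
print(result)
```

With duplicated dates, as in the original sample, the merge loop is the safer choice.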

I import the data with xlrd and then use pandas to display it:

    import xlrd
    import pandas as pd

    # xls_name is the path to the workbook (xlrd 2.0+ reads only .xls files)
    workbook = xlrd.open_workbook(xls_name, encoding_override="cp1252")
    worksheet = workbook.sheet_by_index(0)

    # The first row holds the column names
    first_row = []
    for col in range(worksheet.ncols):
        first_row.append(worksheet.cell_value(0, col))

    # One dict per data row, keyed by column name
    # (data starts at row index 10 here; adjust to your sheet)
    data = []
    for row in range(10, worksheet.nrows):
        elm = {}
        for col in range(worksheet.ncols):
            elm[first_row[col]] = worksheet.cell_value(row, col)
        data.append(elm)

    # Three separate lists (a chained "a = b = c = []" would share one list)
    first_column, second_column, third_column = [], [], []
    for elm in data:
        first_column.append(elm[first_row[0]])
        second_column.append(elm[first_row[1]])
        third_column.append(elm[first_row[2]])

    dict1 = {}
    dict1[first_row[0]] = first_column
    dict1[first_row[1]] = second_column
    dict1[first_row[2]] = third_column

    # Column names must match the dict keys, so reuse the header row
    res = pd.DataFrame(dict1, columns=first_row[:3])
    print(res)
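As a side note, the per-column lists are probably not needed: pandas can build a frame straight from a list of row dicts. A minimal sketch, with made-up rows and header names standing in for the `data` and `first_row` that the xlrd loop would produce:

```python
import pandas as pd

# Hypothetical header names and rows, standing in for what the xlrd loop builds.
first_row = ["date", "A", "B"]
data = [
    {"date": "2015-01-01", "A": 10, "B": 10},
    {"date": "2015-01-02", "A": 20, "B": 20},
    {"date": "2015-01-03", "A": 30, "B": None},
]

# Each dict becomes one row; `columns` fixes the column order.
res = pd.DataFrame(data, columns=first_row)
print(res)
```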