How to read merged Excel cells with NaN into a Pandas DataFrame

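I would like to read an Excel sheet into a Pandas DataFrame. However, the sheet contains merged cells as well as null rows (fully or partially filled with NaN), as shown below. To clarify, John H. placed a single order covering all the albums from "The Bodyguard" to "Red Pill Blues".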

Excel sheet capture

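When I read this Excel sheet into a Pandas DataFrame, the data does not transfer correctly: Pandas treats a merged cell as a single cell, so only the first row of the merged range receives the value. The DataFrame looks like the following (note: the values in parentheses are the values I would like to have there):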

DataFrame capture

Please note that the last row does not contain merged cells; it only has a value in the Artist column.


EDIT: I did try the following to forward-fill the NaN values (from "Pandas: Reading Excel with merged cells"):

 df.index = pd.Series(df.index).fillna(method='ffill') 

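However, the NaN values remain. What strategy or method could I use to populate the DataFrame correctly? Is there a Pandas method that unmerges the cells and duplicates the corresponding contents?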

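The link you referenced only needed to forward-fill the index column. In your case the NaN values are in the data columns, so you need to fill all of them. Simply forward-fill the entire DataFrame: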

    import pandas as pd

    df = pd.read_excel("Input.xlsx")
    print(df)

    #    Order_ID  Customer_name            Album_Name           Artist  Quantity
    # 0       NaN            NaN            RadioShake              NaN       NaN
    # 1       1.0        John H.         The Bodyguard  Whitney Houston       2.0
    # 2       NaN            NaN              Lemonade          Beyonce       1.0
    # 3       NaN            NaN  The Thrill Of It All        Sam Smith       2.0
    # 4       NaN            NaN              Thriller  Michael Jackson      11.0
    # 5       NaN            NaN                Divide       Ed Sheeran       4.0
    # 6       NaN            NaN            Reputation     Taylor Swift       3.0
    # 7       NaN            NaN        Red Pill Blues         Maroon 5       5.0

    # df.ffill() is equivalent to the older df.fillna(method='ffill')
    df = df.ffill()
    print(df)

    #    Order_ID  Customer_name            Album_Name           Artist  Quantity
    # 0       NaN            NaN            RadioShake              NaN       NaN
    # 1       1.0        John H.         The Bodyguard  Whitney Houston       2.0
    # 2       1.0        John H.              Lemonade          Beyonce       1.0
    # 3       1.0        John H.  The Thrill Of It All        Sam Smith       2.0
    # 4       1.0        John H.              Thriller  Michael Jackson      11.0
    # 5       1.0        John H.                Divide       Ed Sheeran       4.0
    # 6       1.0        John H.            Reputation     Taylor Swift       3.0
    # 7       1.0        John H.        Red Pill Blues         Maroon 5       5.0
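Forward-filling the whole DataFrame also fills any genuinely missing value in the other columns. If that is a concern, one option is to forward-fill only the columns that came from merged cells. The following is a minimal sketch, assuming the merged columns are `Order_ID` and `Customer_name` (as in the output above) and using a small in-memory frame in place of `Input.xlsx`; the missing `Quantity` in the last row is a hypothetical gap added for illustration.

```python
import numpy as np
import pandas as pd

# Stand-in for pd.read_excel("Input.xlsx"); a shortened copy of the data above.
df = pd.DataFrame({
    "Order_ID": [np.nan, 1.0, np.nan, np.nan],
    "Customer_name": [np.nan, "John H.", np.nan, np.nan],
    "Album_Name": ["RadioShake", "The Bodyguard", "Lemonade", "Thriller"],
    # The last NaN is a hypothetical, genuinely missing value (not a merged cell).
    "Quantity": [np.nan, 2.0, 1.0, np.nan],
})

# Assumption: only these columns were merged in the Excel sheet.
merged_cols = ["Order_ID", "Customer_name"]

# Forward-fill just those columns; every other column keeps its NaN values.
df[merged_cols] = df[merged_cols].ffill()
print(df)
```

The first row stays empty in both columns because there is no earlier value to carry forward.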

Using conditionals:

    import pandas as pd

    df_excel = pd.ExcelFile('Sales.xlsx')
    df = df_excel.parse('Info')

    for col in list(df):  # every column
        pprow = 0  # index label of the row before the previous one
        prow = 1   # index label of the previous row
        for row in df[1:].iterrows():  # every row except the first; row[0] is the index label
            if pd.isnull(df.loc[prow, 'Album Name']):
                # The whole previous row is empty: skip the fill but still advance below.
                pass
            elif pd.isnull(df.loc[prow, col]) and pd.isnull(df.loc[row[0], col]):
                # This cell and the one below it are both empty: copy the value from above.
                df.loc[prow, col] = df.loc[pprow, col]
            pprow = prow
            prow = row[0]

Output (I used different names):

        Order_ID  Customer_name    Album Name
    0        NaN            NaN         Radio
    1        1.0           John             a
    2        1.0           John             b
    3        1.0           John             c
    4        1.0           John             d
    5        1.0           John             e
    6        1.0           John             f
    7        NaN            NaN            GE
    8        2.0          Harry   We are Born
    9        3.0          Lizzy       Relapse
    10       4.0            Abe         Smoke
    11       4.0            Abe       Tell me
    12       NaN            NaN           NaN
    13       NaN            NaN      Best Buy
    14       5.0         Kristy      The wall
    15       6.0          Sammy  Kind of blue
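As far as I know, Pandas has no built-in way to unmerge cells, since the merge information is not kept in the DataFrame. One alternative is to read the merged ranges with openpyxl, copy each range's top-left value into every cell of the range, and only then build the DataFrame. The following is a rough sketch under stated assumptions: `fill_merged_cells` is a hypothetical helper name, and the workbook is built in memory to stand in for the real file (for a real file you would presumably start from `openpyxl.load_workbook(...)`).

```python
import pandas as pd
from openpyxl import Workbook


def fill_merged_cells(ws):
    """Unmerge every merged range and copy its top-left value into each cell."""
    # Copy the collection first, because unmerging modifies it.
    for rng in list(ws.merged_cells.ranges):
        value = ws.cell(row=rng.min_row, column=rng.min_col).value
        ws.unmerge_cells(rng.coord)
        for r in range(rng.min_row, rng.max_row + 1):
            for c in range(rng.min_col, rng.max_col + 1):
                ws.cell(row=r, column=c, value=value)


# Hypothetical in-memory sheet standing in for the real Excel file.
wb = Workbook()
ws = wb.active
for row in [
    ["Order_ID", "Customer_name", "Album_Name"],
    [None, None, "RadioShake"],
    [1, "John H.", "The Bodyguard"],
    [None, None, "Lemonade"],
    [None, None, "Thriller"],
    [None, None, "GE"],  # not merged: should stay empty
]:
    ws.append(row)
ws.merge_cells("A3:A5")  # Order_ID merged over the three album rows
ws.merge_cells("B3:B5")  # Customer_name merged over the same rows

fill_merged_cells(ws)

# The first sheet row holds the column names.
rows = list(ws.values)
df = pd.DataFrame(rows[1:], columns=rows[0])
print(df)
```

Because only cells that were actually merged get filled, rows such as "RadioShake" and "GE" keep their empty cells, which a plain forward fill cannot distinguish from merged ones.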