pandas: filter multiple columns with a single criterion

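I have an Excel sheet with over a hundred columns. I need to filter five of them to see which rows have "No" in one of those cells. Is there a way to filter multiple columns with a single search criterion, like this:

    no_invoice_filter = df[(df['M1: PL - INVOICED']) & (df['M2: EX - INVOICED']) & (df['M3: TEST DEP - INVOICED']) == 'No']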

as opposed to writing out a separate comparison with "No" for each column?

The code above fails with this error:

    TypeError: unsupported operand type(s) for &: 'str' and 'bool'
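A likely reason for the error: in Python, `==` binds more tightly than `&`, so only the last column is compared with 'No' and `&` is then applied to the raw string columns. For comparison, here is a minimal sketch of the long-hand form the question wants to avoid. The sample values are made up for illustration, and `|` is used on the assumption that a row should match when any one of the columns is 'No'.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'M1: PL - INVOICED': ['Yes', 'Yes', 'No'],
    'M2: EX - INVOICED': ['Yes', 'No', 'Yes'],
    'M3: TEST DEP - INVOICED': ['Yes', 'Yes', 'Yes'],
})

# Each column gets its own parenthesised comparison; the resulting
# boolean Series are combined element-wise with | (OR).
long_hand = df[(df['M1: PL - INVOICED'] == 'No')
               | (df['M2: EX - INVOICED'] == 'No')
               | (df['M3: TEST DEP - INVOICED'] == 'No')]

print(long_hand.index.tolist())  # [1, 2]
```

This works, but it grows by one clause per column, which is what motivates the question.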

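You can do:

    df[(df[['M1: PL - INVOICED','M2: EX - INVOICED','M3: TEST DEP - INVOICED']] == 'No')]

So you basically pass a list of the columns of interest and compare just those columns against your scalar value. If you are after "No" appearing anywhere in those columns, then use any(axis=1).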

    In [115]: df = pd.DataFrame({'a':'no', 'b':'yes', 'c':['yes','no','yes','no','no']})
              df
    Out[115]:
        a    b    c
    0  no  yes  yes
    1  no  yes   no
    2  no  yes  yes
    3  no  yes   no
    4  no  yes   no

Using any(axis=1) then returns all rows where "no" appears in any of the columns of interest:

    In [133]: df[(df[['a','c']] == 'no').any(axis=1)]
    Out[133]:
        a    b    c
    0  no  yes  yes
    1  no  yes   no
    2  no  yes  yes
    3  no  yes   no
    4  no  yes   no
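Because column a is 'no' in every row, the any(axis=1) filter above keeps all five rows. As a small sketch on the same data, comparing `any` with `all` makes the difference visible (the variable names `mask`, `any_rows` and `all_rows` are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'a': 'no', 'b': 'yes', 'c': ['yes', 'no', 'yes', 'no', 'no']})

# Boolean DataFrame with one column per compared column.
mask = df[['a', 'c']] == 'no'

any_rows = df[mask.any(axis=1)]  # 'no' in at least one of a, c
all_rows = df[mask.all(axis=1)]  # 'no' in both a and c

print(any_rows.index.tolist())  # [0, 1, 2, 3, 4]
print(all_rows.index.tolist())  # [1, 3, 4]
```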

You can also index with the mask directly and then use dropna to drop the rows that are NaN in a specific column:

    In [132]: df[df[['a','c']] == 'no'].dropna(subset=['c'])
    Out[132]:
        a    b   c
    1  no  NaN  no
    3  no  NaN  no
    4  no  NaN  no
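Note that column b comes back as NaN in that output: indexing a DataFrame with a boolean DataFrame blanks every cell the mask does not mark as True, including columns the mask does not cover. A minimal sketch of this behaviour on the same data, next to a boolean Series filter that keeps the original values (the names `masked`, `kept` and `rows` are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'a': 'no', 'b': 'yes', 'c': ['yes', 'no', 'yes', 'no', 'no']})
mask = df[['a', 'c']] == 'no'

# Same shape as df; cells not marked True by the mask become NaN.
masked = df[mask]
kept = masked.dropna(subset=['c'])

print(kept.index.tolist())     # [1, 3, 4]
print(kept['b'].isna().all())  # True

# Filtering rows with a boolean Series leaves the other columns intact.
rows = df[df['c'] == 'no']
print(rows['b'].tolist())      # ['yes', 'yes', 'yes']
```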

You need any on the subset of columns to check for at least one "No" per row:

    df[(df[['M1: PL - INVOICED','M2: EX - INVOICED','M3: TEST DEP - INVOICED']] == 'No').any(axis=1)]

Sample:

    import pandas as pd

    df = pd.DataFrame({'M1: PL - INVOICED': ['a', 'Yes', 'No'],
                       'M2: EX - INVOICED': ['Yes', 'No', 'b'],
                       'M3: TEST DEP - INVOICED': ['s', 'a', 'No']})
    print(df)
      M1: PL - INVOICED M2: EX - INVOICED M3: TEST DEP - INVOICED
    0                 a               Yes                       s
    1               Yes                No                       a
    2                No                 b                      No

    # columns of interest
    cols = ['M1: PL - INVOICED', 'M2: EX - INVOICED', 'M3: TEST DEP - INVOICED']

    # element-wise comparison gives a boolean DataFrame
    print(df[cols] == 'No')
       M1: PL - INVOICED  M2: EX - INVOICED  M3: TEST DEP - INVOICED
    0              False              False                    False
    1              False               True                    False
    2               True              False                     True

    # True for rows with at least one 'No'
    print((df[cols] == 'No').any(axis=1))
    0    False
    1     True
    2     True
    dtype: bool

    # use the boolean Series to select rows
    print(df[(df[cols] == 'No').any(axis=1)])
      M1: PL - INVOICED M2: EX - INVOICED M3: TEST DEP - INVOICED
    1               Yes                No                       a
    2                No                 b                      No
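With more than a hundred columns, it may be convenient to wrap the pattern in a small function and build the column list programmatically. The following is only a sketch: `rows_with_value` is a hypothetical helper, the NOTES column is invented for illustration, and selecting columns by the '- INVOICED' suffix assumes the columns of interest share that naming.

```python
import pandas as pd


def rows_with_value(df, columns, value):
    """Return the rows of df where any of the given columns equals value.

    Hypothetical helper, not part of pandas.
    """
    return df[df[columns].eq(value).any(axis=1)]


df = pd.DataFrame({'M1: PL - INVOICED': ['a', 'Yes', 'No'],
                   'M2: EX - INVOICED': ['Yes', 'No', 'b'],
                   'M3: TEST DEP - INVOICED': ['s', 'a', 'No'],
                   'NOTES': ['No', 'x', 'y']})  # invented extra column

# Assumption: the columns of interest all end with '- INVOICED'.
invoiced_cols = [c for c in df.columns if c.endswith('- INVOICED')]

no_invoice = rows_with_value(df, invoiced_cols, 'No')
print(no_invoice.index.tolist())  # [1, 2]
```

Row 0 has 'No' only in NOTES, which is not among the compared columns, so it is not selected.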