Find all rows that match a combination of conditions

I'm looking for the best way to do this using Python / Excel / SQL / Google Sheets: I need to find all rows that match at least k values out of a list of n values.

For example, I have this table called Animals:

    | Name     | mammal | move | dive |
    +----------+--------+------+------+
    | Giraffe  | 1      | 1    | 0    |
    | Frog     | 0      | 1    | 1    |
    | Dolphin  | 1      | 1    | 1    |
    | Snail    | 0      | 1    | 0    |
    | Bacteria | 0      | 0    | 0    |

I want to write a function foo that behaves as follows:

foo(tuple of booleans, minimum number of matches)

    foo((1,1,1),3) -> Dolphin
    foo((1,1,1),2) -> Giraffe, Dolphin, Frog
    foo((1,1,1),1) -> Giraffe, Dolphin, Frog, Snail
    foo((1,1,0),2) -> Giraffe, Dolphin
    foo((0,1,1),2) -> Dolphin, Frog
    foo((0,1,1),1) -> Giraffe, Dolphin, Frog, Snail
    foo((1,1,1),0) -> Giraffe, Dolphin, Frog, Snail, Bacteria

What is your best idea?

Here is a pure Python 3 solution.

    data = [
        ('Giraffe', 1, 1, 0),
        ('Frog', 0, 1, 1),
        ('Dolphin', 1, 1, 1),
        ('Snail', 0, 1, 0),
        ('Bacteria', 0, 0, 0),
    ]

    probes = [
        ((1, 1, 1), 3),
        ((1, 1, 1), 2),
        ((1, 1, 1), 1),
        ((1, 1, 0), 2),
        ((0, 1, 1), 2),
        ((0, 1, 1), 1),
        ((1, 1, 1), 0),
    ]

    def foo(mask, minmatch):
        for name, *row in data:
            # Count the positions where both the mask and the row hold a 1
            if sum(u & v for u, v in zip(mask, row)) >= minmatch:
                yield name

    for mask, minmatch in probes:
        print(mask, minmatch, *foo(mask, minmatch))

Output

    (1, 1, 1) 3 Dolphin
    (1, 1, 1) 2 Giraffe Frog Dolphin
    (1, 1, 1) 1 Giraffe Frog Dolphin Snail
    (1, 1, 0) 2 Giraffe Dolphin
    (0, 1, 1) 2 Frog Dolphin
    (0, 1, 1) 1 Giraffe Frog Dolphin Snail
    (1, 1, 1) 0 Giraffe Frog Dolphin Snail Bacteria

Tested on Python 3.6.0. It uses some syntax that isn't available in older versions, but it is easy to adapt to the older syntax.
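For reference, the construct in the snippet above that Python 2 rejects is the starred target in `for name, *row in data`; calling `print` as a function with `*` unpacking also needs `from __future__ import print_function` on Python 2.6+. A minimal illustration, using a made-up one-row list in the same layout (the names `rows` and `flags` are only for this sketch):

```python
# Illustrative data only: one row in the same (name, flag, flag, flag) layout.
rows = [("Dolphin", 1, 1, 1)]

for name, *flags in rows:
    # Extended unpacking: the first item goes to `name`,
    # the remaining items are collected into the list `flags`.
    print(name, flags)

# Star-unpacking passes each list element to print() as a separate argument.
print(*["Giraffe", "Frog"])
```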

This variant runs on older versions of Python. Tested on Python 2.6.6.

    from __future__ import print_function

    data = [
        ('Giraffe', 1, 1, 0),
        ('Frog', 0, 1, 1),
        ('Dolphin', 1, 1, 1),
        ('Snail', 0, 1, 0),
        ('Bacteria', 0, 0, 0),
    ]

    probes = [
        ((1, 1, 1), 3),
        ((1, 1, 1), 2),
        ((1, 1, 1), 1),
        ((1, 1, 0), 2),
        ((0, 1, 1), 2),
        ((0, 1, 1), 1),
        ((1, 1, 1), 0),
    ]

    def foo(mask, minmatch):
        for row in data:
            # row[0] is the name, row[1:] holds the 0/1 flags
            if sum(u & v for u, v in zip(mask, row[1:])) >= minmatch:
                yield row[0]

    for mask, minmatch in probes:
        matches = list(foo(mask, minmatch))
        print(mask, minmatch, matches)

Output

    (1, 1, 1) 3 ['Dolphin']
    (1, 1, 1) 2 ['Giraffe', 'Frog', 'Dolphin']
    (1, 1, 1) 1 ['Giraffe', 'Frog', 'Dolphin', 'Snail']
    (1, 1, 0) 2 ['Giraffe', 'Dolphin']
    (0, 1, 1) 2 ['Frog', 'Dolphin']
    (0, 1, 1) 1 ['Giraffe', 'Frog', 'Dolphin', 'Snail']
    (1, 1, 1) 0 ['Giraffe', 'Frog', 'Dolphin', 'Snail', 'Bacteria']

If the table is a pandas DataFrame:

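    def foo(df, val, n_match):
        results = []
        for r in df.values:
            # r[0] is the name; r[1:] holds the 0/1 flags
            if sum(val & r[1:]) >= n_match:
                results.append(r[0])
        # Format the string before printing so this also works on Python 3
        print("foo(%s, %d) -> %s" % (val, n_match, ' '.join(results)))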
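The same idea can probably be written without the explicit row loop. The following is only a sketch, assuming the first column holds the names and every remaining column is a 0/1 flag; `foo_vectorized` and the inline `animals` frame are names made up for this illustration, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical inline copy of the Animals table from the question.
animals = pd.DataFrame({
    "Name": ["Giraffe", "Frog", "Dolphin", "Snail", "Bacteria"],
    "mammal": [1, 0, 1, 0, 0],
    "move": [1, 1, 1, 1, 0],
    "dive": [0, 1, 1, 0, 0],
})

def foo_vectorized(df, mask, min_matches):
    """Return names of rows whose flags overlap `mask` in at least `min_matches` positions."""
    # Assumption: column 0 is the name, all other columns are 0/1 integers.
    flags = df.iloc[:, 1:].to_numpy(dtype=int)
    # Bitwise AND against the mask, then count the surviving 1s per row.
    hits = (flags & np.asarray(mask, dtype=int)).sum(axis=1)
    return df.iloc[:, 0][hits >= min_matches].tolist()

print(foo_vectorized(animals, (1, 1, 0), 2))  # ['Giraffe', 'Dolphin']
```

Returning a list instead of printing keeps the function usable from other code; the caller can format the result as needed.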

I'll try this with Python and pandas.

Assuming the "Name" column is the pandas index:

    def foo(df, bool_index, minimum_matches):
        # positions of the columns where the mask holds a 1
        picked_column_index = [idx for (idx, i) in enumerate(bool_index) if i]
        # select those columns by location
        picked_df = df.iloc[:, picked_column_index]
        matched_row_bool = picked_df.sum(axis=1) >= minimum_matches
        return picked_df[matched_row_bool].index.tolist()

Here df is a pandas DataFrame read from the table (Animals):

    import pandas
    # index_col=0 makes the first column (Name) the index
    df = pandas.read_csv('animals_csv_file_path', index_col=0)

or

    df = pandas.read_excel('animals_xls_file_path', index_col=0)

It returns a list containing the matching names.
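To try this without a file on disk, here is a small self-contained sketch that loads the Animals table from an inline CSV string (the `CSV_TEXT` constant is an assumption made for this example) with "Name" as the index, then calls the same function as above:

```python
import io
import pandas as pd

# Hypothetical inline CSV standing in for the real file.
CSV_TEXT = """Name,mammal,move,dive
Giraffe,1,1,0
Frog,0,1,1
Dolphin,1,1,1
Snail,0,1,0
Bacteria,0,0,0
"""

# index_col="Name" turns the Name column into the DataFrame index.
df = pd.read_csv(io.StringIO(CSV_TEXT), index_col="Name")

def foo(df, bool_index, minimum_matches):
    # Keep only the columns where the mask holds a 1.
    picked = [idx for idx, flag in enumerate(bool_index) if flag]
    picked_df = df.iloc[:, picked]
    # A row matches when enough of the picked columns are 1.
    matched = picked_df.sum(axis=1) >= minimum_matches
    return picked_df[matched].index.tolist()

print(foo(df, (1, 1, 1), 3))  # ['Dolphin']
print(foo(df, (0, 1, 1), 2))  # ['Frog', 'Dolphin']
```

The names come back in table order, which matches the output of the pure Python versions.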
