pandas vlookup of two columns and finding values
I have a dataframe like this:
+-------+--------+
| A | B |
+-------+--------+
| David | Frank |
| Tim | David |
| Joe | Sam |
| Frank | Bob |
| Cathy | Tarun |
| | Rachel |
| | Tim |
+-------+--------+
Now I want to look up each column in the other one and flag the values that are missing:
+-------+--------+-------------------+-------------------+
| A | B | C | D |
+-------+--------+-------------------+-------------------+
| David | Frank | Available on both | Available on both |
| Tim | David | Available on both | Available on both |
| Joe | Sam | in A not in B | in B not in A |
| Frank | Bob | Available on both | in B not in A |
| Cathy | Tarun | in A not in B | in B not in A |
| | Rachel | | in B not in A |
| | Tim | | Available on both |
+-------+--------+-------------------+-------------------+
You can use numpy.select with conditions created by isin to test membership, and notnull to filter out missing values:
print (df)
A B
0 David Frank
1 Tim David
2 Joe Sam
3 Frank Bob
4 Cathy Tarun
5 NaN Rachel
6 NaN Tim
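import numpy as np

# First condition: the value also appears in the other column.
# Second condition: the value is present but not found there; missing values fall through to default.
df['C'] = np.select([df.A.isin(df.B), df.A.notnull()],
                    ['Available on both', 'in A not in B'], default=None)
df['D'] = np.select([df.B.isin(df.A), df.B.notnull()],
                    ['Available on both', 'in B not in A'], default=None)
print(df)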
A B C D
0 David Frank Available on both Available on both
1 Tim David Available on both Available on both
2 Joe Sam in A not in B in B not in A
3 Frank Bob Available on both in B not in A
4 Cathy Tarun in A not in B in B not in A
5 NaN Rachel None in B not in A
6 NaN Tim None Available on both
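For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The helper name `lookup_status` and its parameters are hypothetical, made up for illustration; only `numpy.select`, `Series.isin` and `Series.notnull` come from the answer above. One small addition, labeled as an assumption: the first condition also requires `notnull`, because `isin` would report a missing value as a match if the other column happened to contain a missing value too.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question (missing names become NaN).
df = pd.DataFrame({
    "A": ["David", "Tim", "Joe", "Frank", "Cathy", np.nan, np.nan],
    "B": ["Frank", "David", "Sam", "Bob", "Tarun", "Rachel", "Tim"],
})


def lookup_status(col, other, only_here_label):
    """Hypothetical helper: label each value of `col` by whether it occurs in `other`."""
    present = col.notnull()
    conditions = [
        present & col.isin(other),  # value exists and is found in the other column
        present,                    # value exists but was not found there
    ]
    choices = ["Available on both", only_here_label]
    # np.select takes the first matching condition per row;
    # rows matching none (missing values) get the default.
    return np.select(conditions, choices, default=None)


df["C"] = lookup_status(df["A"], df["B"], "in A not in B")
df["D"] = lookup_status(df["B"], df["A"], "in B not in A")
print(df)
```

The order of the conditions matters: `np.select` uses the first one that is true for each row, so the membership test has to come before the plain `notnull` check.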