How do I create a table to match based on different column values? If statements?

Question

I have a dataset, and I want to see whether there is a way to match rows based on column values.

  col-A    col-B      
  Apple    squash     
  Apple    lettuce    
  Banana   Carrot     
  Banana   Carrot 
  Banana   Carrot
  dragon   turnip 
  melon    potato
  melon    potato
  pear     potato

Match

If col A matches another col A but col B does not match
If col B matches another col B but col A does not match

  col-A    col-B
  Apple    squash
  Apple    lettuce
  melon    potato
  melon    potato
  pear     potato

Edit: fixed a typo

Edit 2: fixed a second typo

Answer 1

IIUC, you need to compute two masks that flag the rows whose group contains more than one distinct value in the other column:

m1 = df.groupby('col-B')['col-A'].transform('nunique').gt(1)
m2 = df.groupby('col-A')['col-B'].transform('nunique').gt(1)

out = df[m1|m2]

Output:

   col-A    col-B
0  Apple   squash
1  Apple  lettuce
6  melon   potato
7  melon   potato
8   pear   potato

You can also get the unique/exclusive pairs:

df[~(m1|m2)]

    col-A    col-B
2  Banana   Carrot
3  Banana   Carrot
4  Banana   Carrot
5  dragon   turnip
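For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the data exactly as posted in the question. The names `matched` and `exclusive` are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question as currently posted
df = pd.DataFrame({
    "col-A": ["Apple", "Apple", "Banana", "Banana", "Banana",
              "dragon", "melon", "melon", "pear"],
    "col-B": ["squash", "lettuce", "Carrot", "Carrot", "Carrot",
              "turnip", "potato", "potato", "potato"],
})

# True where the row's col-B value occurs with more than one distinct col-A value
m1 = df.groupby("col-B")["col-A"].transform("nunique").gt(1)
# True where the row's col-A value occurs with more than one distinct col-B value
m2 = df.groupby("col-A")["col-B"].transform("nunique").gt(1)

matched = df[m1 | m2]       # rows that match on one column but differ on the other
exclusive = df[~(m1 | m2)]  # rows whose (col-A, col-B) pairing is exclusive

print(matched)
print(exclusive)
```

Because `transform` returns a Series aligned to the original index, the two masks can be combined directly with `|` and used for boolean indexing without any merge step.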

Answer 2

So, if I understand correctly, you want to select every row such that grouping by colA (resp. colB) and then by colB (resp. colA) yields more than one group.

I would suggest:

grA = df2.groupby("colA").filter(lambda x : x.groupby("colB").ngroups > 1)
grB = df2.groupby("colB").filter(lambda x : x.groupby("colA").ngroups > 1)

which gives:

grA
    colA     colB
0  Apple   squash
1  Apple  lettuce

and

grB
    colA    colB
6  melon  potato
7  melon  potato
8   pear  potato

Concatenating the two dataframes gives the desired output.
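The concatenation step is not spelled out above, so here is one way it could look. This is a sketch that assumes `df2` holds the question's data under the column names `colA` and `colB`. Dropping duplicated index labels and sorting are additions of mine, for the case where a row passes both filters.

```python
import pandas as pd

# Question data, with the column names used in this answer (assumed)
df2 = pd.DataFrame({
    "colA": ["Apple", "Apple", "Banana", "Banana", "Banana",
             "dragon", "melon", "melon", "pear"],
    "colB": ["squash", "lettuce", "Carrot", "Carrot", "Carrot",
             "turnip", "potato", "potato", "potato"],
})

# Keep colA groups that contain more than one distinct colB value
grA = df2.groupby("colA").filter(lambda x: x.groupby("colB").ngroups > 1)
# Keep colB groups that contain more than one distinct colA value
grB = df2.groupby("colB").filter(lambda x: x.groupby("colA").ngroups > 1)

# Stack both results; a row selected by both filters would appear twice,
# so keep only the first occurrence of each index label, then restore row order
out = pd.concat([grA, grB])
out = out[~out.index.duplicated()].sort_index()

print(out)
```

On this data no row is selected by both filters, so the de-duplication is a no-op, but it keeps the result correct for datasets where the two selections overlap.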

python

if-statement

filter

dataframe

pandas