Pandas find and replace based on column items count

I have a DataFrame that looks like this:

import pandas as pd

all_data_set = [
        ('A','Area1','AA','A B D E','A B','D E'),
        ('B','Area1','AA','A B D E','A B','D E'),
        ('C','Area2','BB','C','C','C'),
        ('E','Area1','CC','A B D E','A B','D E'),
        ('F','Area3','BB','F G','G','F')
        ]

all_df = pd.DataFrame(data = all_data_set, columns = ['Name','Area','Type','Group','AA members','CC members'])

  Name   Area Type    Group AA members CC members
0    A  Area1   AA  A B D E        A B        D E
1    B  Area1   AA  A B D E        A B        D E
2    C  Area2   BB        C          C          C
3    E  Area1   CC  A B D E        A B        D E
4    F  Area3   BB      F G          G          F

The last row (row 4) is incorrect. Any row of Type BB should contain only its own Name (F in this case) in Group, AA members and CC members.

So it should look like this:

4    F  Area3   BB        F          F          F

I tried to do it like this:

  1. Check where Type is BB and Group has a length of 2 items:

    # all_df is the DataFrame; all_data_set is only the raw list of tuples
    df = (all_df.loc[all_df['Type'] == 'BB', 'Group'].str.split().str.len() == 2)

  2. Then iterate over each row and find such cases

  3. Create a new DataFrame with all of those rows and set Group, AA members, CC members = Name

  4. Delete those rows from all_df

  5. Merge 3. back into all_df
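
For comparison, the five steps could be sketched roughly as follows. This is only one reading of the plan; `member_cols`, `is_bad`, `fixed` and `result` are names made up for illustration.

```python
import pandas as pd

all_df = pd.DataFrame(
    [
        ('A', 'Area1', 'AA', 'A B D E', 'A B', 'D E'),
        ('B', 'Area1', 'AA', 'A B D E', 'A B', 'D E'),
        ('C', 'Area2', 'BB', 'C', 'C', 'C'),
        ('E', 'Area1', 'CC', 'A B D E', 'A B', 'D E'),
        ('F', 'Area3', 'BB', 'F G', 'G', 'F'),
    ],
    columns=['Name', 'Area', 'Type', 'Group', 'AA members', 'CC members'],
)
# hypothetical helper name for the three columns to overwrite
member_cols = ['Group', 'AA members', 'CC members']

# steps 1-2: find BB rows whose Group holds exactly 2 items
is_bad = (all_df['Type'] == 'BB') & (all_df['Group'].str.split().str.len() == 2)

# step 3: copy those rows and overwrite the member columns with Name
fixed = all_df[is_bad].copy()
for col in member_cols:
    fixed[col] = fixed['Name']

# steps 4-5: drop the bad rows, add the fixed ones back, restore row order
result = pd.concat([all_df[~is_bad], fixed]).sort_index()
print(result)
```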

Is there a better pandas way?

Try:

# identify rows where Type is BB
m = all_df['Type'] == 'BB'
# for Type BB rows, replace Group, AA members and CC members values by Name
all_df.loc[m, ['Group', 'AA members', 'CC members']] = all_df.loc[m, 'Name']
print(all_df)
  Name   Area Type    Group AA members CC members
0    A  Area1   AA  A B D E        A B        D E
1    B  Area1   AA  A B D E        A B        D E
2    C  Area2   BB        C          C          C
3    E  Area1   CC  A B D E        A B        D E
4    F  Area3   BB        F          F          F
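
The mask above matches every BB row. If only BB rows whose Group lists more than one item should be touched, as the question title suggests, the item count can be added to the mask. A minimal sketch: `member_cols` is a name introduced here, and treating "more than one item" as the condition is an assumption.

```python
import pandas as pd

all_df = pd.DataFrame(
    [
        ('A', 'Area1', 'AA', 'A B D E', 'A B', 'D E'),
        ('B', 'Area1', 'AA', 'A B D E', 'A B', 'D E'),
        ('C', 'Area2', 'BB', 'C', 'C', 'C'),
        ('E', 'Area1', 'CC', 'A B D E', 'A B', 'D E'),
        ('F', 'Area3', 'BB', 'F G', 'G', 'F'),
    ],
    columns=['Name', 'Area', 'Type', 'Group', 'AA members', 'CC members'],
)
# hypothetical helper name for the three columns to overwrite
member_cols = ['Group', 'AA members', 'CC members']

# Type is BB and Group contains more than one whitespace-separated item
# (the "> 1" threshold is an assumption)
m = (all_df['Type'] == 'BB') & (all_df['Group'].str.split().str.len() > 1)

# assign one column at a time; each assignment aligns on the row index
for col in member_cols:
    all_df.loc[m, col] = all_df.loc[m, 'Name']

print(all_df)
```

Row 2 is left alone because its Group already holds a single item.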

You can try iloc and a for loop.

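# iterate over row positions; column 0 is Name, column 2 is Type,
# columns 3 onward are Group, AA members and CC members
for row in range(len(all_df)):
    if all_df.iloc[row, 2] == "BB":
        all_df.iloc[row, 3:] = all_df.iloc[row, 0]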
        
all_df

  Name   Area Type    Group AA members CC members
0    A  Area1   AA  A B D E        A B        D E
1    B  Area1   AA  A B D E        A B        D E
2    C  Area2   BB        C          C          C
3    E  Area1   CC  A B D E        A B        D E
4    F  Area3   BB        F          F          F
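
If the DataFrame does not have the default 0..n-1 index, or the column order may change, a label-based version of the same loop with loc is one option. A sketch under assumptions: the string index `r0`..`r4` and the name `member_cols` are hypothetical and only serve the illustration.

```python
import pandas as pd

all_df = pd.DataFrame(
    [
        ('A', 'Area1', 'AA', 'A B D E', 'A B', 'D E'),
        ('B', 'Area1', 'AA', 'A B D E', 'A B', 'D E'),
        ('C', 'Area2', 'BB', 'C', 'C', 'C'),
        ('E', 'Area1', 'CC', 'A B D E', 'A B', 'D E'),
        ('F', 'Area3', 'BB', 'F G', 'G', 'F'),
    ],
    columns=['Name', 'Area', 'Type', 'Group', 'AA members', 'CC members'],
    index=['r0', 'r1', 'r2', 'r3', 'r4'],  # hypothetical non-default index
)
# hypothetical helper name for the three columns to overwrite
member_cols = ['Group', 'AA members', 'CC members']

# address rows by index label and columns by name instead of by position
for label in all_df.index:
    if all_df.loc[label, 'Type'] == 'BB':
        all_df.loc[label, member_cols] = all_df.loc[label, 'Name']

print(all_df)
```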