Removing duplicates case-insensitively and summing the next column's values into the first occurrence in a pandas DataFrame (Python)

Question

I have a df:

Name    Count
Ram     1
ram     2
raM     1
Arjun   3
arjun   4

The output df I want:

Name    Count
Ram     4
Arjun   7

I tried groupby but could not get the required output. Please help.

Answer 1

Group by the values of Name converted to lowercase, then use agg with first for Name and sum for Count:

df = (df.groupby(df['Name'].str.lower(), as_index=False, sort=False)
        .agg({'Name':'first', 'Count':'sum'}))
print (df)
    Name  Count
0    Ram      4
1  Arjun      7

Details (the lowercased grouping key, printed on the original df before it is reassigned):

print (df['Name'].str.lower())
0      ram
1      ram
2      ram
3    arjun
4    arjun
Name: Name, dtype: object
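
For anyone who wants to paste and run this, here is a minimal self-contained sketch of the same idea. The sample frame is rebuilt from the question. The names key and result are my own, and renaming the key to "key" is just an assumption I made so the grouping Series does not share a name with the Name column. Named aggregation is used instead of the dict form; it should be equivalent here.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Name": ["Ram", "ram", "raM", "Arjun", "arjun"],
    "Count": [1, 2, 1, 3, 4],
})

# Case-insensitive grouping key; "key" is a hypothetical name chosen
# to avoid clashing with the existing "Name" column
key = df["Name"].str.lower().rename("key")

result = (
    df.groupby(key, sort=False)           # sort=False keeps first-seen order
      .agg(Name=("Name", "first"),        # keep the first spelling encountered
           Count=("Count", "sum"))        # add up the counts per group
      .reset_index(drop=True)             # drop the lowercase key from the index
)
print(result)
```

Because the key is only used for grouping, the spelling that survives is whatever appears first in each group ("Ram" and "Arjun" for this data).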

Answer 2

In [71]: df.assign(Name=df['Name'].str.capitalize()).groupby('Name', as_index=False).sum()
Out[71]:
    Name  Count
0  Arjun      7
1    Ram      4
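
Two things to be aware of with this approach: the result is sorted alphabetically by default (hence Arjun before Ram), and the names are rewritten by capitalize rather than taken from the first row of each group. If the original row order matters, passing sort=False should keep it. A small sketch, with the sample frame rebuilt from the question and out being my own variable name:

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Name": ["Ram", "ram", "raM", "Arjun", "arjun"],
    "Count": [1, 2, 1, 3, 4],
})

out = (
    df.assign(Name=df["Name"].str.capitalize())       # normalize spelling, e.g. "raM" -> "Ram"
      .groupby("Name", as_index=False, sort=False)    # sort=False keeps first-seen order
      .sum()
)
print(out)
```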

Answer 3

Grouping by the title-cased strings reduces the number of steps needed.

df.Count.groupby(df.Name.str.title()).sum().reset_index()
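
A runnable version of the line above, using the sample data from the question (res and names are my own variable names). It also shows how title differs from capitalize on multi-word names, which may matter if your real data has them; the "ram kumar" value is made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Name": ["Ram", "ram", "raM", "Arjun", "arjun"],
    "Count": [1, 2, 1, 3, 4],
})

# Group the Count column by the title-cased names, then restore Name as a column
res = df.Count.groupby(df.Name.str.title()).sum().reset_index()
print(res)

# Hypothetical multi-word name, to compare the two string methods
names = pd.Series(["ram kumar"])
print(names.str.title()[0])        # every word capitalized
print(names.str.capitalize()[0])   # only the first character capitalized
```

As with the capitalize approach, groups come back sorted by name unless sort=False is passed to groupby.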

