Python pandas: accumulate data frame rows by condition

Question

I have a data frame with 2 columns, in the following format:

Anna         15
Mary         14
Elizabeth    11
Margaret     10
Alice         6
Bertha        5
Helen         5
Emily         4
Maria         4
Marie         4
Catherine     4
Marion        4
Ellen         4
Florence      4
Augusta       4
...
Juliette      1
Mara          1
Elise         1
Alfrida       1
Nourelain     1
Margaretta    1
Manca         1
Aloisia       1
Hulda         1
Clear         1
Wendla        1
Ellis         1
Lulu          1
Juliet        1
Gertrude      1

How can I accumulate the rows with value < 5 to get something like this:

Anna         15
Mary         14
Elizabeth    11
Margaret     10
Alice         6
Bertha        5
Helen         5
Other        50

Answer 1

Here is one way:

import numpy as np
import pandas as pd
# create some random data
df = pd.DataFrame({'letter': list('qwertyuiopasdfghjklzxcvbnm'), 'value': np.random.randint(1, 15, 26)})

Define a function that replaces the letter with 'other' when the value is < 5:

def f(x):
    # x is one row of df
    if x.value < 5:
        l = 'other'
    else:
        l = x.letter
    return l

Apply the function to each row of the data frame:

df['letter'] = df.apply(f, axis=1)
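
The same relabelling can probably be done without a row-wise apply. The following is a minimal sketch using `Series.where`, assuming the same `letter` and `value` columns; the small fixed sample is made up for illustration and is not the random data above.

```python
import pandas as pd

# Small fixed sample (hypothetical values) with the same columns as above
df = pd.DataFrame({'letter': list('abcde'), 'value': [9, 3, 5, 1, 4]})

# Keep the letter where value >= 5, otherwise substitute 'other'
df['letter'] = df['letter'].where(df['value'] >= 5, 'other')

print(df['letter'].tolist())
```

On larger frames this form is usually faster than `apply(f, axis=1)`, since the comparison runs on the whole column at once.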

Group by the new letter column and sum:

df.groupby('letter').sum()
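
For data shaped like the question's (names as the index, counts as the values), a similar result can be sketched without groupby by splitting on a boolean mask and appending the sum of the small rows. This keeps the original row order and puts 'Other' last. The name `counts` and the shortened sample are assumptions for illustration, so the 'Other' total here differs from the 50 in the question.

```python
import pandas as pd

# Shortened, hypothetical subset of the question's data
counts = pd.Series(
    [15, 14, 11, 10, 6, 5, 5, 4, 4, 1, 1],
    index=['Anna', 'Mary', 'Elizabeth', 'Margaret', 'Alice', 'Bertha',
           'Helen', 'Emily', 'Maria', 'Juliette', 'Mara'],
)

# True for the rows that should be folded into 'Other'
small = counts < 5

# Rows with value >= 5 stay as they are; the rest collapse into one summed row
result = pd.concat([counts[~small], pd.Series({'Other': counts[small].sum()})])

print(result)
```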

python

pandas