Pandas: groupby then count based on condition gives NaN
I have the following dataset:
+----+------+
| ID | Type |
+----+------+
| a  | New  |
| b  | Old  |
| b  | Old  |
| b  | New  |
| c  | Old  |
+----+------+
I am trying to group by ID and then count how many times New occurs in each group. For the data above, I would expect a=1, b=1, and c=0.
This is what I tried:
df['NewAmount'] = df.groupby('ID')['Type'].apply(
    lambda x: x[x == 'New'].count())
This is what I get:
+----+------+-----------+
| ID | Type | NewAmount |
+----+------+-----------+
| a  | New  | NaN       |
| b  | Old  | NaN       |
| b  | Old  | NaN       |
| b  | New  | NaN       |
| c  | Old  | NaN       |
+----+------+-----------+
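You should try transform. groupby(...).apply(...) returns one value per group, indexed by ID (a, b, c). Assigning that to a column of df aligns on index labels, and since the index of df is 0 to 4, no label matches and every row ends up NaN. transform returns a result aligned to the original rows instead: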
df['out'] = df['Type'].eq('New').groupby(df['ID']).transform('sum')
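To make the difference concrete, here is a minimal sketch that rebuilds the dataset from the question and compares the two approaches. The names per_group, misaligned, and NewAmountViaMap are only illustrative and do not appear in the original post; the map variant is an alternative I am adding, not part of the answer above.

```python
import pandas as pd

# Dataset from the question.
df = pd.DataFrame({
    "ID": ["a", "b", "b", "b", "c"],
    "Type": ["New", "Old", "Old", "New", "Old"],
})

# apply with a scalar-returning function yields one value per group,
# indexed by the group keys 'a', 'b', 'c'.
per_group = df.groupby("ID")["Type"].apply(lambda x: (x == "New").sum())

# Column assignment aligns on index labels. Reindexing to df's 0..4 index
# mimics that alignment: no label matches, so every value is NaN.
misaligned = per_group.reindex(df.index)

# transform broadcasts each group's sum back onto the original rows.
df["NewAmount"] = df["Type"].eq("New").groupby(df["ID"]).transform("sum")

# Illustrative alternative: look up each row's ID in the per-group counts.
df["NewAmountViaMap"] = df["ID"].map(per_group)

print(per_group)
print(df)
```

If you only need one row per ID rather than a new column on the original frame, per_group on its own is probably enough.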