Filter out rows where all the grouped elements are equal to zero

Question

My df:

id1 id2  uid . . . 
 1  100   0
 1  101 1000
 1  101 1000
 2  102   0
 2  103   0
 3  104 1002
 3  104 1002
 3  104 1002
 3  104   0
 3  105   0
 3  106   0
 4  107   0
 4  107   0
 4  108   0
 4  108   0

I want to group by id1 and filter out every id1 whose uid values are all zero.

I tried the following:

df = df.groupby(by = 'id1').filter(lambda x: x['uid'].sum() > 0).reset_index(drop = True)

But the problem is that it sums the non-zero uids, and in doing so it creates random uids.

Desired result:

id1 id2  uid . . . 
 1  100   0
 1  101 1000
 1  101 1000
 3  104 1002
 3  104 1002
 3  104 1002
 3  104   0
 3  105   0
 3  106   0

Answer 1

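You can flag the rows where "uid" is not equal to 0, then use a groupby transform with max on "id1" to select every row whose "id1" group contains at least one non-zero "uid" (for example, "id1" = 4 is dropped because all of its "uid" values are 0):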

out = df[df['uid'].ne(0).groupby(df['id1']).transform('max')]

Output:

    id1  id2   uid
0     1  100     0
1     1  101  1000
2     1  101  1000
5     3  104  1002
6     3  104  1002
7     3  104  1002
8     3  104     0
9     3  105     0
10    3  106     0
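
To make the steps easier to follow, here is a minimal sketch that rebuilds the question's data by hand and splits the one-liner into its parts. The variable names nonzero, keep and alt are only illustrative, and 'any' is used in place of 'max' as an assumption that it reads more clearly; on a boolean column both should select the same rows.

```python
import pandas as pd

# Sample data copied from the question (only the three columns shown there)
df = pd.DataFrame({
    "id1": [1, 1, 1, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4],
    "id2": [100, 101, 101, 102, 103, 104, 104, 104, 104, 105, 106,
            107, 107, 108, 108],
    "uid": [0, 1000, 1000, 0, 0, 1002, 1002, 1002, 0, 0, 0, 0, 0, 0, 0],
})

# Step 1: boolean Series, True where uid is non-zero
nonzero = df["uid"].ne(0)

# Step 2: for each id1 group, broadcast "does any row have a non-zero uid?"
# back onto every row of that group
keep = nonzero.groupby(df["id1"]).transform("any")

# Step 3: boolean indexing keeps the original rows and index labels
out = df[keep]

# Alternative sketch: groupby.filter with an explicit "any non-zero" test
alt = df.groupby("id1").filter(lambda g: g["uid"].ne(0).any())

print(out)
```

Both variants drop or keep whole id1 groups and leave the uid values in the surviving rows as they were. Add .reset_index(drop=True) if a fresh 0..n-1 index is wanted, as in the question's attempt.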
