Filtering unique values by status, excluding any unique value that also appears with a conflicting status

I'm looking for a way to filter for unique values that have an Inactive status, excluding any unique value that also appears with an Active status.

df:

Unique_value    Status
1               Active        <- Has both active and inactive, must be inactive only
1               Active        <- Has both active and inactive, must be inactive only
1               Inactive      <- Has both active and inactive, must be inactive only
1               Inactive      <- Has both active and inactive, must be inactive only
2               Inactive      <- Has inactive only
2               Inactive      <- Has inactive only
2               Inactive      <- Has inactive only
3               Inactive      <- Has inactive only (cancelled okay to be filtered out)
3               Cancelled     <- Has inactive only (cancelled okay to be filtered out)
3               Inactive      <- Has inactive only (cancelled okay to be filtered out)
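
For reproducibility, the frame above can be built roughly like this. This is a minimal sketch: the dtypes (integer IDs, string statuses) are an assumption, since the question only shows the printed table.

```python
import pandas as pd

# Sample data copied from the table above; dtypes are assumed.
df = pd.DataFrame({
    "Unique_value": [1, 1, 1, 1, 2, 2, 2, 3, 3, 3],
    "Status": [
        "Active", "Active", "Inactive", "Inactive",   # group 1: mixed
        "Inactive", "Inactive", "Inactive",           # group 2: Inactive only
        "Inactive", "Cancelled", "Inactive",          # group 3: Inactive + Cancelled
    ],
})
print(df)
```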

Expected output:

Unique_value    status
2               Inactive
3               Inactive

This is what I have tried so far, but I don't think it is correct.

p = ['Inactive', 'Active']
df.groupby('Unique_value')['Status'].apply(lambda x: (x =='Inactive') != set(p))

Let's try:

g = df[df.groupby('Unique_value')['Status'].transform(lambda x: not x.eq('Active').any())]

g[g['Status'].eq('Inactive')].drop_duplicates()
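
The same idea (drop every group that contains an Active row, then keep the Inactive rows and de-duplicate) can also be sketched with `groupby(...).filter`. This is only an alternative spelling, not necessarily what the answer above intended, and the sample frame is rebuilt here with assumed dtypes so the snippet runs on its own.

```python
import pandas as pd

# Sample data from the question (dtypes assumed).
df = pd.DataFrame({
    "Unique_value": [1, 1, 1, 1, 2, 2, 2, 3, 3, 3],
    "Status": ["Active", "Active", "Inactive", "Inactive",
               "Inactive", "Inactive", "Inactive",
               "Inactive", "Cancelled", "Inactive"],
})

# Keep only the groups in which no row is "Active".
no_active = df.groupby("Unique_value").filter(
    lambda grp: not grp["Status"].eq("Active").any()
)

# From those groups keep the "Inactive" rows and collapse duplicates.
result = (
    no_active[no_active["Status"].eq("Inactive")]
    .drop_duplicates()
    .reset_index(drop=True)
)
print(result)
```

Using `not` rather than `~` on the result of `.any()` avoids relying on how a scalar boolean handles bitwise negation.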

First, check whether `any` of the values in each group is Active, and whether any is Inactive. Then drop the groups where both conditions are true:

m1 = df["Status"].eq("Active").groupby(df["Unique_value"]).transform("any")
m2 = df["Status"].eq("Inactive").groupby(df["Unique_value"]).transform("any")
df[~(m1 & m2)].groupby("Unique_value", as_index=False).first()

   Unique_value    Status
0             2  Inactive
1             3  Inactive
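
One caveat worth checking against your real data: `.first()` returns whatever status comes first in each remaining group, and `~(m1 & m2)` also keeps groups that are Active only. The sketch below uses hypothetical data (group 3 reordered so Cancelled comes first, plus an invented Active-only group 4) to show the difference, and adds an explicit Inactive filter as one possible way to guard against it.

```python
import pandas as pd

# Hypothetical variation of the question's data:
# group 3 starts with "Cancelled", and group 4 (invented) is Active only.
df = pd.DataFrame({
    "Unique_value": [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4],
    "Status": ["Active", "Active", "Inactive", "Inactive",
               "Inactive", "Inactive", "Inactive",
               "Cancelled", "Inactive", "Inactive",
               "Active", "Active"],
})

m1 = df["Status"].eq("Active").groupby(df["Unique_value"]).transform("any")
m2 = df["Status"].eq("Inactive").groupby(df["Unique_value"]).transform("any")

# Approach from the answer above: first row of each non-mixed group.
first_per_group = df[~(m1 & m2)].groupby("Unique_value", as_index=False).first()

# Stricter variant: additionally require the row itself to be "Inactive".
strict = (
    df[~(m1 & m2) & df["Status"].eq("Inactive")]
    .drop_duplicates()
    .reset_index(drop=True)
)

print(first_per_group)
print(strict)
```

On the data in the question both variants give the same two rows; they only diverge on inputs like the hypothetical one here.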