Filtering pandas with multiple conditions in a better way

I want to assign names to a new column, "subTestName", based on the values in the "subtestID" column.

My current code looks like this:

conditions = [(df4['subtestID'] == 325)|(df4['subtestID'] == 341)|(df4['subtestID'] == 1164)|(df4['subtestID'] == 1200),
              (df4['subtestID'] == 347)|(df4['subtestID'] == 357)|(df4['subtestID'] == 1308)|(df4['subtestID'] == 1330),
              (df4['subtestID'] == 328)|(df4['subtestID'] == 344)|(df4['subtestID'] == 1167)|(df4['subtestID'] == 1203)]

values = ["TestName1","TestName2","TestName3"]

df4['subTestName'] = np.select(conditions, values)

I would like to rewrite this in a better way, without repeating df4['subtestID'] every time I assign a new ID. I plan to assign 30 more subtest names.

I tried the following, but it raises an error:

df4['subtestID'] in (325,341,1164,1200) 

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

Is there another way to assign names based on subtestID?

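The `in` operator needs a single True/False answer for the whole Series, which is why it raises that error. You can simply use isin() instead, which checks each element separately: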

df4['subtestID'].isin([325,341,1164,1200]) 
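To make the behaviour concrete, here is a minimal sketch. The sample values in df4 are made up for illustration; only the column name comes from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name is taken from the question.
df4 = pd.DataFrame({"subtestID": [325, 347, 1203, 999]})

# isin() takes one list-like argument and returns a boolean Series,
# one True/False per row, which is what np.select expects as a condition.
mask = df4["subtestID"].isin([325, 341, 1164, 1200])
print(mask.tolist())  # [True, False, False, False]
```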

You can try changing your code to:

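# Keep one reference to the column so it is not repeated in every condition.
s_id = df4['subtestID']
# isin() takes a single list-like argument, so each group of IDs goes in a list.
conditions = [s_id.isin([325, 341, 1164, 1200]),
              s_id.isin([347, 357, 1308, 1330]),
              s_id.isin([328, 344, 1167, 1203])]

values = ["TestName1", "TestName2", "TestName3"]
# The third argument is the default name for IDs that match none of the lists.
df4['subTestName'] = np.select(conditions, values, 'other name')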
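Since you plan to add about 30 more names, another option you might consider is a dictionary lookup with Series.map() instead of np.select. This is only a sketch of that alternative, not something from the answers above: the dictionary names (name_to_ids, id_to_name) and the sample rows are hypothetical, and it assumes each ID belongs to exactly one name.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df4 = pd.DataFrame({"subtestID": [325, 347, 1203, 999]})

# One entry per name; adding a new subtest name means adding one line here.
name_to_ids = {
    "TestName1": [325, 341, 1164, 1200],
    "TestName2": [347, 357, 1308, 1330],
    "TestName3": [328, 344, 1167, 1203],
}

# Invert into a flat ID -> name lookup (assumes no ID appears under two names).
id_to_name = {i: name for name, ids in name_to_ids.items() for i in ids}

# map() looks up each ID; IDs missing from the dictionary become NaN,
# which fillna() then replaces with a fallback label.
df4["subTestName"] = df4["subtestID"].map(id_to_name).fillna("other name")
print(df4["subTestName"].tolist())
# ['TestName1', 'TestName2', 'TestName3', 'other name']
```

With this layout the ID lists live in one place, so there are no parallel conditions and values lists to keep in sync.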