How to change only the Pandas Series values that do not match elements from an external list?

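I have a list of reference words that I want to keep in the 'columnName' column. If a value is not an element of list_excluded, I want to replace it with 'other'. Here is my attempt: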

list_excluded = ['egWord1', 'egWord2']

df['new'] = df['old']

# I only want to change values in 'new' column to 'other' if the value is not 'egWord1' or 'egWord2'
df.loc[df['new'] == 'other', df['columnName']] = list_excluded

You can use apply() to keep the values that appear in the list and replace everything else, for example:

Code:

df['new'] = df['old'].apply(lambda x: x if x in list_excluded else 'other')

Test code:

import pandas as pd

list_excluded = ['egWord1', 'egWord2']

df = pd.DataFrame(
    ['egWord1', 'egWord2', 'XegWord1', 'YegWord2'], columns=['old'])

# Keep values found in list_excluded; replace all others with 'other'
df['new'] = df['old'].apply(lambda x: x if x in list_excluded else 'other')

print(df)

Result:

        old      new
0   egWord1  egWord1
1   egWord2  egWord2
2  XegWord1    other
3  YegWord2    other

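It is hard to be sure without sample data or the desired output, but it sounds like you are trying to select the values that are not in the list and then set df['new'] to 'other' for those rows. Is that right? If so, try this: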

df.loc[~df['columnName'].isin(list_excluded), 'new'] = 'other'

This assumes df['new'] has already been populated (the other answer combines the two steps into one).
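
For reference, here is a minimal end-to-end sketch of the two-step approach (copy first, then overwrite with a boolean mask). The sample frame and the use of an 'old' column in place of 'columnName' are assumptions for illustration, since the question does not show the real data.

```python
import pandas as pd

list_excluded = ['egWord1', 'egWord2']

# Hypothetical sample data; the real frame is not shown in the question.
df = pd.DataFrame({'old': ['egWord1', 'egWord2', 'XegWord1', 'YegWord2']})

# Step 1: copy the source column so 'new' exists before it is modified.
df['new'] = df['old']

# Step 2: build a mask of rows whose value is NOT in the list,
# then overwrite only those rows in the 'new' column.
mask = ~df['old'].isin(list_excluded)
df.loc[mask, 'new'] = 'other'

print(df)
```

Note that the second argument to .loc is the column label 'new', not the Series df['new']; passing the Series would make pandas treat its values as column labels.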

A slightly faster solution:

df['new'] = np.where(~df.old.isin(list_excluded), 'other', df.old)
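
A self-contained sketch of the vectorized approach, with the numpy import it needs. The sample frame is hypothetical, and the 'new_alt' column using Series.where is an alternative I am adding for comparison, not something from the answers above.

```python
import numpy as np
import pandas as pd

list_excluded = ['egWord1', 'egWord2']

# Hypothetical sample data for illustration.
df = pd.DataFrame({'old': ['egWord1', 'egWord2', 'XegWord1', 'YegWord2']})

# True where the value should be kept as-is.
keep = df['old'].isin(list_excluded)

# np.where(condition, value_if_true, value_if_false)
df['new'] = np.where(keep, df['old'], 'other')

# Pandas-only alternative: Series.where keeps values where the
# condition is True and substitutes 'other' everywhere else.
df['new_alt'] = df['old'].where(keep, 'other')

print(df)
```

Both forms avoid calling a Python function per row, which is why they tend to scale better than apply() on large frames; actual timings depend on the data.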