Python Pandas: replace zeros with NaN in multiple columns

I loaded a list of people's attributes into a pandas DataFrame, df2. For cleanup, I want to replace zero values (0 or '0') with np.nan.

df2.dtypes

ID                   object
Name                 object
Weight              float64
Height              float64
BootSize             object
SuitSize             object
Type                 object
dtype: object

Working code that sets zero values to np.nan:

df2.loc[df2['Weight'] == 0,'Weight'] = np.nan
df2.loc[df2['Height'] == 0,'Height'] = np.nan
df2.loc[df2['BootSize'] == '0','BootSize'] = np.nan
df2.loc[df2['SuitSize'] == '0','SuitSize'] = np.nan

I believe this can be done in a similar but shorter way:

df2[["Weight","Height","BootSize","SuitSize"]].astype(str).replace('0',np.nan)

But the above does not work. The zeros remain in df2. How can I fix this?

I think you need to replace by dict and assign the result back to the selected columns:

cols = ["Weight","Height","BootSize","SuitSize","Type"]
df2[cols] = df2[cols].replace({'0':np.nan, 0:np.nan})
data['amount']=data['amount'].replace(0, np.nan)
data['duration']=data['duration'].replace(0, np.nan)
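To see why the one-liner from the question leaves the zeros in place, here is a minimal sketch. The sample values are made up for illustration; only the column names and dtypes follow the question. Two things go wrong: astype(str) turns the float 0.0 into the text '0.0', which never equals '0', and replace returns a new object that is thrown away unless it is assigned.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; dtypes mirror df2.dtypes from the question.
df2 = pd.DataFrame({
    "Weight": [70.5, 0.0, 82.0],
    "Height": [0.0, 1.80, 1.75],
    "BootSize": pd.Series(["42", "0", "44"], dtype=object),
    "SuitSize": pd.Series(["0", "50", "52"], dtype=object),
})
cols = ["Weight", "Height", "BootSize", "SuitSize"]

# Problem 1: astype(str) renders the float 0.0 as "0.0", not "0".
as_text = df2[cols].astype(str)
print(as_text["Weight"].tolist())  # ['70.5', '0.0', '82.0']

# Problem 2: the result is a new object; df2 is untouched unless we assign.
result = df2[cols].astype(str).replace("0", np.nan)
print(int(result["Weight"].isna().sum()))  # 0: "0.0" did not match "0"
print(int((df2["Weight"] == 0).sum()))     # 1: df2 still holds its zero

# Fix: keep the original dtypes, replace both spellings, assign back.
df2[cols] = df2[cols].replace({"0": np.nan, 0: np.nan})
print(df2.isna().sum().to_dict())
```

After the last step each of the four columns should contain exactly one missing value, and Weight and Height keep their float64 dtype because no string conversion was involved.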

You can use the replace method, passing a list of the values to replace as the first argument and the desired value as the second:

cols = ["Weight","Height","BootSize","SuitSize","Type"]
df2[cols] = df2[cols].replace(['0', 0], np.nan)

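Try a nested dict that gives each column its own mapping. Note that replace returns a new DataFrame, so assign the result back:

df2 = df2.replace(to_replace={
    'Weight': {0: np.nan},
    'Height': {0: np.nan},
    'BootSize': {'0': np.nan},
    'SuitSize': {'0': np.nan},
})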
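The per-column form is handy when a zero is a legitimate value in some columns. A small sketch, assuming a hypothetical frame in which Type uses '0' as a real category code that should survive the cleanup:

```python
import numpy as np
import pandas as pd

# Hypothetical data: "Type" uses "0" as a genuine code, not a placeholder.
df2 = pd.DataFrame({
    "Weight": [0.0, 70.5],
    "BootSize": pd.Series(["0", "42"], dtype=object),
    "Type": pd.Series(["0", "1"], dtype=object),
})

# Nested dict: {column: {old_value: new_value}}. Columns not listed are skipped.
df2 = df2.replace({
    "Weight": {0: np.nan},
    "BootSize": {"0": np.nan},
})

print(df2["Type"].tolist())             # ['0', '1']
print(df2["Weight"].isna().tolist())    # [True, False]
print(df2["BootSize"].isna().tolist())  # [True, False]
```

Compared with replacing across a list of columns, this costs a few more lines but makes it explicit which representation of zero is expected in which column.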

Another alternative, using mask (which sets cells to NaN wherever the condition is True):

cols = ["Weight","Height","BootSize","SuitSize","Type"]
df2[cols] = df2[cols].mask(df2[cols].eq(0) | df2[cols].eq('0'))

Replace zeros in the 'age' column with an empty string

df['age'] = df['age'].replace(['0', 0], '')

Replace zeros with NaN in a single column

df['age'] = df['age'].replace(0, np.nan)

Replace zeros with NaN in multiple columns

cols = ["Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI"]

df[cols] = df[cols].replace(['0', 0], np.nan)

Replace zeros with NaN across the whole DataFrame

df.replace(0, np.nan, inplace=True)

If you just want to replace the zeros in the entire DataFrame, you can replace them directly without specifying any columns:

df = df.replace({0:pd.NA})
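Whether to use np.nan or pd.NA as the replacement is a judgment call. The sketch below uses a made-up two-column frame to compare the two. Both markers are picked up by isna(), but the column dtypes that result can differ (and may vary between pandas versions), so it is worth printing dtypes before doing arithmetic on the cleaned data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one integer and one float column.
df = pd.DataFrame({"count": [0, 1, 2], "score": [0.0, 1.5, 0.0]})

with_nan = df.replace(0, np.nan)   # float NaN as the missing marker
with_na = df.replace({0: pd.NA})   # pandas' NA scalar as the missing marker

# isna() recognises both markers, so the missing-value counts agree.
n_nan = int(with_nan.isna().sum().sum())
n_na = int(with_na.isna().sum().sum())
print(n_nan, n_na)

# Inspect the dtypes rather than assuming them; they are not guaranteed to match.
print(with_nan.dtypes.to_dict())
print(with_na.dtypes.to_dict())
```

Neither call modifies df itself, which is why the results are bound to new names here.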