Python Pandas: replace zeros in multiple columns with NaN
I have loaded a list of personal attributes into a pandas DataFrame df2. For cleaning, I want to replace the value zero (0 or '0') with np.nan.
df2.dtypes
ID object
Name object
Weight float64
Height float64
BootSize object
SuitSize object
Type object
dtype: object
Working code that sets the zero values to np.nan:
df2.loc[df2['Weight'] == 0,'Weight'] = np.nan
df2.loc[df2['Height'] == 0,'Height'] = np.nan
df2.loc[df2['BootSize'] == '0','BootSize'] = np.nan
df2.loc[df2['SuitSize'] == '0','SuitSize'] = np.nan
I believe this can be done in a similar but shorter way:
df2[["Weight","Height","BootSize","SuitSize"]].astype(str).replace('0',np.nan)
But the above does not work: the zeros remain in df2. How can I fix this?
I think you need replace with a dict:
cols = ["Weight","Height","BootSize","SuitSize","Type"]
df2[cols] = df2[cols].replace({'0':np.nan, 0:np.nan})
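As a minimal sketch of why the attempt in the question leaves the zeros in place while the dict version works: the sample values below are hypothetical and only mirror the dtypes shown in the question. Two things appear to go wrong in the attempt. astype(str) turns the float 0.0 into the string '0.0', which does not match '0', and the result of replace is never assigned back to df2.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the dtypes in the question.
df2 = pd.DataFrame({
    "ID": pd.Series(["a1", "a2", "a3"], dtype=object),
    "Weight": [70.5, 0.0, 82.0],
    "Height": [0.0, 165.0, 180.0],
    "BootSize": pd.Series(["42", "0", "44"], dtype=object),
    "SuitSize": pd.Series(["0", "M", "L"], dtype=object),
})
cols = ["Weight", "Height", "BootSize", "SuitSize"]

# The attempt from the question: the float zero becomes the string '0.0',
# so it is not matched by '0', and df2 itself is left unchanged.
attempt = df2[cols].astype(str).replace("0", np.nan)
print(attempt["Weight"].tolist())

# Replace both the numeric and the string zero, then assign the result back.
df2[cols] = df2[cols].replace({"0": np.nan, 0: np.nan})
print(df2[cols].isna().sum())
```

Assigning back through df2[cols] keeps the untouched columns (here ID) as they were.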
data['amount']=data['amount'].replace(0, np.nan)
data['duration']=data['duration'].replace(0, np.nan)
You can use the replace method, passing the values to be replaced as a list in the first argument and the desired value as the second argument:
cols = ["Weight","Height","BootSize","SuitSize","Type"]
df2[cols] = df2[cols].replace(['0', 0], np.nan)
Try a per-column mapping (replace returns a new DataFrame, so assign the result back):
df2 = df2.replace(to_replace={
    'Weight': {0: np.nan},
    'Height': {0: np.nan},
    'BootSize': {'0': np.nan},
    'SuitSize': {'0': np.nan},
})
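The nested-dict form can be useful when a zero is meaningful in some columns. The sketch below uses hypothetical data in which the Type column also contains '0'; because Type is not listed in the mapping, it should be left alone.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'Type' contains '0' as a legitimate value.
df2 = pd.DataFrame({
    "Weight": [0.0, 70.5],
    "BootSize": pd.Series(["0", "42"], dtype=object),
    "Type": pd.Series(["0", "A"], dtype=object),
})

# Outer keys are column names, inner dicts map old value -> new value.
# Only the listed columns are searched; the result is assigned back.
df2 = df2.replace({
    "Weight": {0: np.nan},
    "BootSize": {"0": np.nan},
})
print(df2)
```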
Another alternative:
cols = ["Weight","Height","BootSize","SuitSize","Type"]
df2[cols] = df2[cols].mask(df2[cols].eq(0) | df2[cols].eq('0'))
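To make the mask approach easier to follow, here is a small sketch on hypothetical data that builds the boolean mask as a separate step (the name is_zero is introduced only for illustration). mask sets every cell where the condition is True to NaN when no replacement value is given.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one numeric and one string column.
df2 = pd.DataFrame({
    "Weight": [70.5, 0.0],
    "BootSize": pd.Series(["42", "0"], dtype=object),
})
cols = ["Weight", "BootSize"]

# True wherever a cell equals the number 0 or the string '0'.
is_zero = df2[cols].eq(0) | df2[cols].eq("0")
print(is_zero)

# Cells flagged True become NaN; everything else is kept.
df2[cols] = df2[cols].mask(is_zero)
print(df2)
```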
Replace zeros in the 'age' column with an empty string
df['age'] = df['age'].replace(['0', 0], '')
Replace zeros in a single column with NaN
df['age'] = df['age'].replace(0, np.nan)
Replace zeros in multiple columns with NaN
cols = ["Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI"]
df[cols] = df[cols].replace(['0', 0], np.nan)
Replace zeros in the whole DataFrame with NaN
df.replace(0, np.nan, inplace=True)
If you just want to replace the zeros in the entire DataFrame, you can replace them directly without specifying any columns:
df = df.replace({0:pd.NA})
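One thing to keep in mind with the whole-frame variant is that a numeric 0 and the string '0' are different values. The sketch below uses hypothetical column names and np.nan as the replacement. It suggests that replacing 0 alone leaves string zeros untouched, and that an integer column is upcast to float64 so it can hold NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with an integer, a float and a string column.
df = pd.DataFrame({
    "visits": [0, 3],
    "score": [0.0, 1.5],
    "code": pd.Series(["0", "x"], dtype=object),
})

# Only the numeric zeros are replaced; the string '0' stays as it is.
numeric_only = df.replace(0, np.nan)
print(numeric_only.dtypes)

# Listing both forms replaces the string zero as well.
both = df.replace([0, "0"], np.nan)
print(both)
```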