Replace values in a DataFrame (Python)
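I want to replace some values in a DataFrame where a condition is met.
I tried writing the code below, but it does not seem to work: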
dfa = df.copy()
for value in df['Clean Company Name']:
    if value=="NaN":
        dfa['Clean Company Name'].replace(df['Company Name'])
dfa.head()
As you can see, the NaN values are not replaced by the values from 'Company Name'.
How can I achieve that result?
To replace NaN values, use combine_first or fillna:
df['Clean Company Name'].combine_first(df['Company Name'])
Or:
df['Clean Company Name'].fillna(df['Company Name'])
Sample:
import numpy as np
import pandas as pd

df = pd.DataFrame({'Company Name':['s','d','f'], 'Clean Company Name': [np.nan, 'r', 't']})
print (df)
  Clean Company Name Company Name
0                NaN            s
1                  r            d
2                  t            f
# check which values are NaN
print (df['Clean Company Name'].isnull())
0 True
1 False
2 False
Name: Clean Company Name, dtype: bool
df['Clean Company Name'] = df['Clean Company Name'].combine_first(df['Company Name'])
print (df)
  Clean Company Name Company Name
0                  s            s
1                  r            d
2                  t            f
For more, see the pandas documentation on working with missing data.
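As a minimal sketch of why the loop in the question has no effect, using the same sample frame as above: a missing value is a float NaN rather than the string "NaN", so the comparison never matches, and methods such as fillna return a new Series that has to be assigned back. The variable names (string_matches, nan_matches, missing, filled, combined) are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Company Name': ['s', 'd', 'f'],
    'Clean Company Name': [np.nan, 'r', 't'],
})

# NaN is a float, not the string "NaN", so this comparison never matches.
string_matches = (df['Clean Company Name'] == "NaN").sum()

# NaN is not equal to anything, itself included, so == np.nan fails too.
nan_matches = (df['Clean Company Name'] == np.nan).sum()

# isna() is the reliable test for missing values.
missing = df['Clean Company Name'].isna().sum()

# Both calls return a new Series; df is unchanged until the result is assigned.
filled = df['Clean Company Name'].fillna(df['Company Name'])
combined = df['Clean Company Name'].combine_first(df['Company Name'])

print(string_matches, nan_matches, missing)
print(filled.tolist())
print(combined.tolist())
```

To keep the result, assign it back to the column, for example df['Clean Company Name'] = filled.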
Edit:
To replace values by condition, use loc with a boolean mask:
print (df['Company Name'] == 'd')
0 False
1 True
2 False
Name: Company Name, dtype: bool
df.loc[df['Company Name'] == 'd', 'Clean Company Name'] = 'sss'
print (df)
  Clean Company Name Company Name
0                NaN            s
1                sss            d
2                  t            f
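The same loc pattern can also copy values from another column instead of assigning a constant. The sketch below assumes the sample frame from above; the combined condition and the 'unknown' placeholder are hypothetical, added only to show how masks can be joined.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Company Name': ['s', 'd', 'f'],
    'Clean Company Name': [np.nan, 'r', 't'],
})

# Boolean mask: True where the clean name is missing.
mask = df['Clean Company Name'].isna()

# Assign only to the masked rows; the right-hand side is aligned on the index.
df.loc[mask, 'Clean Company Name'] = df.loc[mask, 'Company Name']

# Conditions can be combined with & (and) or | (or); wrap each in parentheses.
# 'unknown' is a hypothetical placeholder value.
both = (df['Company Name'] == 'f') & (df['Clean Company Name'] == 't')
df.loc[both, 'Clean Company Name'] = 'unknown'

print(df['Clean Company Name'].tolist())
```

Assigning through a single df.loc[rows, column] call writes to the original frame, which is why it is generally preferred over chained indexing such as df[column][rows] = value.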