df.isna().sum() not counting NaN values
I have a few values in a DataFrame that look like this:
   Price_(zł)    Area_(m2)  Rooms  Market     Building_type       Flat_level
0  1264850       62         3      secondary  apartment building  7
1  790000        80         4      secondary  block               0
2  606128        73,28      3      new        block               5
3  499000        70,50      4      secondary  nan                 nan
4  519000        40,86      2      new        block               5
5  508240        58,40      4      new        block               0
6  447568        50,86      3      new        block               0
7  Zapytajocenę  58,50      3      new        nan                 6
8  739375        84,50      4      new        apartment building  3
9  322400        52         3      new        nan                 1
The Flat_level values come from these replacements:
df['Flat_level'] = df['Flat_level'].apply(lambda x: str(x).replace (' parter', '0') if x != np.NaN else x == np.NaN)
df['Flat_level'] = df['Flat_level'].apply(lambda x: str(x).replace (' suterena', '-1') if x != np.NaN else x == np.NaN)
df['Flat_level'] = df['Flat_level'].apply(lambda x: str(x).replace (' > 10', '20') if x != np.NaN else x == np.NaN)
df['Flat_level'] = df['Flat_level'].apply(lambda x: str(x).replace (' poddasze', '30') if x != np.NaN else x == np.NaN)
Before these changes, the type of a missing entry was:
type(df['Flat_level'][3])
float
When I try to count the NaN values:
df.isna().sum()
the 'Flat_level' column reports no NaN values:
Price_(zł) 0
Area_(m2) 0
Rooms 0
Market 0
Building_type 0
Flat_level 0
Building_flat_levels 1249
Windows 0
Heating 0
Year_of_construction 1734
Finishing_level 0
Property_form 0
Construction_materials 0
latitude 0
longitude 0
link 0
dtype: int64
Any idea why?
Thanks
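You are better off using the NumPy function np.isnan() rather than a native Python comparison to check whether a value is NaN.
NaN never compares equal to anything, including itself, so x != np.NaN is True for every value. As a result str(x) turns each NaN into the plain string 'nan', which isna() does not count as missing.
You also need to update the end of the apply() lambda: x == np.NaN evaluates to a boolean, so that branch would put a boolean into your DataFrame instead of a NaN value. You can do it like this: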
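df['Flat_level'] = df['Flat_level'].apply(
    # Keep real NaN values as NaN; convert everything else to str before replacing.
    # np.nan is used because the np.NaN alias was removed in NumPy 2.0.
    lambda x: np.nan if isinstance(x, float) and np.isnan(x) else str(x).replace(' parter', '0')
)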
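To see the cause in isolation, here is a minimal sketch on a small made-up Series (the sample values and the names s, broken, fixed and replacements are hypothetical, not taken from the real dataset). It reproduces the zero count and then tries an alternative that may be simpler than apply(): the .str accessor, which leaves missing values untouched.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the Flat_level column from the question.
s = pd.Series([" parter", "7", np.nan, " > 10", np.nan], dtype=object)

# NaN never compares equal to anything, itself included, so `x != np.nan`
# is True for every x and the else-branch of the original lambda never runs.
assert np.nan != np.nan

# str(NaN) is the ordinary string "nan", which isna() does not treat as missing.
broken = s.apply(lambda x: str(x).replace(" parter", "0") if x != np.nan else x)
broken_missing = int(broken.isna().sum())

# Alternative: .str.replace skips missing values, so no explicit NaN check is needed.
replacements = {" parter": "0", " suterena": "-1", " > 10": "20", " poddasze": "30"}
fixed = s.copy()
for old, new in replacements.items():
    fixed = fixed.str.replace(old, new, regex=False)
fixed_missing = int(fixed.isna().sum())

print(broken_missing, fixed_missing)  # prints: 0 2
```

If the column has already been converted, the 'nan' strings would need to be mapped back to real NaN values (or the data reloaded) before isna() can count them.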