Extract Multiple Columns from a DataFrame and Return NaN for Columns That Do Not Exist
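I am trying to extract multiple columns from a DataFrame like the one below. I want to select the columns I need by name and get NaN for any requested column that does not exist in the DataFrame.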
import pandas as pd

data_1 = {'host_identity_verified': ['t', 't', 't', 't', 't', 't', 't', 't', 't', 't'],
          'neighbourhood': ['q', 'q', 'q', 'q', 'q', 'q', 'q', 'q', 'q', 'q'],
          'neighbourhood_cleansed': ['Oostelijk Havengebied - Indische Buurt', 'Centrum-Oost', 'Centrum-West', 'Centrum-West', 'Centrum-West',
                                     'Oostelijk Havengebied - Indische Buurt', 'Centrum-Oost', 'Centrum-West', 'Centrum-West', 'Centrum-West'],
          # Note: these are the literal string 'NaN', not missing values
          'neighbourhood_group_cleansed': ['NaN', 'NaN', 'NaN', 'NaN', 'NaN', 'NaN', 'NaN', 'NaN', 'NaN', 'NaN'],
          'latitude': [52.36575, 52.36509, 52.37297, 52.38761, 52.36719, 52.36575, 52.36509, 52.37297, 52.38761, 52.36719]}
df_1 = pd.DataFrame(data_1)
I know this way of getting a single column:
x = df_1.get('neighbourhood_cleansed', pd.Series(index=df_1.index, name='neighbourhood_cleansed', dtype='object'))
But with this approach I can only get one column at a time.
I would like to do something like:
columns_needed = ['host_identity_verified', 'neighbourhood', 'latitude', 'longitude', 'price']
# x = some code that returns the columns above, with NaN for missing columns such as 'longitude' and 'price'
Use reindex. It creates the missing columns filled with NaN and extracts the columns you need:
df_1.reindex(['host_identity_verified', 'neighbourhood', 'latitude', 'longitude', 'price'], axis=1)
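To make the behaviour of reindex concrete, here is a minimal, self-contained sketch. It uses a shortened stand-in for df_1 (three rows, made-up values, named df here) rather than the full data from the question, and the names result and missing are only illustrative.

```python
import pandas as pd

# Shortened stand-in for df_1 from the question (assumed values, fewer rows).
df = pd.DataFrame({
    "host_identity_verified": ["t", "t", "f"],
    "neighbourhood": ["q", "q", "q"],
    "neighbourhood_cleansed": ["Centrum-Oost", "Centrum-West", "Centrum-West"],
    "latitude": [52.36575, 52.36509, 52.37297],
})

columns_needed = ["host_identity_verified", "neighbourhood", "latitude", "longitude", "price"]

# Reindex along the column axis: columns that exist are kept (in the requested
# order), columns that do not exist are created and filled with NaN.
result = df.reindex(columns=columns_needed)

# Requested columns that were absent from the original frame.
missing = [c for c in columns_needed if c not in df.columns]

print(missing)
print(result)
```

reindex returns a new DataFrame and leaves the original untouched, so the result has to be assigned to a variable. Passing columns=columns_needed is equivalent to passing the list positionally with axis=1, as in the answer above.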