Creating a new dataframe column using a string filter on another column
Below is a dataframe with a column named 'Address'. I want to create a separate column 'City' by filtering the Address column for specific strings.
df1
Serial_No  Address
1          India Gate Delhi
2          Delhi Redcross Hospital
3          Tolleyganj Bus Stand Kolkata
4          Kolkata Howrah
5          Katra Jammu
Below is the script I am using:
descr = []
col = 'City'
for col in df:
    if np.series(df[col]= df[df[col].str.contains('Delhi ', na=False)]:
        desc = 'Delhi'
    elif np.series(df[col]= df[df[col].str.contains('Kolkata ', na=False)]:
        desc = 'Kolkata'
    else:
        desc = 'None'
Below is the expected output:
df1
Serial_No  Address                       City
1          India Gate Delhi              Delhi
2          Delhi Redcross Hospital       Delhi
3          Tolleyganj Bus Stand Kolkata  Kolkata
4          Kolkata Howrah                Kolkata
5          Katra Jammu                   None
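Let's try str.extract. Note that rows matching neither city come back as NaN, so fill them if you want the string 'None':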
df1['City'] = df1['Address'].str.extract(r'(Delhi|Kolkata)', expand=False).fillna('None')
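For anyone who wants to run the str.extract idea end to end, here is a minimal sketch on the sample data from the question. The `expand=False` argument and the `fillna('None')` step are my additions (not part of the one-liner above); they assume you want a plain Series back and the literal string 'None' for rows with no match.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    'Serial_No': [1, 2, 3, 4, 5],
    'Address': ['India Gate Delhi', 'Delhi Redcross Hospital',
                'Tolleyganj Bus Stand Kolkata', 'Kolkata Howrah', 'Katra Jammu'],
})

# With a single capture group, expand=False returns a Series.
# Rows matching neither city are NaN, so replace those with the string 'None'.
df1['City'] = (
    df1['Address']
    .str.extract(r'(Delhi|Kolkata)', expand=False)
    .fillna('None')
)
print(df1)
```

If an address mentions both cities, the regex keeps whichever appears first in the string, which may or may not be what you want.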
Try this:
import pandas as pd

df1 = pd.DataFrame([[1, 'India Gate Delhi'], [2, 'Delhi Redcross Hospital'], [3, 'Tolleyganj Bus Stand Kolkata'], [4, 'Kolkata Howrah'], [5, 'Katra Jammu']], columns=['Serial_No', 'Address'])
print(df1)

def f(row):
    # apply(axis=1) passes one row at a time, so row['Address'] is a plain string
    # and `in` is an ordinary substring test.
    if 'Delhi' in row['Address']:
        val = 'Delhi'
    elif 'Kolkata' in row['Address']:
        val = 'Kolkata'
    else:
        val = 'None'
    return val

df1['City'] = df1.apply(f, axis=1)
print(df1)
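Since the question's script was reaching for str.contains, here is a rough vectorized sketch along those lines that avoids the row-by-row apply. Using `np.select` is my own suggestion rather than something from the answers above, and the `conditions` and `choices` names are just illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    'Serial_No': [1, 2, 3, 4, 5],
    'Address': ['India Gate Delhi', 'Delhi Redcross Hospital',
                'Tolleyganj Bus Stand Kolkata', 'Kolkata Howrah', 'Katra Jammu'],
})

# One boolean mask per city; na=False treats missing addresses as "no match".
conditions = [
    df1['Address'].str.contains('Delhi', na=False),
    df1['Address'].str.contains('Kolkata', na=False),
]
choices = ['Delhi', 'Kolkata']

# np.select picks the first choice whose condition is True, else the default.
df1['City'] = np.select(conditions, choices, default='None')
print(df1)
```

Note the search strings here have no trailing space. The original script looked for 'Delhi ' and 'Kolkata ', which would miss addresses that end with the city name.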