python pandas: check whether a column contains an item from a list

I have two DataFrames, df1 and df2:

vid     vbull   
1125    RHSA:2017:3200   
1127    RHSA:2017:3205  
1128    RHSA:2017:3208   
1129    RHSA:2017:3209


kbid    vdesc   
2401    This contains details for RHSA:2017:3205   
2402    This contains details for RHSA:2017:3206   
2403    This contains details for RHSA:2017:3207   
2404    This contains details for RHSA:2017:3208  
2405    This contains details for RHSA:2017:3200

I need an output that combines df1 and df2 by matching each vbull value against the text in vdesc, for example:

vid   vbull           kbid   vdesc   
1125  RHSA:2017:3200  2405   This contains details for RHSA:2017:3200   
1127  RHSA:2017:3205  2401   This contains details for RHSA:2017:3205   ...

I tried the following to find the matching rows, but I'm not sure how to also get the matched value itself in the output:

df2[df2.vdesc.str.contains('|'.join(df1.vbull))]    

First, use str.extract with a pattern built from the values in vbull:

df2['extracted'] = df2.vdesc.str.extract('(' + '|'.join(df1.vbull) + ')', expand=False)
print (df2)
   kbid                                     vdesc       extracted
0  2401  This contains details for RHSA:2017:3205  RHSA:2017:3205
1  2402  This contains details for RHSA:2017:3206             NaN
2  2403  This contains details for RHSA:2017:3207             NaN
3  2404  This contains details for RHSA:2017:3208  RHSA:2017:3208
4  2405  This contains details for RHSA:2017:3200  RHSA:2017:3200
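One caveat: the joined values are interpreted as a regular expression. The sample identifiers contain no special characters, but if real data might include characters such as "." or "+", escaping each value first is safer. A minimal sketch, assuming the sample frames from the question (rebuilt here so the snippet runs on its own):

```python
import re

import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "vid": [1125, 1127, 1128, 1129],
    "vbull": ["RHSA:2017:3200", "RHSA:2017:3205",
              "RHSA:2017:3208", "RHSA:2017:3209"],
})
df2 = pd.DataFrame({
    "kbid": [2401, 2402, 2403, 2404, 2405],
    "vdesc": [
        "This contains details for RHSA:2017:3205",
        "This contains details for RHSA:2017:3206",
        "This contains details for RHSA:2017:3207",
        "This contains details for RHSA:2017:3208",
        "This contains details for RHSA:2017:3200",
    ],
})

# Escape every identifier so it is matched literally, then wrap the
# alternation in one capture group, which str.extract requires.
pattern = "(" + "|".join(re.escape(v) for v in df1["vbull"]) + ")"

# expand=False returns a Series; rows without a match become NaN.
df2["extracted"] = df2["vdesc"].str.extract(pattern, expand=False)
print(df2)
```

For these sample values the result should be the same as with the unescaped pattern; the escaping only matters once an identifier contains regex metacharacters.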

Then filter out the rows without a match using boolean indexing:

df3 = df2[df2['extracted'].notnull()].copy()
print (df3)
   kbid                                     vdesc       extracted
0  2401  This contains details for RHSA:2017:3205  RHSA:2017:3205
3  2404  This contains details for RHSA:2017:3208  RHSA:2017:3208
4  2405  This contains details for RHSA:2017:3200  RHSA:2017:3200

Finally, add the vid values with map:

df3['new'] = df3['extracted'].map(df1.set_index('vbull')['vid'])
print (df3)
   kbid                                     vdesc       extracted   new
0  2401  This contains details for RHSA:2017:3205  RHSA:2017:3205  1127
3  2404  This contains details for RHSA:2017:3208  RHSA:2017:3208  1128
4  2405  This contains details for RHSA:2017:3200  RHSA:2017:3200  1125
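If the goal is the exact column layout asked for in the question (vid, vbull, kbid, vdesc), one possible variation is to store the extracted value under the name vbull and join the two frames with merge instead of map. This is a sketch under the assumption that each vdesc mentions at most one identifier; the names matched and result are only illustrative:

```python
import re

import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "vid": [1125, 1127, 1128, 1129],
    "vbull": ["RHSA:2017:3200", "RHSA:2017:3205",
              "RHSA:2017:3208", "RHSA:2017:3209"],
})
df2 = pd.DataFrame({
    "kbid": [2401, 2402, 2403, 2404, 2405],
    "vdesc": [
        "This contains details for RHSA:2017:3205",
        "This contains details for RHSA:2017:3206",
        "This contains details for RHSA:2017:3207",
        "This contains details for RHSA:2017:3208",
        "This contains details for RHSA:2017:3200",
    ],
})

# Build one capture group from the (escaped) identifiers.
pattern = "(" + "|".join(map(re.escape, df1["vbull"])) + ")"

# Keep df2 unchanged; add the extracted identifier as a new "vbull" column.
matched = df2.assign(vbull=df2["vdesc"].str.extract(pattern, expand=False))

# Drop rows with no match, then inner-join on the shared "vbull" column.
result = df1.merge(matched.dropna(subset=["vbull"]), on="vbull", how="inner")
result = result[["vid", "vbull", "kbid", "vdesc"]]
print(result)
```

Compared with map, merge also keeps every match when several df2 rows refer to the same vbull, and it does not require vbull to be unique in df1.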