pandas 查找 Series 和 return 关键字中的共同字符串
pandas find strings in common among Series and return keywords
我想改进 关于基于一系列关键字在 pandas 系列中搜索字符串的功能。我现在的问题是如何将在 DataFrame 行中找到的关键字作为新列。关键字系列 "w" 是:
Skilful
Wilful
Somewhere
Thing
Strange
DataFrame "df" 是:
User_ID;Tweet
01;hi all
02;see you somewhere
03;So weird
04;hi all :-)
05;next big thing
06;how can i say no?
07;so strange
08;not at all
以下解决方案可以很好地屏蔽 DataFrame:
import re
r = re.compile(r'.*({}).*'.format('|'.join(w.values)), re.IGNORECASE)
masked = map(bool, map(r.match, df['Tweet']))
df['Tweet_masked'] = masked
和return这个:
User_ID Tweet Tweet_masked
0 1 hi all False
1 2 see you somewhere True
2 3 So weird False
3 4 hi all :-) False
4 5 next big thing True
5 6 how can i say no? False
6 7 so strange True
7 8 not at all False
现在我正在寻找这样的结果:
User_ID;Tweet;Keyword
01;hi all;None
02;see you somewhere;somewhere
03;So weird;None
04;hi all :-);None
05;next big thing;thing
06;how can i say no?;None
07;so strange;strange
08;not at all;None
在此先感谢您的支持。
更换怎么样
masked = map(bool, map(r.match, df['Tweet']))
和
masked = [m.group(1) if m else None for m in map(r.match, df['Tweet'])]
我想改进
Skilful
Wilful
Somewhere
Thing
Strange
DataFrame "df" 是:
User_ID;Tweet
01;hi all
02;see you somewhere
03;So weird
04;hi all :-)
05;next big thing
06;how can i say no?
07;so strange
08;not at all
以下解决方案可以很好地屏蔽 DataFrame:
import re
r = re.compile(r'.*({}).*'.format('|'.join(w.values)), re.IGNORECASE)
masked = map(bool, map(r.match, df['Tweet']))
df['Tweet_masked'] = masked
和return这个:
User_ID Tweet Tweet_masked
0 1 hi all False
1 2 see you somewhere True
2 3 So weird False
3 4 hi all :-) False
4 5 next big thing True
5 6 how can i say no? False
6 7 so strange True
7 8 not at all False
现在我正在寻找这样的结果:
User_ID;Tweet;Keyword
01;hi all;None
02;see you somewhere;somewhere
03;So weird;None
04;hi all :-);None
05;next big thing;thing
06;how can i say no?;None
07;so strange;strange
08;not at all;None
在此先感谢您的支持。
更换怎么样
masked = map(bool, map(r.match, df['Tweet']))
和
masked = [m.group(1) if m else None for m in map(r.match, df['Tweet'])]