pandas: return all matched keys for each string value in a Series
How can I return all matching keys from a search list as comma-separated values?
For example:
s = pd.Series(['cat dog','hat cat','dog','fog cat','pet'])
searchfor = ['cat', 'dog']
I want this:
['cat, dog', 'cat', 'dog', 'cat', 'None']
First split each string into words, then filter the words with str.contains:
s1=s.str.split(' ',expand=True).stack()
s1[s1.str.contains('|'.join(searchfor))].groupby(level=0).apply(' '.join).reindex(s.index)
Out[778]:
0    cat dog
1        cat
2        dog
3        cat
4        NaN
dtype: object
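Note that the output above joins the matches with a space rather than the requested ', ', and that str.contains matches substrings, so a word such as 'dogs' would also be kept. A minimal sketch of a variant that addresses both points, assuming exact whole-word matches are what is wanted; it swaps in Series.explode and Series.isin in place of stack and str.contains, and the names words, matched and result are only illustrative:

```python
import pandas as pd

s = pd.Series(['cat dog', 'hat cat', 'dog', 'fog cat', 'pet'])
searchfor = ['cat', 'dog']

# One row per word; each word keeps the index label of the string it came from.
words = s.str.split(' ').explode()

# Keep only words that are exactly one of the keys, then join them per original row.
matched = words[words.isin(searchfor)].groupby(level=0).apply(', '.join)

# Rows without any match are missing from `matched`; reindex brings them back as NaN.
result = matched.reindex(s.index)
print(result)
```

The keys appear in the order they occur in each string, not in the order of searchfor, and a key that occurs twice in a string is listed twice.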
My colleague helped me with this. Here is what I ended up with:
import numpy
import pandas

s = pandas.Series(['cat dog','hat cat','dog','fog cat','pet'])
searchfor = ['cat', 'dog']
b = [''] * len(s)
for i in range(len(s)):
    for key in searchfor:
        if key in s[i]:
            # first match starts the string; later matches are appended after ', '
            b[i] = key if b[i] == '' else b[i] + ', ' + key
# rows that matched nothing become NaN instead of an empty string
df = pandas.DataFrame({'s': s, 'searchfor': [numpy.nan if i == '' else i for i in b]})
df
         s searchfor
0  cat dog  cat, dog
1  hat cat       cat
2      dog       dog
3  fog cat       cat
4      pet       NaN
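The nested loop can also be written more compactly. The following is a sketch rather than a definitive version: it keeps the same substring test (key in text) and the same key order as searchfor, and the helper name matched_keys is hypothetical.

```python
import numpy as np
import pandas as pd

s = pd.Series(['cat dog', 'hat cat', 'dog', 'fog cat', 'pet'])
searchfor = ['cat', 'dog']

def matched_keys(text, keys):
    """Return the keys found in `text`, comma-separated, or NaN if none match."""
    hits = [key for key in keys if key in text]  # substring test, as in the loop
    return ', '.join(hits) if hits else np.nan

# Apply the helper to every string in the Series.
result = s.map(lambda text: matched_keys(text, searchfor))
df = pd.DataFrame({'s': s, 'searchfor': result})
print(df)
```

Because it tests substrings, 'cat' would also match inside a word such as 'category', just as in the loop version.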