Convert a row's column value to a dict and then to a pandas dataframe
Python newbie here.
I have a dataframe people with name and text as its two columns.
name text
0 Obama Obama was the 44th president of the...
1 Trump Donald J. Trump ran as a republican...
I only need to do some exploratory analysis on Obama.
obama = people[people['name'] == 'Obama'].copy()
obama.text
35817 Obama was the 44th president of the unit...
Name: text, dtype: object
How can I convert the text into a dictionary stored in a new column, with the words as keys and the word counts as values?
Example:
name text dictionary
0 Obama Obama was the 44th president of the... {'Obama':1, 'the':2,...}
Once that is done, how can I convert the dictionary into a separate dataframe?
Expected:
word count
0 Obama 1
1 the 2
You can use the Counter object from the collections module:
import collections
people['dictionary'] = people.text.apply(lambda x: dict(collections.Counter(x.split())))
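As a minimal sketch of what that `apply` call produces, here is the same idea on a two-row toy frame. The sample sentences and the helper name `word_counts` are made up for illustration. Note that `str.split()` splits on whitespace only, so punctuation stays attached to words and counting is case-sensitive.

```python
import collections

import pandas as pd

# Toy stand-in for the `people` dataframe; the sentences are invented for illustration.
people = pd.DataFrame({
    "name": ["Obama", "Trump"],
    "text": ["Obama was the 44th president of the US", "Trump ran as a republican"],
})


def word_counts(text):
    """Return a plain dict mapping each whitespace-separated word to its count."""
    return dict(collections.Counter(text.split()))


# Each cell of the new column holds one dict per row.
people["dictionary"] = people["text"].apply(word_counts)
print(people.loc[0, "dictionary"])
```

Wrapping the `Counter` in `dict` is optional; a `Counter` is already a dict subclass, and keeping it as a `Counter` would give access to `most_common()`.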
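To convert one of the dictionaries to a dataframe:
import pandas as pd
# .iloc[0] takes the first row by position, so it also works when the index label is not 0
dictionary = people['dictionary'].iloc[0]
pd.DataFrame(data={'word': list(dictionary.keys()), 'count': list(dictionary.values())})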
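The filtered obama frame in the question keeps its original index label (35817), so a label-based `[0]` lookup on it would likely raise a `KeyError`. Below is a sketch that uses positional access instead and also sorts the result by frequency. The helper name `counts_to_frame` and the sample text are assumptions for illustration, not part of the original answer.

```python
import collections

import pandas as pd


def counts_to_frame(counts):
    """Turn a {word: count} mapping into a two-column dataframe, most frequent first."""
    frame = pd.DataFrame(list(counts.items()), columns=["word", "count"])
    return frame.sort_values("count", ascending=False, ignore_index=True)


# One-row frame whose index label is not 0, mimicking the filtered `obama` frame.
obama = pd.DataFrame(
    {"name": ["Obama"], "text": ["Obama was the 44th president of the US"]},
    index=[35817],
)
obama["dictionary"] = obama["text"].apply(
    lambda t: dict(collections.Counter(t.split()))
)

# .iloc[0] selects the first row by position, whatever its label is.
word_df = counts_to_frame(obama["dictionary"].iloc[0])
print(word_df.head())
```

Building the frame from `counts.items()` keeps each word paired with its count in a single pass, which avoids relying on `keys()` and `values()` being iterated in matching order.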