How to count frequencies from a nested list that is a column in a df?
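I want to count the number of occurrences of an item in a nested list. The current structure of the pandas df: each record is grouped by match_id and possession_id (possession_number in the sample below), and then the values second, action_name and player_name are passed into a list column named action_seq.
I can count the total number of events per possession without a problem, but now I would like to count, for example, how many times player A was involved in an event, and which events they are involved in most often.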
# sample df
import pandas as pd

pass_goal = pd.DataFrame({'match_id': [1107073,1107073,1107073,1409630,1409630],
                          'possession_number': [2,2,2,40,40], 'second': [10,15,20,250,260],
                          'action_name': ['pass', 'pass', 'goal','pass','goal'],
                          'player_name': ['a','b','c','a','b']})
# grouping by match and possession, then collecting each possession's events into a list
posses = (pass_goal.groupby(['match_id','possession_number'])[['second', 'action_name','player_name']]
          .apply(lambda action: action.values.tolist())
          .reset_index(name='action_seq'))
Preferred output:
Player  A  B  C
Pass    2  1  0
Goal    0  1  1
You can try:
(pass_goal[["action_name","player_name"]]
.pivot_table(columns="player_name", index="action_name", aggfunc=len, fill_value=0)
.rename_axis(index="", columns="player"))
player  a  b  c
goal    0  1  1
pass    2  1  0
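If only the grouped frame with the nested action_seq column is available, one possible approach is to flatten the lists back into rows first and then cross-tabulate. The following is a minimal sketch, not part of the original answer: it assumes every inner list has the shape [second, action_name, player_name] as in the question, and the names events, flat, counts and most_common are made up for illustration. It uses DataFrame.explode and pd.crosstab, which is another way to build the same action-by-player table.

```python
import pandas as pd

pass_goal = pd.DataFrame({
    'match_id': [1107073, 1107073, 1107073, 1409630, 1409630],
    'possession_number': [2, 2, 2, 40, 40],
    'second': [10, 15, 20, 250, 260],
    'action_name': ['pass', 'pass', 'goal', 'pass', 'goal'],
    'player_name': ['a', 'b', 'c', 'a', 'b'],
})

# Same grouping as in the question: one list of [second, action, player] per possession.
posses = (
    pass_goal
    .groupby(['match_id', 'possession_number'])[['second', 'action_name', 'player_name']]
    .apply(lambda action: action.values.tolist())
    .reset_index(name='action_seq')
)

# explode() unnests one level: each row now holds a single [second, action, player] list.
events = posses.explode('action_seq', ignore_index=True)

# Turn the inner lists back into named columns (assumed order: second, action, player).
flat = pd.DataFrame(
    events['action_seq'].tolist(),
    columns=['second', 'action_name', 'player_name'],
)

# Count how often each player performed each action.
counts = pd.crosstab(flat['action_name'], flat['player_name'])

# For each player, the action with the highest count (ties resolve to the first label).
most_common = counts.idxmax()

print(counts)
print(most_common)
```

Here counts['a'] gives the per-action totals for player a, and counts.loc['pass'] gives the per-player totals for passes. If the flat pass_goal frame is still available, pd.crosstab(pass_goal['action_name'], pass_goal['player_name']) should give the same table without the explode step.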