Printing all rows in each level of a MultiIndex pd.DataFrame in one row
I have a dataframe that becomes a MultiIndex dataframe after performing groupby() and an aggregation.
In[1]:
import pandas as pd

mydata = [['Team1', 'Player1', 'idTrip13', 133], ['Team2', 'Player333', 'idTrip10', 18373],
          ['Team3', 'Player22', 'idTrip12', 17338899], ['Team2', 'Player293', 'idTrip02', 17656],
          ['Team3', 'Player20', 'idTrip11', 1883], ['Team1', 'Player1', 'idTrip19', 19393]]
df = pd.DataFrame(mydata, columns=['team', 'player', 'trips', 'time'])
df
Out[1]:
    team     player     trips      time
0  Team1    Player1  idTrip13       133
1  Team2  Player333  idTrip10     18373
2  Team3   Player22  idTrip12  17338899
3  Team2  Player293  idTrip02     17656
4  Team3   Player20  idTrip11      1883
5  Team1    Player1  idTrip19     19393
For each player on a team, find the total number of trips and the total time spent on those trips. This returns a MultiIndex dataframe.
player_total = df.groupby(by=['team', 'player']).agg({'trips': 'count', 'time': 'sum'})
player_total
Out[4]:
                 trips      time
team  player
Team1 Player1        2     19526
Team2 Player293      1     17656
      Player333      1     18373
Team3 Player20       1      1883
      Player22       1  17338899
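The printing steps further down depend on the columns appearing as trips, then time. One way to pin that order down explicitly is named aggregation, available in pandas 0.25 and later. The following is only a sketch that reuses the sample data from the question; it is not part of the original post.

```python
import pandas as pd

mydata = [['Team1', 'Player1', 'idTrip13', 133], ['Team2', 'Player333', 'idTrip10', 18373],
          ['Team3', 'Player22', 'idTrip12', 17338899], ['Team2', 'Player293', 'idTrip02', 17656],
          ['Team3', 'Player20', 'idTrip11', 1883], ['Team1', 'Player1', 'idTrip19', 19393]]
df = pd.DataFrame(mydata, columns=['team', 'player', 'trips', 'time'])

# Named aggregation: each keyword becomes an output column, in keyword order.
player_total = df.groupby(['team', 'player']).agg(
    trips=('trips', 'count'),  # number of trips per (team, player)
    time=('time', 'sum'),      # total time per (team, player)
)
print(player_total)
```

The result is still indexed by (team, player), so the rest of the walkthrough applies unchanged.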
Expected output:
I want to print the output so that all players on a team appear on the same line.
Team1 Player1 : 2 trips : 19526;
Team2 Player293 : 1 : 17656; Player333 : 1 : 18373;
Team3 Player22 : 1 trip : 17338899; Player20 : 1 trip : 1883
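This question was considered too broad, so I took the liberty of separating the pandas dataframe creation/aggregation from the output printing.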
Iterate through the 0th level (team) using groupby().
for team, df2 in player_total.groupby(level = 0):
For example, on the second iteration it returns the dataframe for Team2:
                 trips   time
team  player
Team2 Player293      1  17656
      Player333      1  18373
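To see what each iteration yields, the groups can be collected into a dictionary. This is a small sketch, assuming the aggregated frame from the question; the name players_by_team is introduced here only for illustration.

```python
import pandas as pd

mydata = [['Team1', 'Player1', 'idTrip13', 133], ['Team2', 'Player333', 'idTrip10', 18373],
          ['Team3', 'Player22', 'idTrip12', 17338899], ['Team2', 'Player293', 'idTrip02', 17656],
          ['Team3', 'Player20', 'idTrip11', 1883], ['Team1', 'Player1', 'idTrip19', 19393]]
df = pd.DataFrame(mydata, columns=['team', 'player', 'trips', 'time'])
player_total = df.groupby(['team', 'player']).agg({'trips': 'count', 'time': 'sum'})

# Grouping on index level 0 yields (team name, sub-frame) pairs, sorted by team.
players_by_team = {
    team: list(df2.index.get_level_values('player'))
    for team, df2 in player_total.groupby(level=0)
}
print(players_by_team)
```

Each sub-frame keeps both index levels, which is why the team level is dropped in the next step.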
Use reset_index() to drop the team index level and turn the player index level into a regular column of the dataframe.
>>> team_df = df2.reset_index(level=0, drop=True).reset_index()
>>> team_df
      player  trips   time
0  Player293      1  17656
1  Player333      1  18373
Convert that dataframe to a list of lists so that we can iterate through each player.
>>> team_df.values.tolist()
[['Player293', 1, 17656], ['Player333', 1, 18373]]
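When printing, we have to map the integers to strings, and use the end parameter of the print function to print a semicolon instead of a newline at the end.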
>>> for player in team_df.values.tolist():
...     print(': '.join(map(str, player)), end='; ')
...
Player293: 1: 17656; Player333: 1: 18373;
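The formatting step can also be tried without pandas. The sketch below assumes the rows are already a list of lists, as returned by team_df.values.tolist(), and joins the players with '; ' instead of relying on end, so no trailing separator is left over. The helper name format_players is hypothetical.

```python
def format_players(rows):
    """Join each player's fields with ': ' and separate players with '; '."""
    return '; '.join(': '.join(map(str, row)) for row in rows)


rows = [['Player293', 1, 17656], ['Player333', 1, 18373]]
line = format_players(rows)
print(line)  # Player293: 1: 17656; Player333: 1: 18373
```

Building the whole line as a string first also makes it easy to write the result to a file or a log instead of printing it.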
Full solution:
from __future__ import print_function

# iterate through each team
for team, df2 in player_total.groupby(level=0):
    print(team, end='\t')

    # drop the 0th level (team) and move the 1st level (player) into a column
    team_df = df2.reset_index(level=0, drop=True).reset_index()

    # iterate through each player on the team and print player, trips, and time
    for player in team_df.values.tolist():
        print(': '.join(map(str, player)), end='; ')

    # after printing all players, start a new line
    print()
Output:
Team1   Player1: 2: 19526;
Team2   Player293: 1: 17656; Player333: 1: 18373;
Team3   Player20: 1: 1883; Player22: 1: 17338899;
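A variant that returns the lines as strings, closer to the layout requested in the question, could look like the sketch below. It is one possible arrangement rather than the answer's own method: the function name team_lines is hypothetical, it assumes the frame is indexed by (team, player) with trips and time columns, and it uses droplevel() and itertuples() in place of reset_index() and values.tolist().

```python
import pandas as pd


def team_lines(player_total):
    """Return one formatted string per team from a (team, player)-indexed frame."""
    lines = []
    for team, group in player_total.groupby(level=0):
        # Drop the team level so the remaining index holds only player names.
        players = group.droplevel(0)
        parts = [
            '{} : {} : {}'.format(row.Index, row.trips, row.time)
            for row in players.itertuples()
        ]
        lines.append('{} {}'.format(team, '; '.join(parts)))
    return lines


mydata = [['Team1', 'Player1', 'idTrip13', 133], ['Team2', 'Player333', 'idTrip10', 18373],
          ['Team3', 'Player22', 'idTrip12', 17338899], ['Team2', 'Player293', 'idTrip02', 17656],
          ['Team3', 'Player20', 'idTrip11', 1883], ['Team1', 'Player1', 'idTrip19', 19393]]
df = pd.DataFrame(mydata, columns=['team', 'player', 'trips', 'time'])
player_total = df.groupby(['team', 'player']).agg({'trips': 'count', 'time': 'sum'})

for line in team_lines(player_total):
    print(line)
```

Players come out in sorted order within each team because groupby() sorts its keys by default, so Player20 is listed before Player22.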