How to make a combined histogram of two grouped columns?

My data
I have attached this data, and I am trying to overlay the home and away histograms for each team. I am new to Python, by the way.

What I have made so far looks exactly how I want it, but I would like to combine the two plots for each team:

df_EPL['Away_score'].hist(by=df_EPL['AwayTeam'],figsize = (8,8),color = '#96ddff');

df_EPL['Home_score'].hist(by=df_EPL['HomeTeam'],figsize = (8,8),color = '#82c065');

Fake dataframe creation

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns


np.random.seed(42)
teams = ['Arsenal', 'Chelsea', 'Liverpool', 'Manchester City', 'Manchester Utd']

df = pd.DataFrame({'HomeTeam': np.repeat(teams, len(teams) - 1)})
df['AwayTeam'] = [away_team for home_team in teams for away_team in teams if away_team != home_team]
df['Home_score'] = np.random.randint(0, 5, len(df))
df['Away_score'] = np.random.randint(0, 5, len(df))
           HomeTeam         AwayTeam  Home_score  Away_score
0           Arsenal          Chelsea           3           1
1           Arsenal        Liverpool           4           4
2           Arsenal  Manchester City           2           3
3           Arsenal   Manchester Utd           4           0
4           Chelsea          Arsenal           4           0
5           Chelsea        Liverpool           1           2
6           Chelsea  Manchester City           2           2
7           Chelsea   Manchester Utd           2           1
8         Liverpool          Arsenal           2           3
9         Liverpool          Chelsea           4           3
10        Liverpool  Manchester City           3           2
11        Liverpool   Manchester Utd           2           3
12  Manchester City          Arsenal           4           3
13  Manchester City          Chelsea           1           0
14  Manchester City        Liverpool           3           2
15  Manchester City   Manchester Utd           1           4
16   Manchester Utd          Arsenal           3           2
17   Manchester Utd          Chelsea           4           4
18   Manchester Utd        Liverpool           0           0
19   Manchester Utd  Manchester City           3           1

Dataframe reshaping

You need to reshape your dataframe into a different format in order to make the plot you want. For this, you can use pandas.melt:

df = pd.melt(frame = df,
             id_vars = ['HomeTeam', 'AwayTeam'],
             var_name = 'H/A',
             value_name = 'Score')

df = (df.drop('AwayTeam', axis = 1)
        .rename(columns = {'HomeTeam': 'Team'})
        .replace({'Home_score': 'Home', 'Away_score': 'Away'}))
               Team   H/A  Score
0           Arsenal  Home      3
1           Arsenal  Home      4
2           Arsenal  Home      2
3           Arsenal  Home      4
4           Chelsea  Home      4
5           Chelsea  Home      1
6           Chelsea  Home      2
7           Chelsea  Home      2
8         Liverpool  Home      2
9         Liverpool  Home      4
10        Liverpool  Home      3
11        Liverpool  Home      2
12  Manchester City  Home      4
13  Manchester City  Home      1
14  Manchester City  Home      3
15  Manchester City  Home      1
16   Manchester Utd  Home      3
17   Manchester Utd  Home      4
18   Manchester Utd  Home      0
19   Manchester Utd  Home      3
20          Arsenal  Away      1
21          Arsenal  Away      4
22          Arsenal  Away      3
23          Arsenal  Away      0
24          Chelsea  Away      0
25          Chelsea  Away      2
26          Chelsea  Away      2
27          Chelsea  Away      1
28        Liverpool  Away      3
29        Liverpool  Away      3
30        Liverpool  Away      2
31        Liverpool  Away      3
32  Manchester City  Away      3
33  Manchester City  Away      0
34  Manchester City  Away      2
35  Manchester City  Away      4
36   Manchester Utd  Away      2
37   Manchester Utd  Away      4
38   Manchester Utd  Away      0
39   Manchester Utd  Away      1
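
One thing to note about the reshape above: because HomeTeam is kept as Team, each Away row holds the goals scored by the visiting side in that team's home matches, not the team's own goals on the road. If the latter is what you want, as the `by=df_EPL['AwayTeam']` in the question suggests, a possible alternative is to build the home and away halves separately and stack them with pandas.concat. This is only a minimal sketch under that assumption; the names `matches` and `long_df` are illustrative, and the fake data is rebuilt so the block runs on its own.

```python
import numpy as np
import pandas as pd

# Rebuild the same fake fixture list as above (original wide format).
np.random.seed(42)
teams = ['Arsenal', 'Chelsea', 'Liverpool', 'Manchester City', 'Manchester Utd']
matches = pd.DataFrame({'HomeTeam': np.repeat(teams, len(teams) - 1)})
matches['AwayTeam'] = [a for h in teams for a in teams if a != h]
matches['Home_score'] = np.random.randint(0, 5, len(matches))
matches['Away_score'] = np.random.randint(0, 5, len(matches))

# Home half: goals each team scored in its own home matches.
home = (matches[['HomeTeam', 'Home_score']]
        .rename(columns={'HomeTeam': 'Team', 'Home_score': 'Score'})
        .assign(**{'H/A': 'Home'}))

# Away half: goals each team scored when it was the visiting side.
away = (matches[['AwayTeam', 'Away_score']]
        .rename(columns={'AwayTeam': 'Team', 'Away_score': 'Score'})
        .assign(**{'H/A': 'Away'}))

# Stack both halves into one long-format frame with the same columns as above.
long_df = pd.concat([home, away], ignore_index=True)[['Team', 'H/A', 'Score']]
```

`long_df` has the same Team, H/A and Score columns as the melted dataframe, so it could be passed to the plotting step in its place.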

Plot

Now the dataframe is ready to be plotted. You can use seaborn.FacetGrid to create the grid of subplots, one for each team. Each subplot will have two seaborn.histplot: one for Home_score and one for Away_score.

g = sns.FacetGrid(df, col = 'Team', hue = 'H/A')
g.map(sns.histplot, 'Score', bins = np.arange(df['Score'].min() - 0.5, df['Score'].max() + 1.5, 1))
g.add_legend()
g.set(xticks = np.arange(df['Score'].min(), df['Score'].max() + 1, 1))

plt.show()