Grouped boxplots in pandas and seaborn
I have the following DataFrame:
season A B C D
0 current 26.978912 0.039233 1.248607 0.025874
1 current 26.978912 0.039233 0.836786 0.025874
2 current 26.978912 0.039233 3.047536 0.025874
3 current 26.978912 0.039233 3.726964 0.025874
4 current 26.978912 0.039233 1.171393 0.025874
5 current 26.978912 0.039233 0.180929 0.025874
6 current 26.978912 0.039233 0.000000 0.025874
7 current 34.709560 0.039233 0.700893 0.025874
8 current 111.140200 0.306142 3.068286 0.169244
9 current 111.140200 0.306142 2.931107 0.169244
10 current 111.140200 0.306142 2.121893 0.169244
11 current 111.140200 0.306142 1.479464 0.169244
12 current 111.140200 0.306142 2.186821 0.169244
13 current 111.140200 0.306142 9.542714 0.169244
14 current 111.140200 0.306142 9.890750 0.169244
15 current 111.140200 0.306142 8.864857 0.169244
16 past 88.176415 0.257901 3.416059 0.141809
17 past 88.176415 0.257901 4.835357 0.141809
18 past 88.176415 0.257901 5.238097 0.141809
19 past 88.176415 0.257901 5.535355 0.141809
20 past 88.176415 0.257901 6.479523 0.141809
21 past 88.176415 0.257901 7.727862 0.141809
22 past 88.176415 0.257901 8.046811 0.141809
23 past 94.037913 0.308439 8.541000 0.163651
24 past 101.630141 0.363136 8.416895 0.192256
25 past 101.630141 0.363136 6.531005 0.192256
26 past 101.630141 0.363136 6.397497 0.192256
27 past 101.630141 0.363136 6.500077 0.192256
28 past 101.630141 0.363136 7.088469 0.192256
29 past 101.630141 0.363136 7.821852 0.192256
30 past 101.630141 0.363136 8.011082 0.192256
31 past 101.037817 0.417099 8.279735 0.212376
32 past 88.176415 0.257901 3.416059 0.141809
33 past 88.176415 0.257901 4.835357 0.141809
34 past 88.176415 0.257901 5.238097 0.141809
35 past 88.176415 0.257901 5.535355 0.141809
36 past 88.176415 0.257901 6.479523 0.141809
37 past 88.176415 0.257901 7.727862 0.141809
38 past 88.176415 0.257901 8.046811 0.141809
39 past 94.037913 0.308439 8.541000 0.163651
40 past 101.630141 0.363136 8.416895 0.192256
41 past 101.630141 0.363136 6.531005 0.192256
42 past 101.630141 0.363136 6.397497 0.192256
43 past 101.630141 0.363136 6.500077 0.192256
44 past 101.630141 0.363136 7.088469 0.192256
45 past 101.630141 0.363136 7.821852 0.192256
46 past 101.630141 0.363136 8.011082 0.192256
47 past 101.037817 0.417099 8.279735 0.212376
This is how I plot it:
df.boxplot(by='season')
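How can I make sure the different panels get their own y-axis minimum and maximum? Also, how would I do this in seaborn?
OK, so the first thing you need is long-format data. Suppose you start with this: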
import numpy
import pandas
import seaborn
numpy.random.seed(0)
N = 100
seasons = ['winter', 'spring', 'summer', 'autumn']
df = pandas.DataFrame({
    'season': numpy.random.choice(seasons, size=N),
    'A': numpy.random.normal(4, 1.75, size=N),
    'B': numpy.random.normal(4, 4.5, size=N),
    'C': numpy.random.lognormal(0.5, 0.05, size=N),
    'D': numpy.random.beta(3, 1, size=N)
})
print(df.sample(7))
A B C D season
85 7.236212 5.044815 1.845659 0.550943 autumn
13 4.749581 1.014348 1.707000 0.630618 autumn
0 1.014027 4.750031 1.637803 0.285781 winter
3 3.233370 8.250158 1.516189 0.973797 winter
44 6.062864 -0.969725 1.564768 0.954225 autumn
43 7.317806 -3.209259 1.699684 0.968950 spring
39 5.576446 -2.187281 1.735002 0.436692 winter
You can use the pandas.melt function to convert it to long format:
lf = pandas.melt(df, value_vars=['A', 'B', 'C', 'D'], id_vars='season')
print(lf.sample(7))
season variable value
399 winter D 0.238061
227 spring C 1.656770
322 autumn D 0.933299
121 autumn B 4.393981
6 autumn A 1.175679
5 autumn A 5.360608
51 spring A 5.709118
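To see what the reshape does in miniature, here is a small sketch on a two-row frame. The frame `wide` and its values are made up for illustration; only `pandas.melt` and its `id_vars`/`value_vars` arguments come from the answer above.

```python
import pandas

# Hypothetical wide-format frame: one row per observation, one column per variable.
wide = pandas.DataFrame({
    'season': ['winter', 'summer'],
    'A': [1.0, 2.0],
    'B': [3.0, 4.0],
})

# melt keeps 'season' as an identifier and stacks A and B into
# a 'variable' column (the old column name) and a 'value' column.
tidy = pandas.melt(wide, id_vars='season', value_vars=['A', 'B'])

# 2 rows x 2 value columns -> 4 rows, 3 columns
print(tidy.shape)
print(tidy)
```

Each original row appears once per melted column, which is why the 100-row example above ends up with index values as high as 399.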
Then you can feed everything straight into seaborn.catplot (named seaborn.factorplot before seaborn 0.9):
fg = (
    pandas.melt(df, value_vars=['A', 'B', 'C', 'D'], id_vars='season')
    .pipe(
        (seaborn.catplot, 'data'),      # (<fxn>, <dataframe var>)
        kind='box',                     # type of plot we want
        x='season', order=seasons,      # x-values of the plots
        y='value', palette='BrBG_r',    # y-values and colors
        col='variable', col_wrap=2,     # 'A-D' in columns, wrap at 2nd col
        sharey=False,                   # tailor y-axes for each group
        notch=True, width=0.75,         # kwargs passed to boxplot
    )
)
This gives me a grid of box plots, one panel per variable, each with its own y-axis range.
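For the pandas half of the question, one option is to build the grid yourself with matplotlib and draw each column into its own axes, so that no y-axis is shared. This is a minimal sketch rather than the only way to do it: the data below is synthetic (it only reuses the question's `season` labels and column names), and the output file name is a placeholder.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy
import pandas

# Synthetic stand-in for the question's frame (values are NOT the asker's data).
rng = numpy.random.default_rng(0)
n = 48
df = pandas.DataFrame({
    'season': rng.choice(['current', 'past'], size=n),
    'A': rng.normal(90, 30, size=n),
    'B': rng.normal(0.3, 0.1, size=n),
    'C': rng.normal(5, 3, size=n),
    'D': rng.normal(0.15, 0.05, size=n),
})

columns = ['A', 'B', 'C', 'D']

# sharey=False gives every panel an independent y-axis.
fig, axes = plt.subplots(nrows=2, ncols=2, sharey=False, figsize=(8, 6))
for ax, col in zip(axes.flat, columns):
    # Draw one column per axes, grouped by season.
    df.boxplot(column=col, by='season', ax=ax)

fig.suptitle('')        # clear the "Boxplot grouped by season" title pandas adds
fig.tight_layout()
fig.savefig('boxplots_by_season.png')  # placeholder file name
```

Because each axes autoscales to its own column, a wide-ranging column such as A no longer flattens the boxes of B and D. If you need fixed limits on one panel, `ax.set_ylim(low, high)` on the corresponding axes does that.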