Boxplot that needs multiple groupby operations in Pandas
I am using pandas in a Jupyter notebook with Python.
I have the following dataset as a DataFrame:
Cars,Country,Type
1564,Australia,Stolen
200,Australia,Stolen
579,Australia,Stolen
156,Japan,Lost
900,Africa,Burnt
2000,USA,Stolen
1000,Indonesia,Stolen
900,Australia,Lost
798,Australia,Lost
128,Australia,Lost
200,Australia,Burnt
56,Australia,Burnt
348,Australia,Burnt
1246,USA,Burnt
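I want to know how to use a box plot to answer the following question: "Number of cars in Australia that were affected by each type". So basically, I should end up with 3 box plots (one per type) showing the number of cars affected in Australia.
Please keep in mind that this is a subset of the real dataset.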
You can select only the rows whose "Country" column equals "Australia" and group them by the "Type" column, as follows:
from io import StringIO  # Python 3 location; the standalone StringIO module was Python 2 only
import pandas as pd
text_string = StringIO(
"""
Cars,Country,Type,Score
1564,Australia,Stolen,1
200,Australia,Stolen,2
579,Australia,Stolen,3
156,Japan,Lost,4
900,Africa,Burnt,5
2000,USA,Stolen,6
1000,Indonesia,Stolen,7
900,Australia,Lost,8
798,Australia,Lost,9
128,Australia,Lost,10
200,Australia,Burnt,11
56,Australia,Burnt,12
348,Australia,Burnt,13
1246,USA,Burnt,14
""")
df = pd.read_csv(text_string, sep=",")
# Keep only the Australian rows, then draw one box of "Cars" per value of "Type"
ax = df.loc[df['Country'] == 'Australia'].boxplot(column='Cars', by='Type')
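If you want to check the numbers behind each box, the same filter can be combined with an explicit groupby. The following is a minimal sketch, assuming the three-column sample from the question; the names csv_text, australia and summary are only illustrative. With matplotlib's default settings, the 25%, 50% and 75% columns should correspond to the lower edge, median line and upper edge of each box.

```python
from io import StringIO

import pandas as pd

# Sample data copied from the question (a subset of the real dataset).
csv_text = """Cars,Country,Type
1564,Australia,Stolen
200,Australia,Stolen
579,Australia,Stolen
156,Japan,Lost
900,Africa,Burnt
2000,USA,Stolen
1000,Indonesia,Stolen
900,Australia,Lost
798,Australia,Lost
128,Australia,Lost
200,Australia,Burnt
56,Australia,Burnt
348,Australia,Burnt
1246,USA,Burnt
"""

df = pd.read_csv(StringIO(csv_text))

# Step 1: keep only the Australian rows.
australia = df[df["Country"] == "Australia"]

# Step 2: group those rows by Type and summarise the Cars column.
summary = australia.groupby("Type")["Cars"].describe()

# One row per Type, with the count, extremes and quartiles of Cars.
print(summary[["count", "min", "25%", "50%", "75%", "max"]])
```

Each row of summary describes one of the three boxes (Burnt, Lost, Stolen), so it is a quick way to confirm that the plot only contains Australian rows.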