Pandas groupby: How to sort weekdays in the correct order when creating groupby with two columns?
The following dataframe contains one value (in kWh) for every hour of a year.
cons2016.head()
Date Hour kWh Month Weekday
0 2016-01-01 00:00 71.48 January Friday
1 2016-01-01 01:00 65.32 January Friday
2 2016-01-01 02:00 65.38 January Friday
3 2016-01-01 03:00 62.44 January Friday
4 2016-01-01 04:00 57.56 January Friday
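I want to create a Seaborn heatmap from this dataframe, with the weekdays in the correct order on the vertical axis and the hours on the horizontal axis. So I group by both columns:
weekdayhour = cons2016.groupby(["Weekday", "Hour"]).mean(numeric_only=True)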
weekdayhour = weekdayhour.reset_index()
weekdayhour.head()
Weekday Hour kWh
0 Friday 00:00 61.188113
1 Friday 01:00 57.231698
2 Friday 02:00 55.818679
3 Friday 03:00 55.074151
4 Friday 04:00 55.049811
But now the weekdays are sorted alphabetically (in the heatmap as well):
heat_weekdayhour = weekdayhour.pivot(index="Weekday", columns="Hour", values="kWh")
sns.heatmap(heat_weekdayhour)
How can I get the weekdays in their normal order, from Monday to Sunday? I tried adding .reindex like this:
weekdays = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
weekdayhour = cons2016.groupby(["Weekday", "Hour"]).mean().reindex(labels=weekdays)
But this gives me TypeError: Expected tuple, got str
Thanks for your help!
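A note on the failed attempt: grouping by two columns produces a MultiIndex whose labels are (Weekday, Hour) tuples, so a flat list of day names does not match those labels. One way around this, sketched below on a small made-up stand-in for cons2016 (the values and the name cons are assumptions for illustration), is to reindex after pivoting, when Weekday is an ordinary single-level index:

```python
import pandas as pd

weekdays = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical miniature of cons2016: two hours for each of three days
cons = pd.DataFrame({
    "Weekday": ["Friday", "Friday", "Monday", "Monday", "Sunday", "Sunday"],
    "Hour": ["00:00", "01:00", "00:00", "01:00", "00:00", "01:00"],
    "kWh": [71.0, 65.0, 50.0, 48.0, 60.0, 58.0],
})

weekdayhour = cons.groupby(["Weekday", "Hour"])["kWh"].mean().reset_index()

# After the pivot, the row index holds plain weekday strings,
# so reindex accepts the flat list of day names.
heat_weekdayhour = weekdayhour.pivot(index="Weekday", columns="Hour", values="kWh")
heat_weekdayhour = heat_weekdayhour.reindex(weekdays)

print(heat_weekdayhour)
```

Days that are absent from the data show up as rows of NaN after the reindex; with a full year of hourly data every weekday is present, so that should not occur.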
Use Categorical:
weekdays = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
weekdayhour.Weekday = pd.Categorical(weekdayhour.Weekday, categories=weekdays, ordered=True)
weekdayhour = weekdayhour.sort_values('Weekday')
Weekday Hour kWh
0 Friday 00:00 71.48
1 Friday 01:00 65.32
2 Friday 02:00 65.38
3 Friday 03:00 62.44
4 Friday 04:00 57.56
More info:
weekdayhour.Weekday
0 Friday
1 Friday
2 Friday
3 Friday
4 Friday
Name: Weekday, dtype: category
Categories (7, object): [Monday < Tuesday < Wednesday < Thursday < Friday < Saturday < Sunday]
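To show how the Categorical column carries through to the heatmap input, here is a minimal end-to-end sketch. The small frame cons and its values are made up for illustration; the real data would be cons2016. It assumes that pivot orders a categorical index by category order rather than alphabetically, which is the behaviour I would expect from current pandas.

```python
import pandas as pd

weekdays = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical miniature of cons2016: two hours for each of three days
cons = pd.DataFrame({
    "Weekday": ["Friday", "Friday", "Monday", "Monday", "Sunday", "Sunday"],
    "Hour": ["00:00", "01:00", "00:00", "01:00", "00:00", "01:00"],
    "kWh": [71.0, 65.0, 50.0, 48.0, 60.0, 58.0],
})

weekdayhour = cons.groupby(["Weekday", "Hour"])["kWh"].mean().reset_index()

# Convert the weekday strings to an ordered categorical, then sort by it
weekdayhour["Weekday"] = pd.Categorical(
    weekdayhour["Weekday"], categories=weekdays, ordered=True
)
weekdayhour = weekdayhour.sort_values(["Weekday", "Hour"])

# The pivoted rows follow the category order, not the alphabet
heat_weekdayhour = weekdayhour.pivot(index="Weekday", columns="Hour", values="kWh")
print(list(heat_weekdayhour.index))
```

The resulting heat_weekdayhour can then be passed to sns.heatmap as in the question.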
import pandas as pd
# First, list the days in the order you want them
days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
# Then use pd.Categorical() to impose that order on the column
df["DOTW_Appointment"] = pd.Categorical(df["DOTW_Appointment"], categories=days, ordered=True)
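As a rough illustration of what ordered=True buys you, the sketch below applies the same conversion to a tiny made-up frame (the four sample rows are assumptions, not data from the question). Sorting then follows the list order, and comparisons such as "earlier than Wednesday" become possible.

```python
import pandas as pd

days = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Hypothetical sample data
df = pd.DataFrame({"DOTW_Appointment": ["Sunday", "Tuesday", "Monday", "Friday"]})
df["DOTW_Appointment"] = pd.Categorical(
    df["DOTW_Appointment"], categories=days, ordered=True
)

# Sorting uses the category order instead of alphabetical order
sorted_days = df.sort_values("DOTW_Appointment")["DOTW_Appointment"].tolist()

# Ordered categoricals support comparisons against a category value
before_wednesday = df[df["DOTW_Appointment"] < "Wednesday"]["DOTW_Appointment"].tolist()

print(sorted_days)
print(before_wednesday)
```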