How to convert months to factors while keeping the months in calendar order?

I have a raw data frame (df) containing about 10 years of data (1994-2003). The output of head(df) is shown below:

Sl.no       Date Year Month Season            val1            val2     val3
1     1 1993-12-01 1993   Dec Winter          21.0            16.0      3.0
2     2 1994-01-01 1994   Jan Winter          21.0            15.5      0.0
3     3 1994-02-01 1994   Feb Winter          21.0            18.5      0.0
4     4 1994-03-01 1994   Mar Spring          30.0            24.0      1.9
5     5 1994-04-01 1994   Apr Spring          35.5            27.0      0.5
6     6 1994-05-01 1994   May Spring          36.0            30.0      1.5

Because I want to convert the months to a factor so that I can draw box plots, I used:

df$Month <- as.factor(format(df$Date, "%b"))
levels(df$Month) <- c("Jan","Feb","Mar", "Apr", "May", "Jun", "Jul",
"Aug", "Sep", "Oct", "Nov", "Dec")

But the output looks like this (the months no longer follow the sequence in the original df):

Sl.no       Date Year Month Season          val1             val2      val3
1     1 1993-12-01 1993   Mar Winter          21.0            16.0      3.0
2     2 1994-01-01 1994   May Winter          21.0            15.5      0.0
3     3 1994-02-01 1994   Apr Winter          21.0            18.5      0.0
4     4 1994-03-01 1994   Aug Spring          30.0            24.0      1.9
5     5 1994-04-01 1994   Jan Spring          35.5            27.0      0.5
6     6 1994-05-01 1994   Sep Spring          36.0            30.0      1.5

In the df above, notice that the months are scrambled; they should follow the dates in sequence.

How can I fix this? Any help is much appreciated. Kind regards

Use

df$Month <- factor(format(df$Date, "%b"), month.abb, ordered = TRUE)
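One caveat with the line above: format(df$Date, "%b") returns month names in the current locale, while month.abb is always English, so in a non-English locale the labels may not match the levels and become NA. Here is a minimal sketch of a locale-independent variant. The small data frame is a hypothetical stand-in for the first six rows of df, and month_num is a helper name introduced only for this illustration:

```r
# Hypothetical stand-in for the first six rows of df (only Date and val1).
df <- data.frame(
  Date = as.Date(c("1993-12-01", "1994-01-01", "1994-02-01",
                   "1994-03-01", "1994-04-01", "1994-05-01")),
  val1 = c(21.0, 21.0, 21.0, 30.0, 35.5, 36.0)
)

# Index month.abb by month number so the labels do not depend on the
# locale that format(df$Date, "%b") would use.
month_num <- as.integer(format(df$Date, "%m"))
df$Month <- factor(month.abb[month_num], levels = month.abb)

as.character(df$Month)   # labels still match the dates
levels(df$Month)         # calendar order, Jan through Dec

# Boxes are drawn in level order; months without data stay empty.
boxplot(val1 ~ Month, data = df)
```

The key point is that the levels are fixed when the factor is created, rather than being renamed afterwards.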

A demonstration of the problem you are facing:

set.seed(1)
M <- sample(month.abb, 20, TRUE)
M
#  [1] "Apr" "May" "Jul" "Nov" "Mar" "Nov" "Dec" "Aug" "Aug" "Jan" "Mar" "Mar" "Sep" "May"
# [15] "Oct" "Jun" "Sep" "Dec" "May" "Oct"

your_attempt <- as.factor(M)
your_attempt
#  [1] Apr May Jul Nov Mar Nov Dec Aug Aug Jan Mar Mar Sep May Oct Jun Sep Dec May Oct
# Levels: Apr Aug Dec Jan Jul Jun Mar May Nov Oct Sep

## At this step, you're basically asking R to replace "Apr" with "Jan",
##   "Aug" with "Feb", and so on. Not what you're looking for....
levels(your_attempt) <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", 
                          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

your_attempt
#  [1] Jan Aug May Sep Jul Sep Mar Feb Feb Apr Jul Jul Nov Aug Oct Jun Nov Mar Aug Oct
# Levels: Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec

## ordered = TRUE not necessarily required. Depends on what you want to do
new_attempt <- factor(M, levels = month.abb, ordered = TRUE)
new_attempt
#  [1] Apr May Jul Nov Mar Nov Dec Aug Aug Jan Mar Mar Sep May Oct Jun Sep Dec May Oct
# Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < Oct < Nov < Dec
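To illustrate the remark about ordered = TRUE, here is a small sketch comparing the two kinds of factor. The sample vector and the names plain_f and ordered_f are made up for this example. For plotting and tabulation the order of the levels is what matters, so an unordered factor is usually enough; ordered = TRUE additionally allows comparisons between levels.

```r
M <- c("Apr", "Dec", "Jan", "Apr", "Sep")   # small hand-written sample

plain_f   <- factor(M, levels = month.abb)                  # unordered
ordered_f <- factor(M, levels = month.abb, ordered = TRUE)  # ordered

# Both tabulate in calendar order, including months with a zero count.
table(plain_f)

# Only the ordered factor supports comparisons between levels.
ordered_f < "Jun"
# [1]  TRUE FALSE  TRUE  TRUE FALSE

# Drop unused months if they are not wanted on a plot axis;
# the remaining levels keep their calendar order.
levels(droplevels(plain_f))
# [1] "Jan" "Apr" "Sep" "Dec"
```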

The month() function in the lubridate package will handle this for you.

library(lubridate)
df$Month <- month(df$Date, label=TRUE, abbr=TRUE)
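A quick way to check what month() returns, assuming lubridate is installed. The dates vector is a made-up sample, not the original df:

```r
library(lubridate)

dates <- as.Date(c("1993-12-01", "1994-01-01", "1994-02-01"))
m <- month(dates, label = TRUE, abbr = TRUE)

m                # labels follow the dates; spelling depends on the locale
is.ordered(m)    # TRUE: with label = TRUE the result is an ordered factor
nlevels(m)       # 12: all months are levels, in calendar order
```

Because the result is already an ordered factor with all twelve months as levels, box plots built from it appear in calendar order without any further releveling.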