Converting irregular dates in R and the Tidyverse
I have a series of dates like this:
25 September 2019
27 April 2020
1994
28 February 2021
1986
Now I would like to convert 1994 and 1986 to:
01 January 1994
01 January 1986
The other, complete dates should stay as they are.
Any help is appreciated, especially a tidyverse approach.
Given a vector d of dates and years:
> d
[1] "25 September 2019" "27 April 2020" "1994"
[4] "28 February 2021" "1986"
Replace every entry that is exactly 4 characters long with that same entry prefixed by "01 January ":
> d[nchar(d)==4] = paste0("01 January ",d[nchar(d)==4])
which gives:
> d
[1] "25 September 2019" "27 April 2020" "01 January 1994"
[4] "28 February 2021" "01 January 1986"
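The length test above treats any 4-character string as a year, so an entry such as "n/a?" would also get the prefix. As a minimal sketch, assuming year-only entries always consist of exactly four digits, the selection can be made stricter with grepl(). The name year_only is only illustrative:

```r
# Same data as above
d <- c("25 September 2019", "27 April 2020", "1994",
       "28 February 2021", "1986")

# TRUE only for entries made of exactly four digits
year_only <- grepl("^[0-9]{4}$", d)

# Prefix only those entries; paste() inserts the separating space
d[year_only] <- paste("01 January", d[year_only])

d
```

Full dates and any other non-year strings are left untouched because they do not match the pattern.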
A regex solution: the anchors ^ (start of string) and $ (end of string) identify the year-only values, and the backreference \\1 recalls the captured year in the replacement:
library(dplyr)
df %>%
  mutate(dates = sub("^(\\d{4})$", "01 January \\1", dates))
dates
1 25 September 2019
2 27 April 2020
3 01 January 1994
4 28 February 2021
5 01 January 1986
In base R:
df$dates <- sub("^(\\d{4})$", "01 January \\1", df$dates)
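To see why the anchors matter, here is a small sketch comparing an unanchored pattern with the anchored one on a two-element vector. The names x, unanchored and anchored are made up for this illustration:

```r
x <- c("25 September 2019", "1994")

# Without anchors, the four digits inside a full date match as well,
# so the prefix is inserted in the middle of the string.
unanchored <- sub("(\\d{4})", "01 January \\1", x)

# With ^ and $, the whole string must consist of exactly four digits,
# so only the year-only entry is changed.
anchored <- sub("^(\\d{4})$", "01 January \\1", x)

print(unanchored)
print(anchored)
```

The unanchored version turns the first element into "25 September 01 January 2019", while the anchored version leaves it unchanged.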
Data:
df <- data.frame(
dates = c("25 September 2019",
"27 April 2020",
"1994",
"28 February 2021",
"1986")
)
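Since the question asks for a tidyverse approach, the same replacement can probably also be written with stringr::str_replace() inside mutate(). This is only a sketch under the assumption that dplyr and stringr are installed; the name result is hypothetical:

```r
library(dplyr)
library(stringr)

df <- data.frame(
  dates = c("25 September 2019",
            "27 April 2020",
            "1994",
            "28 February 2021",
            "1986"),
  stringsAsFactors = FALSE
)

# str_replace() takes the string first, then the pattern and the replacement;
# \\1 refers to the captured four-digit year.
result <- df %>%
  mutate(dates = str_replace(dates, "^(\\d{4})$", "01 January \\1"))

result
```

The column stays a character vector here. If real date values are needed afterwards, a parser such as lubridate::dmy() could be applied to it, keeping in mind that month-name parsing can depend on the locale.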