Converting irregular dates in R and the Tidyverse

I have a series of dates like this:

25 September 2019
27 April 2020
1994
28 February 2021
1986

Now I want to convert 1994 and 1986 to:

01 January 1994
01 January 1986

The other dates, which are already complete, should stay as they are.

Any help is appreciated, especially a tidyverse approach.

Given a vector d of dates and years:

> d
[1] "25 September 2019" "27 April 2020"     "1994"             
[4] "28 February 2021"  "1986"             

Replace every entry that is exactly 4 characters long with those four characters prefixed by "01 January ":

> d[nchar(d)==4] = paste0("01 January ",d[nchar(d)==4])

giving:

> d
[1] "25 September 2019" "27 April 2020"     "01 January 1994"  
[4] "28 February 2021"  "01 January 1986"  
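Note that nchar(d)==4 matches any 4-character string, not only years. If the vector might contain other short entries, a slightly stricter variant can be sketched with grepl(), selecting only entries made of exactly four digits. The variable name year_only is just an illustrative choice, and the vector is the sample data from the question:

```r
# Sample vector from the question
d <- c("25 September 2019", "27 April 2020", "1994",
       "28 February 2021", "1986")

# TRUE only for entries that consist of exactly four digits
year_only <- grepl("^[0-9]{4}$", d)

# Prefix just those entries; everything else is left untouched
d[year_only] <- paste0("01 January ", d[year_only])
d
```

For the sample data this selects the same two entries as the nchar() test.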

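A regex solution: the anchors ^ (start of string) and $ (end of string) identify the year-only values, and the backreference \1 (written \\1 inside an R string) recalls the captured year in the replacement: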

library(dplyr)
df %>%
  # backslashes are doubled because R string literals need "\\" for one regex backslash
  mutate(dates = sub("^(\\d{4})$", "01 January \\1", dates))
              dates
1 25 September 2019
2     27 April 2020
3   01 January 1994
4  28 February 2021
5   01 January 1986

Base R:

df$dates <- sub("^(\\d{4})$", "01 January \\1", df$dates)
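Since the question asks for a tidyverse approach, the same replacement can also be sketched with stringr::str_replace() in place of base sub(). This is only an alternative spelling of the regex idea above, assuming the stringr package is installed; the name result is illustrative, and the data frame is repeated here so the snippet runs on its own:

```r
library(dplyr)
library(stringr)

# Same sample data as in the question
df <- data.frame(
  dates = c("25 September 2019", "27 April 2020", "1994",
            "28 February 2021", "1986")
)

# str_replace() takes the string first, then the pattern and the replacement;
# "\\1" refers to the four digits captured by the parentheses
result <- df %>%
  mutate(dates = str_replace(dates, "^(\\d{4})$", "01 January \\1"))

result
```

Because the pattern is anchored at both ends, strings that merely contain a four-digit year, such as "27 April 2020", do not match and are returned unchanged.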

Data:

df <- data.frame(
  dates = c("25 September 2019",
            "27 April 2020",
            "1994",
            "28 February 2021",
            "1986")
)