Parsing "january 2020" strings to dates returns NA for half of the values in R
I have a data frame with 1968 observations, and I am trying to parse its date column, converting it from strings to dates. The strings look like this:
df$date <- c("january 2020","january 2020","january 2020","february 2020","february 2020","february 2020","march 2020","march 2020","march 2020","april 2020","april 2020","april 2020","May 2020","May 2020","May 2020","june 2020","june 2020","june 2020")
I am using the lubridate package:
date <- my(df$date)
This gives an "857 failed to parse" warning and returns a vector like this:
[1] NA NA NA NA NA NA 2020-03-01 2020-03-01 2020-03-01 NA NA NA NA NA NA 2020-06-01 2020-06-01 2020-06-01 2020-06-01
This is the date format I want (ymd), but I need every observation to be parsed. I also tried:
date <- as.Date(df$date)
date <- my(df$date, format = "%B %Y")
But both of these return NA for every observation. What is going on?
Thanks
as.Date(paste(1, df$date), '%d %B %Y')
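The one-liner above prepends a day because as.Date() returns NA when the string has no day. One possible cause of the partial failure, offered as an assumption rather than a confirmed diagnosis, is that %B matches month names in the session's LC_TIME locale, so English names can fail in a non-English session. A minimal sketch for checking this and for parsing under the "C" locale (the vector x is illustrative sample data):

```r
# Which locale drives month-name matching in this session?
Sys.getlocale("LC_TIME")

# Month names that %B will match in this session
format(ISOdate(2020, 1:12, 1), "%B")

# Temporarily switch to the "C" locale, which uses English month names
old_locale <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")

x <- c("january 2020", "May 2020", "june 2020")
# Prepend day 1 so that the string describes a complete date
parsed <- as.Date(paste(1, x), format = "%d %B %Y")

# Restore the original locale
Sys.setlocale("LC_TIME", old_locale)
parsed
```

If the second command prints month names in another language, that would be consistent with only some of the English names being recognised.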
my() from the lubridate package should work like this:
library(dplyr)
library(lubridate)
# The sample data below stores the strings in the column my_date
df %>%
  mutate(my_date = my(my_date))
1 2020-01-01
2 2020-01-01
3 2020-01-01
4 2020-02-01
5 2020-02-01
6 2020-02-01
7 2020-03-01
8 2020-03-01
9 2020-03-01
10 2020-04-01
11 2020-04-01
12 2020-04-01
13 2020-05-01
14 2020-05-01
15 2020-05-01
16 2020-06-01
17 2020-06-01
18 2020-06-01
Or we can use parse_date_time() from lubridate:
format(lubridate::parse_date_time(df$my_date, orders = c("m/Y")), "%m-%Y")
[1] "01-2020" "01-2020" "01-2020" "02-2020" "02-2020" "02-2020" "03-2020"
[8] "03-2020" "03-2020" "04-2020" "04-2020" "04-2020" "05-2020" "05-2020"
[15] "05-2020" "06-2020" "06-2020" "06-2020"
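A locale-independent alternative, sketched under the assumption that every value has the form "<English month name> <four-digit year>": look the month up in base R's built-in month.name vector, which is always English. parse_month_year is a hypothetical helper name, not part of lubridate, and x is illustrative sample data.

```r
# Hypothetical helper: parse "month year" strings without using the locale
parse_month_year <- function(x) {
  # Split each string on whitespace into its month and year parts
  parts <- strsplit(trimws(x), "\\s+")
  month_txt <- vapply(parts, `[`, character(1), 1)
  year_txt <- vapply(parts, `[`, character(1), 2)

  # Case-insensitive lookup against the English month names (1 to 12)
  month_idx <- match(tolower(month_txt), tolower(month.name))

  # Build an ISO date string for the first day of the month;
  # unmatched months yield NA instead of an error
  iso <- sprintf("%s-%02d-01", year_txt, month_idx)
  as.Date(iso, format = "%Y-%m-%d")
}

x <- c("january 2020", "May 2020", "june 2020", "notamonth 2020")
result <- parse_month_year(x)
result
```

Values whose month is not recognised come back as NA, so is.na(result) shows which rows need attention.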
Data:
df <- structure(list(my_date = c("january 2020", "january 2020", "january 2020",
"february 2020", "february 2020", "february 2020", "march 2020",
"march 2020", "march 2020", "april 2020", "april 2020", "april 2020",
"May 2020", "May 2020", "May 2020", "june 2020", "june 2020",
"june 2020")), class = "data.frame", row.names = c(NA, -18L))