How to split a date and time column into separate day, month, hour, minute, second, and day-of-week columns in R?
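I'm trying to split a date and time column into separate day, month, hour, minute, second, and day-of-week columns. I'm using lubridate and the mutate function, but when I run the code below I get this warning: "Warning message: All formats failed to parse. No formats found."
My new columns are created, but they all contain NA. Can anyone help?
My column looks like this: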
tpep_pickup_datetime
01/07/2019 00:51:15
01/07/2019 00:46:30
01/07/2019 00:25:35
My code looks like this:
taxidata3 <- taxidata2 %>%
mutate(tpep_pickup_datetime = mdy_hms(tpep_pickup_datetime),
day = day(tpep_pickup_datetime),
month = month(tpep_pickup_datetime),
year = year(tpep_pickup_datetime),
dayofweek = wday(tpep_pickup_datetime),
hour = hour(tpep_pickup_datetime),
minute = minute(tpep_pickup_datetime),
second = second(tpep_pickup_datetime))
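Based on the OP's comment, the date format is day/month/... rather than month/day/..., so we need dmy_hms here. Each letter in the function name gives the order in which the components appear in the string: d (day), m (month), y (year), then h (hour), m (minute), s (second).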
library(lubridate)
library(dplyr)
taxidata3 <- taxidata2 %>%
mutate(tpep_pickup_datetime = dmy_hms(tpep_pickup_datetime),
day = day(tpep_pickup_datetime),
month = month(tpep_pickup_datetime),
year = year(tpep_pickup_datetime),
dayofweek = wday(tpep_pickup_datetime),
hour = hour(tpep_pickup_datetime),
minute = minute(tpep_pickup_datetime),
second = second(tpep_pickup_datetime))
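As a quick check of the fix above, here is a minimal sketch that parses the three sample values from the question and pulls out the components. The vector name x and the extra value with day 13 are illustrative assumptions and not part of the OP's data. The last two lines show one way the warning in the question can arise: a day greater than 12 cannot be read as a month.

```r
library(lubridate)

# Sample values from the question (day/month/year order)
x <- c("01/07/2019 00:51:15", "01/07/2019 00:46:30", "01/07/2019 00:25:35")

parsed <- dmy_hms(x)          # POSIXct, UTC unless tz is given

day(parsed)                   # day of month
month(parsed)                 # month number
year(parsed)
hour(parsed)
minute(parsed)
second(parsed)
wday(parsed)                  # numeric; 1 = Sunday with the default week start
wday(parsed, label = TRUE)    # weekday name as an ordered factor (locale-dependent)

# Hypothetical extra value: 13 is not a valid month, so month-first
# parsing gives NA (with a warning), while day-first parsing succeeds
bad <- suppressWarnings(mdy_hms("13/07/2019 00:51:15"))
good <- dmy_hms("13/07/2019 00:51:15")
```

If weekday names are more useful than numbers in the final data frame, wday(tpep_pickup_datetime, label = TRUE) can be used inside the same mutate call.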
Here is a stringr solution that uses regular expressions to match the date components.
Data:
df <- data.frame(
tpep_pickup_datetime = c("01/07/2019 00:51:15", "01/07/2019 00:46:30", "01/07/2019 00:25:35")
)
Solution:
library(stringr)
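# Backslashes must be doubled inside R string literals
df$day <- str_extract(df$tpep_pickup_datetime, "^\\d{2}")               # two digits at the start
df$month <- str_extract(df$tpep_pickup_datetime, "(?<=/)\\d{2}")        # first two digits after a slash
df$year <- str_extract(df$tpep_pickup_datetime, "\\d{4}")               # first run of four digits
df$hour <- str_extract(df$tpep_pickup_datetime, "(?<= )\\d{2}(?=:)")    # between the space and a colon
df$minute <- str_extract(df$tpep_pickup_datetime, "(?<=:)\\d{2}(?=:)")  # between two colons
df$second <- str_extract(df$tpep_pickup_datetime, "(?<=:)\\d{2}$")      # two digits at the end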
Result:
df
tpep_pickup_datetime day month year hour minute second
1 01/07/2019 00:51:15 01 07 2019 00 51 15
2 01/07/2019 00:46:30 01 07 2019 00 46 30
3 01/07/2019 00:25:35 01 07 2019 00 25 35
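Note that the columns in the result above are character strings ("01", "07", ...). If numeric columns are wanted, one possible variation is sketched below: it uses a single pattern with capture groups via str_match and converts each group with as.integer. The names parts and cols are illustrative assumptions, and the pattern assumes every value has exactly the dd/mm/yyyy hh:mm:ss layout shown in the question.

```r
library(stringr)

df <- data.frame(
  tpep_pickup_datetime = c("01/07/2019 00:51:15", "01/07/2019 00:46:30", "01/07/2019 00:25:35")
)

# str_match returns a character matrix: column 1 is the full match,
# columns 2 to 7 are the six capture groups
parts <- str_match(
  df$tpep_pickup_datetime,
  "^(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2}):(\\d{2}):(\\d{2})$"
)

cols <- c("day", "month", "year", "hour", "minute", "second")

# Convert each capture group to integer and attach it as a new column
df[cols] <- lapply(seq_along(cols), function(i) as.integer(parts[, i + 1]))

str(df)
```

Rows that do not match the pattern come back as NA in every new column, which makes malformed values easy to spot.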
Here is another option using the separate function. The code and output are as follows:
library(tidyverse)
df <- data.frame(
tpep_pickup_datetime = c("01/07/2019 00:51:15", "01/07/2019 00:46:30",
"01/07/2019 00:25:35"))
df %>%
separate(tpep_pickup_datetime, c("Day", "Month", "Year_time"),
sep = "/", remove = FALSE) %>%
separate(Year_time, c("Year", "Time"),
sep = " ", remove = TRUE) %>%
separate(Time, c("Hour", "Minute", "Second"),
sep = ":", remove = TRUE)
# tpep_pickup_datetime Day Month Year Hour Minute Second
#1 01/07/2019 00:51:15 01 07 2019 00 51 15
#2 01/07/2019 00:46:30 01 07 2019 00 46 30
#3 01/07/2019 00:25:35 01 07 2019 00 25 35