Split Date-Time column (containing a character) into two separate columns in R

Question

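I have a dataset with a combined date-time column that I want to split into separate year, month, day, and time columns. I normally use the lubridate library with the appropriate arguments, but this particular column also has the character T in every row.

How can I split this column while removing the character T from each row?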

Date_Time
2020-01-01T00:48:00  
2020-01-01T00:46:00
2020-01-02T15:07:00
2020-01-02T15:07:00

Answer 1

You can use tidyr::separate, splitting on either a hyphen or the letter T:

tidyr::separate(df, Date_Time, c('Year', 'Month', 'Day', 'Time'), sep = '[T-]')

#  Year Month Day     Time
#1 2020    01  01 00:48:00
#2 2020    01  01 00:46:00
#3 2020    01  02 15:07:00
#4 2020    01  02 15:07:00

Alternatively, convert Date_Time to POSIXct and then extract the date and time components.

library(dplyr)
library(lubridate)


df %>%
  mutate(Date_Time  = ymd_hms(Date_Time), 
         Year = year(Date_Time), 
         Month = month(Date_Time), 
         Day = day(Date_Time),
         Time = format(Date_Time, '%T'))

Answer 2

Base R solution:

cbind(
  df, 
  strcapture(
    pattern = "^(\\d{4})-(\\d{2})-(\\d{2})T(.*)$",
    x = df$Date_Time,
    proto = list(
      year = integer(), 
      month = integer(), 
      day = integer(),
      time = character()
    )
  )
)
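
As a further base R sketch (not part of the original answer), the column can also be parsed with as.POSIXct by placing a literal T in the format string, after which each component is pulled out with format(). The sample data frame below is reconstructed from the rows shown in the question, and the UTC time zone is an assumption, since the question does not state one.

```r
# Sample data reconstructed from the question (assumed to be character, not factor)
df <- data.frame(
  Date_Time = c("2020-01-01T00:48:00", "2020-01-01T00:46:00",
                "2020-01-02T15:07:00", "2020-01-02T15:07:00"),
  stringsAsFactors = FALSE
)

# The literal T in the format string matches the separator in the data.
# tz = "UTC" is an assumption; adjust it if the timestamps are local times.
parsed <- as.POSIXct(df$Date_Time, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")

# Extract each component from the parsed date-time
df$Year  <- as.integer(format(parsed, "%Y"))
df$Month <- as.integer(format(parsed, "%m"))
df$Day   <- as.integer(format(parsed, "%d"))
df$Time  <- format(parsed, "%H:%M:%S")

print(df)
```

Parsing first, rather than splitting the string, means malformed rows come back as NA instead of silently producing misaligned columns.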
