Unable to extract day, month and year into three separate columns from a dataframe's "Date" column in the format "dd/mm/yyyy" or "dd/m/yyyy"
I am trying with:
library(dplyr)
library(tidyr)
library(stringr)
# Dataframe has "Date" column and date in the format "dd/mm/yyyy" or "dd/m/yyyy"
df <- data.frame(Date = c("10/1/2001", "15/01/2010", "15/2/2010", "20/02/2010", "25/3/2010", "31/03/2010"))
# extract into three columns
df %>% extract(Date, c("Day", "Month", "Year"), "([^/]+), ([^/]+), ([^)]+)")
But the above code returns:
Day Month Year
1 <NA> <NA> <NA>
2 <NA> <NA> <NA>
3 <NA> <NA> <NA>
4 <NA> <NA> <NA>
5 <NA> <NA> <NA>
6 <NA> <NA> <NA>
How can I extract the day, month and year correctly, so that the result looks like this:
Day Month Year
1 10 1 2001
2 15 1 2010
3 15 2 2010
4 20 2 2010
5 25 3 2010
6 31 3 2010
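Your regex pattern is off: it expects a comma and a space between the groups, but the dates are separated by slashes. Use this version (note that backslashes must be doubled inside an R string):
df %>% extract(Date, c("Day", "Month", "Year"), "(\\d+)/(\\d+)/(\\d+)")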
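To see why the original pattern returns NA everywhere, here is a minimal base R sketch that runs both patterns against a few of the sample dates. The names dates, bad, good and parts are only illustrative, and regexec/regmatches stand in for what extract does with the capture groups; this is not how tidyr is implemented internally.

```r
# Base R only; `dates`, `bad`, `good` and `parts` are illustrative names.
dates <- c("10/1/2001", "15/01/2010", "31/03/2010")

bad  <- "([^/]+), ([^/]+), ([^)]+)"   # pattern from the question
good <- "(\\d+)/(\\d+)/(\\d+)"        # corrected pattern, backslashes doubled

grepl(bad, dates)    # no string contains ", ", so nothing matches
grepl(good, dates)   # every string matches

# Pull out the three capture groups; the first element of each match
# is the full match, so it is dropped with x[-1].
m <- regmatches(dates, regexec(good, dates))
parts <- do.call(rbind, lapply(m, function(x) x[-1]))
colnames(parts) <- c("Day", "Month", "Year")
parts
```

The resulting matrix is still character, so a month such as "01" keeps its leading zero until it is converted to a number.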
In this case separate may be easier to use:
df %>%
  separate("Date", into=c("Day","Month","Year"), sep="/") %>%
  mutate(Month=str_replace(Month, "^0",""))
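This keeps everything as character values. If you want the values to be numeric, use convert=TRUE, which also drops the leading zero from the month: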
df %>%
separate("Date", into=c("Day","Month","Year"), sep="/", convert=TRUE)
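We can use lubridate:
library(lubridate)
library(dplyr)
df %>%
  mutate(Date = dmy(Date), # parse the character column as a Date (day/month/year order)
         across(Date, list(year = year, month = month, day = day))) # funs() is deprecated; a named list yields Date_year, Date_month, Date_day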
Date Date_year Date_month Date_day
1 2001-01-10 2001 1 10
2 2010-01-15 2010 1 15
3 2010-02-15 2010 2 15
4 2010-02-20 2010 2 20
5 2010-03-25 2010 3 25
6 2010-03-31 2010 3 31
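If you would rather not add a dependency, the same parse-then-extract idea can be sketched in base R with as.Date and format. This is only an illustration of the approach, not a replacement for the lubridate answer; dates, parsed and out are made-up names.

```r
# Base R sketch; `dates`, `parsed` and `out` are illustrative names.
dates <- c("10/1/2001", "15/01/2010", "15/2/2010",
           "20/02/2010", "25/3/2010", "31/03/2010")

# %d and %m accept one- or two-digit values on input
parsed <- as.Date(dates, format = "%d/%m/%Y")

# format() returns zero-padded strings, as.integer() strips the padding
out <- data.frame(
  Day   = as.integer(format(parsed, "%d")),
  Month = as.integer(format(parsed, "%m")),
  Year  = as.integer(format(parsed, "%Y"))
)
out

# Strings that are not real calendar dates become NA
as.Date("31/02/2010", format = "%d/%m/%Y")
```

Parsing to a real Date first has the side benefit that impossible dates show up as NA, which plain string splitting would not catch.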
We can use read.table from base R:
read.table(text = df$Date, sep="/", header = FALSE,
col.names = c("Day", "Month", "Year"))
Day Month Year
1 10 1 2001
2 15 1 2010
3 15 2 2010
4 20 2 2010
5 25 3 2010
6 31 3 2010