Data reshape and grouping

I am new to R, please help. I have a data frame with 5 columns named Seasondate, V1, V2, V3 and V4. Seasondate contains mixed date formats across roughly 1000 observations, for example:

January to March 
August to October 
05/01/2013 to 10/30/2013
NA
February to June 
02/15/2013 to 06/19/2013

I want to bring them all into a single format, namely month to month.

Parsing with string functions would be much appreciated.

Edit 1:

They are all from 2013. Thanks.

Use as.Date and format to convert back and forth, then paste to put it all together again:

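## convert mm/dd/yyyy strings to full month names; leave anything else as is
datext <- function(x) {
  dates <- as.Date(x,format="%m/%d/%Y")
  ## if nothing parsed as a date, the piece is already a month name (or NA)
  if(all(is.na(dates))) x else format(dates,"%B")
}
## split each entry on " to ", convert both halves, then paste them back
vapply(
  lapply(strsplit(as.character(dat$Seasondate), " to "), datext), 
  paste, collapse=" to ", FUN.VALUE=character(1)
)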
#[1] "January to March"  "August to October" "May to October"    
#[4] "NA"                "February to June"  "February to June" 
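
Note that element 4 of the output above is the string "NA" rather than a missing value, because paste converts NA to text. The following is a minimal sketch of a variant that keeps missing values as NA and uses month.name instead of format(dates, "%B"), so the result does not depend on the locale. It assumes the column is called Seasondate as in the question and that each entry is either two month names or two mm/dd/yyyy dates. The helper name to_months and the small sample data frame are made up for illustration.

```r
## hypothetical sample data in the shape described in the question
dat <- data.frame(
  Seasondate = c("January to March ", "05/01/2013 to 10/30/2013",
                 NA, "02/15/2013 to 06/19/2013"),
  stringsAsFactors = FALSE
)

## to_months is an illustrative helper, not part of any package
to_months <- function(x) {
  ## keep missing entries missing instead of turning them into "NA"
  if (is.na(x)) return(NA_character_)
  parts <- strsplit(x, " to ", fixed = TRUE)[[1]]
  dates <- as.Date(parts, format = "%m/%d/%Y")
  ## nothing parsed as a date: assume it is already "Month to Month"
  if (all(is.na(dates))) return(trimws(x))
  ## month number -> English month name, independent of the locale
  paste(month.name[as.integer(format(dates, "%m"))], collapse = " to ")
}

dat$Seasondate <- vapply(as.character(dat$Seasondate), to_months,
                         character(1), USE.NAMES = FALSE)
dat$Seasondate
#[1] "January to March" "May to October"   NA                 "February to June"
```

Entries that mix a date with a month name (for example "05/01/2013 to October") are not handled by this sketch and would need an extra branch.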

Here is another idea that avoids date coercion and instead uses the month.name vector from base R.

## change the column to character
df$V1 <- as.character(df$V1)
## find the numeric values
g <- grepl("\\d", df$V1)
## split them, get the months, then apply 'month.name' and paste
df$V1[g] <- vapply(strsplit(df$V1[g], " to "), function(x) {
    paste(month.name[as.integer(sub("/.*", "", x))], collapse = " to ")
}, "")

This results in

df
                 V1
1  January to March
2 August to October
3    May to October
4              <NA>
5  February to June
6  February to June

Original data:

df <- structure(list(V1 = structure(c(5L, 3L, 2L, NA, 4L, 1L), .Label = c("02/15/2013 to 06/19/2013", 
"05/01/2013 to 10/30/2013", "August to October", "February to June", 
"January to March"), class = "factor")), .Names = "V1", class = "data.frame", row.names = c(NA, 
-6L))