Year sequence with group
I want to create a new column, Year, holding the sequence of years from 2003 to 2006 for every group.
# dt
NAME ID col3
AAA 1 SB
ABC 2 LA
CCC 3 AL
What I want is:
NAME ID col3 Year
AAA 1 SB 2003
AAA 1 SB 2004
AAA 1 SB 2005
AAA 1 SB 2006
ABC 2 LA 2003
ABC 2 LA 2004
ABC 2 LA 2005
ABC 2 LA 2006
CCC 3 AL 2003
CCC 3 AL 2004
CCC 3 AL 2005
CCC 3 AL 2006
I tried this:
dt[rep(1:.N, 4)][, Year := seq(2003, 2006), by = .(NAME, ID)]
This gives me the result I want. Is there a better solution?
A tidyverse-based solution
dt <- data.frame("NAME"= c("AAA","BBB","CCC"),
"ID"= c(1,2,3),
"col3" = c("SB","LA","AL"))
library(tidyverse)
#> Warning: package 'tibble' was built under R version 3.5.2
dt %>%
group_by(NAME,ID,col3) %>%
expand(Year = seq(2003, 2006))
#> # A tibble: 12 x 4
#> # Groups: NAME, ID, col3 [3]
#> NAME ID col3 Year
#> <fct> <dbl> <fct> <int>
#> 1 AAA 1 SB 2003
#> 2 AAA 1 SB 2004
#> 3 AAA 1 SB 2005
#> 4 AAA 1 SB 2006
#> 5 BBB 2 LA 2003
#> 6 BBB 2 LA 2004
#> 7 BBB 2 LA 2005
#> 8 BBB 2 LA 2006
#> 9 CCC 3 AL 2003
#> 10 CCC 3 AL 2004
#> 11 CCC 3 AL 2005
#> 12 CCC 3 AL 2006
Created on 2019-01-24 by the reprex package (v0.2.1)
See the expand() documentation.
Using data.table, you can do:
dt[, .(Year = seq(2003, 2006)), by = .(NAME, ID, col3)]
# NAME ID col3 Year
#1: AAA 1 SB 2003
#2: AAA 1 SB 2004
#3: AAA 1 SB 2005
#4: AAA 1 SB 2006
#5: ABC 2 LA 2003
#6: ABC 2 LA 2004
#7: ABC 2 LA 2005
#8: ABC 2 LA 2006
#9: CCC 3 AL 2003
#10: CCC 3 AL 2004
#11: CCC 3 AL 2005
#12: CCC 3 AL 2006
Here the .(...) expression is shorthand for list(...), passed as the j argument.
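To make the shorthand concrete, here is a small sketch, assuming the sample data from the question, that runs the same query once with .(...) and once with list(...) and compares the two results. The names res_dot and res_list are only illustrative.

```r
library(data.table)

# Sample data matching the question (rebuilt here so the block is self-contained)
dt <- data.table(NAME = c("AAA", "ABC", "CCC"),
                 ID   = 1:3,
                 col3 = c("SB", "LA", "AL"))

# .() form, as used in the answer
res_dot <- dt[, .(Year = seq(2003, 2006)), by = .(NAME, ID, col3)]

# The same query written with list() in both j and by
res_list <- dt[, list(Year = seq(2003, 2006)), by = list(NAME, ID, col3)]

# Compare the two results; all.equal() has a data.table method
same <- isTRUE(all.equal(res_dot, res_list))
print(same)
```

Because j returns a list with a single length-4 column, each group is expanded to four rows and the grouping columns are recycled alongside it.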
Sample data
library(data.table)
dt <- fread("NAME ID col3
AAA 1 SB
ABC 2 LA
CCC 3 AL")
Here is another option, using crossing:
library(tidyr)
crossing(dt, Year = 2003:2006)
# NAME ID col3 Year
#1 AAA 1 SB 2003
#2 AAA 1 SB 2004
#3 AAA 1 SB 2005
#4 AAA 1 SB 2006
#5 BBB 2 LA 2003
#6 BBB 2 LA 2004
#7 BBB 2 LA 2005
#8 BBB 2 LA 2006
#9 CCC 3 AL 2003
#10 CCC 3 AL 2004
#11 CCC 3 AL 2005
#12 CCC 3 AL 2006
Data
dt <- structure(list(NAME = structure(1:3, .Label = c("AAA", "BBB",
"CCC"), class = "factor"), ID = c(1, 2, 3), col3 = structure(3:1, .Label = c("AL",
"LA", "SB"), class = "factor")), class = "data.frame", row.names = c(NA,
-3L))
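crossing() pairs every row of dt with every Year value, which is a Cartesian product (cross join). As a rough sketch of the same idea without extra packages, base R's merge() returns the Cartesian product when by has length zero. The helper name years and the final sort are assumptions added here for illustration, not part of the answer above.

```r
# Same data as above, built as a plain data.frame with character columns
dt <- data.frame(NAME = c("AAA", "BBB", "CCC"),
                 ID   = c(1, 2, 3),
                 col3 = c("SB", "LA", "AL"),
                 stringsAsFactors = FALSE)

# One-column data.frame holding the years to repeat for every row (illustrative name)
years <- data.frame(Year = 2003:2006)

# by = NULL means there are no key columns, so merge() returns every row pairing
res <- merge(dt, years, by = NULL)

# merge() varies the rows of dt fastest, so sort to group each NAME together
res <- res[order(res$NAME, res$Year), ]
rownames(res) <- NULL

print(res)
```

Unlike crossing(), this keeps duplicate rows of dt if there are any, so the two approaches can differ on data with repeated rows.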