R reshape2 dcast: transform data
How can I transform the data X into Y, as in the following example?
X = data.frame(
ID = c(1,1,1,2,2),
NAME = c("MIKE","MIKE","MIKE","LUCY","LUCY"),
SEX = c("MALE","MALE","MALE","FEMALE","FEMALE"),
TEST = c(1,2,3,1,2),
SCORE = c(70,80,90,65,75)
)
Y = data.frame(
ID = c(1,2),
NAME = c("MIKE","LUCY"),
SEX = c("MALE","FEMALE"),
TEST_1 =c(70,65),
TEST_2 =c(80,75),
TEST_3 =c(90,NA)
)
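The dcast function in reshape2 seems to work, but it cannot keep the other columns of the data, such as ID, NAME and SEX in the example above.
Assuming all other columns are consistent for a given ID (for example, Mike can only be the male with ID 1), how can I do this?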
According to the documentation (?reshape2::dcast), dcast() allows ... in the formula:
"..." represents all other variables not used in the formula ...
This holds for both the reshape2 and the data.table packages, which both support dcast().
So, you can write:
reshape2::dcast(X, ... ~ TEST, value.var = "SCORE")
# ID NAME SEX 1 2 3
#1 1 MIKE MALE 70 80 90
#2 2 LUCY FEMALE 65 75 NA
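To make the role of ... concrete, here is a minimal sketch that compares the ... formula with one that spells out the remaining columns by name. The object names wide_dots and wide_explicit are illustrative only and do not come from the question.

```r
library(reshape2)

# Same sample data as in the question
X <- data.frame(
  ID = c(1, 1, 1, 2, 2),
  NAME = c("MIKE", "MIKE", "MIKE", "LUCY", "LUCY"),
  SEX = c("MALE", "MALE", "MALE", "FEMALE", "FEMALE"),
  TEST = c(1, 2, 3, 1, 2),
  SCORE = c(70, 80, 90, 65, 75)
)

# "..." stands for every column that is neither named in the formula
# nor used as value.var, here ID, NAME and SEX
wide_dots <- dcast(X, ... ~ TEST, value.var = "SCORE")

# The same reshape with the id columns written out explicitly
wide_explicit <- dcast(X, ID + NAME + SEX ~ TEST, value.var = "SCORE")

# Compare the two results
all.equal(wide_dots, wide_explicit)
```

The explicit form is more verbose, but it does not silently pick up new id columns if further columns are added to X later.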
However, if the OP insists that the column names should be TEST_1, TEST_2, etc., the TEST column needs to be modified before reshaping. Here, data.table is used:
library(data.table)
dcast(setDT(X)[, TEST := paste0("TEST_", TEST)], ... ~ TEST, value.var = "SCORE")
# ID NAME SEX TEST_1 TEST_2 TEST_3
#1: 1 MIKE MALE 70 80 90
#2: 2 LUCY FEMALE 65 75 NA
This is in line with the expected answer given as data.frame Y.
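Note that setDT() converts X to a data.table by reference and := then overwrites its TEST column in place, so X itself is changed by the call above. If that is not wanted, one possible alternative is to relabel TEST on a copy and reshape with reshape2 alone. This is only a sketch of that idea, not the approach used in the answer; X2 and Y2 are illustrative names.

```r
library(reshape2)

# Same sample data as in the question
X <- data.frame(
  ID = c(1, 1, 1, 2, 2),
  NAME = c("MIKE", "MIKE", "MIKE", "LUCY", "LUCY"),
  SEX = c("MALE", "MALE", "MALE", "FEMALE", "FEMALE"),
  TEST = c(1, 2, 3, 1, 2),
  SCORE = c(70, 80, 90, 65, 75)
)

# transform() returns a modified copy, so X keeps its numeric TEST column
X2 <- transform(X, TEST = paste0("TEST_", TEST))

# Reshape the copy; the new column names come from the relabelled TEST values
Y2 <- dcast(X2, ... ~ TEST, value.var = "SCORE")
Y2
```

The result is a plain data.frame rather than a data.table, which may be preferable when the rest of the code does not use data.table.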