Reorganizing a data frame with multiple header types following the "tidy" approach in R
I have a data frame that looks like this:
Age  A1U_sweet  A2F_dip  A3U_bbq  C1U_sweet  C2F_dip  C3U_bbq  Comments
 23          1        2        1         NA       NA       NA  Good
 54         NA       NA       NA          4        1        2  ABCD
 43          2        4        7         NA       NA       NA  HiHi
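For reproducibility, the table above could be built roughly as follows. This is only a sketch: the column types (numeric scores, character Comments) are assumed from the printed values, not stated anywhere in the post.

```r
# Rebuild the example data frame shown above.
# Assumption: score columns are numeric and Comments is character.
df <- data.frame(
  Age       = c(23, 54, 43),
  A1U_sweet = c(1, NA, 2),
  A2F_dip   = c(2, NA, 4),
  A3U_bbq   = c(1, NA, 7),
  C1U_sweet = c(NA, 4, NA),
  C2F_dip   = c(NA, 1, NA),
  C3U_bbq   = c(NA, 2, NA),
  Comments  = c("Good", "ABCD", "HiHi"),
  stringsAsFactors = FALSE
)
```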
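I am trying to reorganize it as shown below to make it more "tidy". Is there a way to do this that also incorporates the Age and Comments columns in the same style as the other variables shown below? How would you suggest incorporating them? One idea is shown below, but I am open to other suggestions. How would I modify the following code to account for the multiple different styles of column names?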
library(tidyr)
df <- data.frame(id = 1:nrow(df), df)
dfl <- gather(df, key = "key", value = "value", -id)
dfl <- separate(dfl, key, into = c("key", "kind", "type"), sep = c(1, 4))
df2 <- spread(dfl, key, value)
df2
##    id kind  type    A    C
## 1   1  Age   Age   23   23
## 2   1  1U_ sweet    1   NA
## 3   1  2F_   dip    2   NA
## 4   1  3U_   bbq    1   NA
## 5   1  Com   Com Good Good
## 6   2  Age   Age   54   54
## 7   2  1U_ sweet   NA    4
## 8   2  2F_   dip   NA    1
## 9   2  3U_   bbq   NA    2
## 10  2  Com   Com ABCD ABCD
## 11  3  Age   Age   43   43
## 12  3  1U_ sweet    2   NA
## 13  3  2F_   dip    4   NA
## 14  3  3U_   bbq    7   NA
## 15  3  Com   Com HiHi HiHi
How can I modify the following code to return the data to its original form?
df <- gather(df2, key = "key", value = "value", A, B, C)
df <- unite(df, "key", key, kind, type, sep = "")
df <- spread(df, key, value)
For context, this question was prompted by Ista's comment under this question:
Since Age and Comments are presumably measured at the level of whatever a row represents in the original data, just carry them along:
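# Add a row identifier so each original row can be tracked
df <- data.frame(id = 1:nrow(df), df)
# Reshape to long form, leaving id, Age and Comments as ordinary columns
dfl <- gather(df, key = "key", value = "value", -id, -Age, -Comments)
# Split each column name by position: 1 character, then 3 characters, then the rest
dfl <- separate(dfl, key, into = c("key", "kind", "type"), sep = c(1, 4))
# One column per prefix letter (A, C)
df2 <- spread(dfl, key, value)
df2
# B takes the value of A where A is present, otherwise C
df2 <- transform(df2, B = ifelse(is.na(A), C, A))
df2
# Reverse the steps to get back to a wide layout (now including the B columns)
df <- gather(df2, key = "key", value = "value", A, B, C)
df <- unite(df, "key", key, kind, type, sep = "")
df <- spread(df, key, value)
df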
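The same reshaping can also be sketched with pivot_longer() and pivot_wider(), the functions that supersede gather() and spread() in tidyr 1.0.0 and later. This is an alternative, not the approach used in the answer above. It assumes every score column is named as one capital letter, one digit, one capital letter, an underscore, and then a label; the regular expression and the object names long and tidy are illustrative. Unlike separate() with sep = c(1, 4), the pattern drops the underscore from kind.

```r
library(tidyr)

# Example data as shown in the question (column types are assumed).
df <- data.frame(
  Age       = c(23, 54, 43),
  A1U_sweet = c(1, NA, 2),
  A2F_dip   = c(2, NA, 4),
  A3U_bbq   = c(1, NA, 7),
  C1U_sweet = c(NA, 4, NA),
  C2F_dip   = c(NA, 1, NA),
  C3U_bbq   = c(NA, 2, NA),
  Comments  = c("Good", "ABCD", "HiHi"),
  stringsAsFactors = FALSE
)
df$id <- seq_len(nrow(df))

# Go long, splitting each column name into three parts with a regex
# instead of fixed character positions.
long <- pivot_longer(
  df,
  cols = -c(id, Age, Comments),
  names_to = c("key", "kind", "type"),
  names_pattern = "^([A-Z])([0-9][A-Z])_(.+)$",
  values_to = "value"
)

# One column per prefix letter (A, C); Age and Comments ride along per row.
tidy <- pivot_wider(long, names_from = key, values_from = value)

# Combined column: A where present, otherwise C.
tidy$B <- ifelse(is.na(tidy$A), tidy$C, tidy$A)
```

Matching on a pattern rather than on positions means the split does not depend on every prefix having the same width, at the cost of having to write a pattern that fits all the column names.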