Reshape from Long to Wide Format by Multiple Factors
Referring to this question:
How to reshape data from long to wide format
set.seed(45)
dat1 <- data.frame(
name = rep(c("firstName", "secondName"), each=4),
timeperiod = c("Q1","Q2","Q3","Q4"),
height = c(2,9,1,2,11,15,16,10),
weight=c(1,4,2,8,2,9,1,2)
)
dat1
        name timeperiod height weight
1  firstName         Q1      2      1
2  firstName         Q2      9      4
3  firstName         Q3      1      2
4  firstName         Q4      2      8
5 secondName         Q1     11      2
6 secondName         Q2     15      9
7 secondName         Q3     16      1
8 secondName         Q4     10      2
Suppose I have the data frame above (the code to generate it is included).
I would like to restructure it into this shape:
name       Variable  Q1  Q2  Q3  Q4
firstName  height     2   9   1   2
firstName  weight     1   4   2   8
secondName height    11  15  16  10
secondName weight     2   9   1   2
I am looking for a solution that uses base R rather than the tidyverse. I tried to do this with the reshape function, but I am open to other base R functions.
We may need to reshape to 'long' before converting to 'wide':
library(dplyr)
library(tidyr)
dat1 %>%
pivot_longer(cols = c(height, weight), names_to = 'Variable') %>%
pivot_wider(names_from = "timeperiod", values_from = "value")
Output:
# A tibble: 4 x 6
  name       Variable    Q1    Q2    Q3    Q4
  <chr>      <chr>    <dbl> <dbl> <dbl> <dbl>
1 firstName  height       2     9     1     2
2 firstName  weight       1     4     2     8
3 secondName height      11    15    16    10
4 secondName weight       2     9     1     2
Or in base R, using reshape:
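# Rename so reshape() can split each name at "_": the part before it ("1")
# names the value column, the part after it fills the 'time' column.
names(dat1)[3:4] <- c("1_height", "1_weight")
# Inner call: stack height/weight into long format; [-5] drops the added id column.
# Outer call: spread timeperiod into one column per quarter.
reshape(reshape(dat1, direction = 'long', varying = 3:4,
                sep = "_")[-5], direction = "wide",
        idvar = c("name", "time"), timevar = "timeperiod")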
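The two-step reshape above relies on renaming the columns so that reshape can guess the split. A minimal sketch of the same idea with every argument spelled out, which leaves dat1 untouched, might look like the following. The names long, wide, value and Variable are chosen here for illustration and are not from the original answer.

```r
# Same example data as in the question.
dat1 <- data.frame(
  name = rep(c("firstName", "secondName"), each = 4),
  timeperiod = rep(c("Q1", "Q2", "Q3", "Q4"), times = 2),
  height = c(2, 9, 1, 2, 11, 15, 16, 10),
  weight = c(1, 4, 2, 8, 2, 9, 1, 2)
)

# Step 1: stack height and weight into a single 'value' column.
# 'Variable' records which measurement each row came from.
long <- reshape(
  dat1,
  direction = "long",
  varying = c("height", "weight"),
  v.names = "value",
  timevar = "Variable",
  times = c("height", "weight"),
  idvar = c("name", "timeperiod")
)

# Step 2: spread timeperiod into one column per quarter.
wide <- reshape(
  long,
  direction = "wide",
  idvar = c("name", "Variable"),
  timevar = "timeperiod",
  v.names = "value"
)

wide
```

With these arguments the quarter columns come out prefixed with the value column's name (value.Q1 to value.Q4), so a rename step is still needed to match the requested layout exactly.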
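Here is another tidyr example (gather() and spread() are superseded by pivot_longer() and pivot_wider(), but still work):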
dat1 %>%
tidyr::gather(variable, Amount, - name ,-timeperiod) %>%
tidyr::spread(timeperiod, Amount, fill = 0)
Base R:
One way could be:
reshape(cbind(dat1[1:2], stack(dat1, 3:4)), timevar = 'timeperiod',
dir = 'wide', idvar = c('name', 'ind'))
         name    ind values.Q1 values.Q2 values.Q3 values.Q4
1   firstName height         2         9         1         2
5  secondName height        11        15        16        10
9   firstName weight         1         4         2         8
13 secondName weight         2         9         1         2
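The result above has the right values but differs from the requested layout in its column names, row order and row names. A small clean-up sketch in base R, assuming the same dat1 as in the question (the names stacked and wide are illustrative), could be:

```r
# Same example data as in the question.
dat1 <- data.frame(
  name = rep(c("firstName", "secondName"), each = 4),
  timeperiod = rep(c("Q1", "Q2", "Q3", "Q4"), times = 2),
  height = c(2, 9, 1, 2, 11, 15, 16, 10),
  weight = c(1, 4, 2, 8, 2, 9, 1, 2)
)

# stack() puts height and weight into 'values' and labels them in 'ind';
# cbind() recycles the two id columns to match the stacked rows.
stacked <- cbind(dat1[1:2], stack(dat1, 3:4))

wide <- reshape(stacked, timevar = "timeperiod",
                direction = "wide", idvar = c("name", "ind"))

# Drop the "values." prefix and rename 'ind' to match the requested header.
names(wide) <- sub("^values\\.", "", names(wide))
names(wide)[names(wide) == "ind"] <- "Variable"
wide$Variable <- as.character(wide$Variable)

# Sort by name, then variable, and reset the row names.
wide <- wide[order(wide$name, wide$Variable), ]
rownames(wide) <- NULL

wide
```

Converting Variable from a factor to character is optional; it is done here only so that the column type matches the character column in the tidyr output.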
If you are open to other packages, consider the recast function from the reshape2 package:
reshape2::recast(dat1, name+variable~timeperiod, id.var = c('name', 'timeperiod'))
        name variable Q1 Q2 Q3 Q4
1  firstName   height  2  9  1  2
2  firstName   weight  1  4  2  8
3 secondName   height 11 15 16 10
4 secondName   weight  2  9  1  2