pivot_wider() in tidyr without losing columns that are not spread
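I know I'm missing something obvious here, but I'm not sure how to use pivot_wider to spread long-format columns wider without losing some important columns that I do not want to spread.
Toy data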
library(tidyr)  # pivot_wider()
library(dplyr)  # tibble() and %>%
df <- tibble(id = factor(rep(1:2,
                             each = 3)),
             gender = factor(rep(c("male", "female"),
                                 each = 3)),
             age = rep(c(45, 32),
                       each = 3),
             time = factor(rep(paste0("week", 1:3),
                               times = 2)),
             out1 = rnorm(6),
             out2 = factor(sample(letters[1:3],
                                  size = 6,
                                  replace = TRUE)))
df
# output
# A tibble: 6 x 6
id gender age time out1 out2
<fct> <fct> <dbl> <fct> <dbl> <fct>
1 1 male 45 week1 -1.23 c
2 1 male 45 week2 -0.913 c
3 1 male 45 week3 -0.267 b
4 2 female 32 week1 -0.0944 b
5 2 female 32 week2 -0.147 b
6 2 female 32 week3 -0.513 c
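So we have two time-varying columns that we want to spread, out1 and out2, and two time-invariant columns (i.e. the same value at every time point), gender and age, that I don't want to spread but do want to keep in the wider dataset. For spreading out1 and out2, the following works well: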
df %>%
  pivot_wider(id_cols = id,
              names_from = time,
              values_from = c(out1, out2))
# output
# A tibble: 2 x 7
id out1_week1 out1_week2 out1_week3 out2_week1 out2_week2 out2_week3
<fct> <dbl> <dbl> <dbl> <fct> <fct> <fct>
1 1 0.839 1.02 1.08 a a a
2 2 0.420 -0.0687 -2.00 b a c
The spread of out1 and out2 over time has worked, but I have lost the time-invariant variables gender and age. How do I retain these?
Any help appreciated.
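Pass the time-invariant columns to id_cols as well (here the range id:age selects id, gender and age):
df %>%
  pivot_wider(id_cols = id:age,
              names_from = time,
              values_from = c(out1, out2))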
Result
# A tibble: 2 × 9
id gender age out1_week1 out1_week2 out1_week3 out2_week1 out2_week2 out2_week3
<fct> <fct> <dbl> <dbl> <dbl> <dbl> <fct> <fct> <fct>
1 1 male 45 -0.476 -1.46 -0.822 a c c
2 2 female 32 -0.565 0.769 -1.04 c b c
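As a possible variation on the answer above, here is a minimal sketch that relies on the documented default of id_cols, which is every column not named in names_from or values_from. The set.seed() call and the names n_combos, wide_default and wide_explicit are assumptions added for illustration only; they are not part of the original question or answer.

```r
library(tidyr)
library(dplyr)

set.seed(1)  # assumption: added only so this sketch is reproducible

df <- tibble(
  id     = factor(rep(1:2, each = 3)),
  gender = factor(rep(c("male", "female"), each = 3)),
  age    = rep(c(45, 32), each = 3),
  time   = factor(rep(paste0("week", 1:3), times = 2)),
  out1   = rnorm(6),
  out2   = factor(sample(letters[1:3], size = 6, replace = TRUE))
)

# Sanity check: each id should map to exactly one (gender, age) pair.
# If a supposedly time-invariant column varied within an id, using it as
# an identifier would yield more than one row per id in the wide data.
n_combos <- df %>%
  distinct(id, gender, age) %>%
  count(id)
stopifnot(all(n_combos$n == 1))

# id_cols left at its default: id, gender and age are all kept because
# they are not used in names_from or values_from.
wide_default <- df %>%
  pivot_wider(names_from = time,
              values_from = c(out1, out2))

# The same identifiers spelled out explicitly.
wide_explicit <- df %>%
  pivot_wider(id_cols = c(id, gender, age),
              names_from = time,
              values_from = c(out1, out2))

names(wide_default)
```

Leaving id_cols out is the shortest form, but it keeps every remaining column as an identifier, so naming the columns explicitly may be the safer choice when the data also contains columns you do not want in the wide result.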