Reshaping data with multiple columns
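How can I reshape data like this? This is just a small excerpt of my data; the real dataset is much longer, so I would greatly appreciate an automated solution that works for any length.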
data <- data.frame(id = c(1, 2, 3),
                   volume_1 = c(0.33, 0.58, 0.2),
                   name_1 = c("a", "b", "c"),
                   volume_2 = c(0.3, 0.4, 0.5),
                   name_2 = c("x", "y", "z")
)
data
  id volume_1 name_1 volume_2 name_2
1  1     0.33      a      0.3      x
2  2     0.58      b      0.4      y
3  3     0.20      c      0.5      z
Into this:
foo <- data.frame(id = c(1, 2, 3),
                  a = c(0.33, 0, 0),
                  b = c(0, 0.58, 0),
                  c = c(0, 0, 0.2),
                  x = c(0.3, 0, 0),
                  y = c(0, 0.4, 0),
                  z = c(0, 0, 0.5)
)
foo
  id    a    b   c   x   y   z
1  1 0.33 0.00 0.0 0.3 0.0 0.0
2  2 0.00 0.58 0.0 0.0 0.4 0.0
3  3 0.00 0.00 0.2 0.0 0.0 0.5
I know about pivot_longer() and pivot_wider(), as well as the reshape package, but I'm not sure how to do this for a dataset of arbitrary length.
Try this:
library(dplyr)
library(tidyr)
#Code
new <- data %>%
  pivot_wider(names_from = starts_with('name'), values_from = starts_with('volume'),
              values_fill = 0)
Output:
# A tibble: 3 x 7
     id volume_1_a_x volume_1_b_y volume_1_c_z volume_2_a_x volume_2_b_y volume_2_c_z
  <dbl>        <dbl>        <dbl>        <dbl>        <dbl>        <dbl>        <dbl>
1     1         0.33        0              0            0.3          0            0
2     2         0           0.580          0            0            0.4          0
3     3         0           0              0.2          0            0            0.5
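The single pivot_wider() call above combines both name columns into each new column name (volume_1_a_x and so on), which is not quite the a, b, c, x, y, z layout asked for. One possible way to get that layout is to go long first and then wide. This is only a sketch, assuming every column other than id is named <variable>_<index> with exactly one underscore; the intermediate column name "set" and the result name "wide" are made up for illustration.

```r
library(dplyr)
library(tidyr)

data <- data.frame(id = c(1, 2, 3),
                   volume_1 = c(0.33, 0.58, 0.2),
                   name_1 = c("a", "b", "c"),
                   volume_2 = c(0.3, 0.4, 0.5),
                   name_2 = c("x", "y", "z"))

wide <- data %>%
  # Split "volume_1" into ".value" = "volume" and set = "1", so each
  # (volume_k, name_k) pair becomes one row with columns volume and name.
  pivot_longer(cols = -id,
               names_to = c(".value", "set"),
               names_sep = "_") %>%
  # The pair index is no longer needed once the pairs are stacked.
  select(-set) %>%
  # Spread the names into columns; combinations that never occur become 0.
  pivot_wider(names_from = name,
              values_from = volume,
              values_fill = 0,
              names_sort = TRUE)

wide
```

Because the columns are selected with -id and split by pattern, the same call should carry over to any number of volume_k / name_k pairs. names_sort = TRUE orders the new columns alphabetically; without it they appear in order of first occurrence.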
What you want is a double reshape, using gsub to clean up the names and setting the NAs to zero.
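# Inner reshape stacks the (volume, name) pairs long; outer reshape spreads them wide by name
r <- reshape(reshape(data, idvar="id", varying=list(c(2, 4), c(3, 5)), direction="long"),
             idvar="id", timevar="name_1", drop="time", direction="wide")
names(r) <- gsub("^volume_1\\.", "", names(r))  # strip the literal "volume_1." prefix
r[is.na(r)] <- 0                                # combinations that do not occur become 0
r
#     id    a    b   c   x   y   z
# 1.1  1 0.33 0.00 0.0 0.3 0.0 0.0
# 2.1  2 0.00 0.58 0.0 0.0 0.4 0.0
# 3.1  3 0.00 0.00 0.2 0.0 0.0 0.5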
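Since the question asks for something that works for any length, the hard-coded positions c(2, 4) and c(3, 5) could be looked up by name instead. The following is a rough sketch rather than a tested general solution: it assumes the columns are named volume_k and name_k and that both groups appear in the same order; vol_cols, name_cols and long are illustrative names.

```r
data <- data.frame(id = c(1, 2, 3),
                   volume_1 = c(0.33, 0.58, 0.2),
                   name_1 = c("a", "b", "c"),
                   volume_2 = c(0.3, 0.4, 0.5),
                   name_2 = c("x", "y", "z"),
                   stringsAsFactors = FALSE)

# Find the column positions by name instead of hard-coding them.
vol_cols  <- grep("^volume_", names(data))
name_cols <- grep("^name_", names(data))

# Stack all (volume, name) pairs into long format.
long <- reshape(data, idvar = "id",
                varying = list(vol_cols, name_cols),
                v.names = c("volume", "name"),
                direction = "long")

# Spread back to wide, one column per distinct name.
r <- reshape(long, idvar = "id", timevar = "name",
             drop = "time", direction = "wide")

names(r) <- sub("^volume\\.", "", names(r))  # "volume.a" -> "a"
r[is.na(r)] <- 0                             # missing combinations become 0
r
```

Setting v.names explicitly gives the stacked columns the neutral names volume and name, so the prefix removed afterwards no longer depends on which column happened to come first.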