How to reshape a data frame with multiple levels
I currently have a data frame (df1) in the following format:
ID | F1_1 | F2_1r1 | F2_1r2 | F2_1r3 | F1_2 | F2_2r1 | F2_2r2 | F2_2r3 | F1_3 | F2_3r1 | F2_3r2 | F2_3r3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 10 | 1 | 1 | 0 | 15 | 0 | 1 | 0 | 30 | 1 | 0 | 0 |
2 | 25 | 1 | 0 | 0 | 30 | 0 | 1 | 1 | 25 | 1 | 0 | 1 |
3 | 40 | 0 | 1 | 0 | 15 | 0 | 1 | 0 | 10 | 0 | 0 | 1 |
4 | 25 | 1 | 1 | 0 | 10 | 0 | 1 | 1 | 30 | 1 | 0 | 0 |
I would like to reshape it so that df2 is laid out as follows:
ID | F1_value | R1 | R2 | R3 | F1_x |
---|---|---|---|---|---|
1 | 10 | 1 | 1 | 0 | 1 |
1 | 15 | 0 | 1 | 0 | 2 |
1 | 30 | 1 | 0 | 0 | 3 |
2 | 25 | 1 | 0 | 0 | 1 |
2 | 30 | 0 | 1 | 1 | 2 |
2 | 25 | 1 | 0 | 1 | 3 |
3 | 40 | 0 | 1 | 0 | 1 |
3 | 15 | 0 | 1 | 0 | 2 |
3 | 10 | 0 | 0 | 1 | 3 |
4 | 25 | 1 | 1 | 0 | 1 |
4 | 10 | 0 | 1 | 1 | 2 |
4 | 30 | 1 | 0 | 0 | 3 |
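You can use pivot_longer(), but it is easier if you first rename the variables as follows (x below holds the same values as df1 under the new column names):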
    library(dplyr)
    library(tidyr)

    # Same values as df1, with columns named <letter>1 and <letter>1.<r>
    x <- data.frame(
      ID = 1:4,
      A1 = c(10, 25, 40, 25),
      A1.1 = c(1, 1, 0, 1),
      A1.2 = c(1, 0, 1, 1),
      A1.3 = c(0, 0, 0, 0),
      B1 = c(15, 30, 15, 10),
      B1.1 = c(0, 0, 0, 0),
      B1.2 = c(1, 1, 1, 1),
      B1.3 = c(0, 1, 0, 1),
      C1 = c(30, 25, 10, 30),
      C1.1 = c(1, 1, 0, 1),
      C1.2 = c(0, 0, 0, 0),
      C1.3 = c(0, 1, 1, 0)
    )

    x %>%
      # Give the value columns a ".0" suffix so every name has the same shape
      rename("A1.0" = "A1",
             "B1.0" = "B1",
             "C1.0" = "C1") %>%
      # First capture group: the letter (A-C); second: the suffix, used as the output column name
      pivot_longer(`A1.0`:`C1.3`,
                   names_pattern = "([A-C])\\d\\.(\\d)",
                   names_to = c("A_C", ".value")) %>%
      rename("A1_C1_value" = "0",
             "R1" = "1",
             "R2" = "2",
             "R3" = "3")
    # # A tibble: 12 × 6
    #       ID A_C   A1_C1_value    R1    R2    R3
    #    <int> <chr>       <dbl> <dbl> <dbl> <dbl>
    #  1     1 A              10     1     1     0
    #  2     1 B              15     0     1     0
    #  3     1 C              30     1     0     0
    #  4     2 A              25     1     0     0
    #  5     2 B              30     0     1     1
    #  6     2 C              25     1     0     1
    #  7     3 A              40     0     1     0
    #  8     3 B              15     0     1     0
    #  9     3 C              10     0     0     1
    # 10     4 A              25     1     1     0
    # 11     4 B              10     0     1     1
    # 12     4 C              30     1     0     0
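The same idea can be applied to df1's original column names without renaming each column by hand. The following is a minimal sketch, not part of the answer above: it assumes dplyr 1.0.0 or later (for rename_with()), rebuilds df1 from the table in the question, and uses intermediate names of the form `<group>_<target>` that are chosen here only for illustration.

```r
library(dplyr)
library(tidyr)

# df1 rebuilt from the table in the question
df1 <- data.frame(
  ID = 1:4,
  F1_1 = c(10, 25, 40, 25),
  F2_1r1 = c(1, 1, 0, 1), F2_1r2 = c(1, 0, 1, 1), F2_1r3 = c(0, 0, 0, 0),
  F1_2 = c(15, 30, 15, 10),
  F2_2r1 = c(0, 0, 0, 0), F2_2r2 = c(1, 1, 1, 1), F2_2r3 = c(0, 1, 0, 1),
  F1_3 = c(30, 25, 10, 30),
  F2_3r1 = c(1, 1, 0, 1), F2_3r2 = c(0, 0, 0, 0), F2_3r3 = c(0, 1, 1, 0)
)

df2 <- df1 %>%
  # F1_2 -> "2_F1_value"; names that do not match (such as ID) are left unchanged
  rename_with(~ sub("^F1_(\\d+)$", "\\1_F1_value", .x)) %>%
  # F2_2r3 -> "2_R3"
  rename_with(~ sub("^F2_(\\d+)r(\\d+)$", "\\1_R\\2", .x)) %>%
  # Leading digits become F1_x; the remainder names the output column
  pivot_longer(
    cols = -ID,
    names_to = c("F1_x", ".value"),
    names_pattern = "^(\\d+)_(.+)$"
  ) %>%
  # Put the columns in the order requested in the question
  select(ID, F1_value, R1, R2, R3, F1_x)

df2
```

In this sketch F1_x comes out as a character column; wrap it in as.integer() if a numeric group index is needed.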
You can do this very efficiently with data.table:
    library(data.table)
    # Each pattern selects one group of columns; each group becomes one value column
    df1 <- data.table(df1)
    df2 <- melt(df1, measure.vars = patterns("^F1", "r1$", "r2$", "r3$"),
                value.name = c("F1_value", "R1", "R2", "R3"), variable.name = "F1_x")
This produces:
        ID F1_x F1_value R1 R2 R3
     1:  1    1       10  1  1  0
     2:  2    1       25  1  0  0
     3:  3    1       40  0  1  0
     4:  4    1       25  1  1  0
     5:  1    2       15  0  1  0
     6:  2    2       30  0  1  1
     7:  3    2       15  0  1  0
     8:  4    2       10  0  1  1
     9:  1    3       30  1  0  0
    10:  2    3       25  1  0  1
    11:  3    3       10  0  0  1
    12:  4    3       30  1  0  0
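The melted result is grouped by F1_x rather than by ID, and F1_x is not the last column, so it differs in layout from the df2 shown in the question. The sketch below is a self-contained variant under stated assumptions: df1 is rebuilt from the question's table, and the final sorting and column reordering steps are an addition here, not part of the answer above.

```r
library(data.table)

# df1 rebuilt from the table in the question
df1 <- data.table(
  ID = 1:4,
  F1_1 = c(10, 25, 40, 25),
  F2_1r1 = c(1, 1, 0, 1), F2_1r2 = c(1, 0, 1, 1), F2_1r3 = c(0, 0, 0, 0),
  F1_2 = c(15, 30, 15, 10),
  F2_2r1 = c(0, 0, 0, 0), F2_2r2 = c(1, 1, 1, 1), F2_2r3 = c(0, 1, 0, 1),
  F1_3 = c(30, 25, 10, 30),
  F2_3r1 = c(1, 1, 0, 1), F2_3r2 = c(0, 0, 0, 0), F2_3r3 = c(0, 1, 1, 0)
)

# Melt the four column groups at once, one value column per pattern
df2 <- melt(
  df1,
  id.vars = "ID",
  measure.vars = patterns("^F1", "r1$", "r2$", "r3$"),
  value.name = c("F1_value", "R1", "R2", "R3"),
  variable.name = "F1_x"
)

# Sort by ID, then by group, and move F1_x to the end (both modify df2 by reference)
setorder(df2, ID, F1_x)
setcolorder(df2, c("ID", "F1_value", "R1", "R2", "R3", "F1_x"))

df2
```

Here F1_x is a factor with levels 1, 2 and 3, one per matched column within each pattern.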