How to make a complicated multi-column bind of my two tables?
I have a data frame whose type column lists all possible types:
comment type
used enter
used open
used close
used update
not_used delete
I get another data frame from my database, but not all of the types are present in it. Here is an example of that table:
ID date type value
a1 2020-09-01 enter 18
a1 2020-09-01 close 15
a1 2020-09-02 enter 4
a2 2020-09-01 close 10
b1 2020-09-02 update 10
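As you can see, ID a1 has only two types: enter and close. a2 has only close, and b1 has only update.
I want to bind these two tables so that any type missing from my table gets a value of zero for each ID and date. How can I bind the two tables to get this: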
comment ID date type value
used a1 2020-09-01 enter 18
used a1 2020-09-01 open 0
used a1 2020-09-01 close 15
used a1 2020-09-01 update 0
not_used a1 2020-09-01 delete 0
used a1 2020-09-02 enter 4
used a1 2020-09-02 open 0
used a1 2020-09-02 close 0
used a1 2020-09-02 update 0
not_used a1 2020-09-02 delete 0
used a2 2020-09-01 enter 0
used a2 2020-09-01 open 0
used a2 2020-09-01 close 10
used a2 2020-09-01 update 0
not_used a2 2020-09-01 delete 0
used b1 2020-09-02 enter 0
used b1 2020-09-02 open 0
used b1 2020-09-02 close 0
used b1 2020-09-02 update 10
not_used b1 2020-09-02 delete 0
As you can see, I also keep the comment column. How can I do that?
This does not keep the comment column:
library(dplyr)
library(tidyr)
df2 %>%
  mutate(type = factor(type, levels = df1$type)) %>%
  group_by(ID, date) %>%
  complete(type, fill = list(value = 0))
Try this:
library(tidyverse)
# Complete the missing types per ID and date, then join the comment column back from df1
new <- df2 %>%
  mutate(type = factor(type, levels = df1$type)) %>%
  group_by(ID, date) %>%
  complete(type, fill = list(value = 0)) %>%
  left_join(df1, by = "type")
Output:
# A tibble: 20 x 5
# Groups: ID, date [4]
ID date type value comment
<chr> <chr> <chr> <dbl> <chr>
1 a1 2020-09-01 enter 18 used
2 a1 2020-09-01 open 0 used
3 a1 2020-09-01 close 15 used
4 a1 2020-09-01 update 0 used
5 a1 2020-09-01 delete 0 not_used
6 a1 2020-09-02 enter 4 used
7 a1 2020-09-02 open 0 used
8 a1 2020-09-02 close 0 used
9 a1 2020-09-02 update 0 used
10 a1 2020-09-02 delete 0 not_used
11 a2 2020-09-01 enter 0 used
12 a2 2020-09-01 open 0 used
13 a2 2020-09-01 close 10 used
14 a2 2020-09-01 update 0 used
15 a2 2020-09-01 delete 0 not_used
16 b1 2020-09-02 enter 0 used
17 b1 2020-09-02 open 0 used
18 b1 2020-09-02 close 0 used
19 b1 2020-09-02 update 10 used
20 b1 2020-09-02 delete 0 not_used
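To reproduce the output above, here is a minimal sketch that builds the two tables by hand from the data in the question. It assumes they are named df1 (the type lookup) and df2 (the database extract), as in the code above. The ungroup() and as.character() steps are additions of mine, not part of the original answer: they drop the grouping and make both type columns character before the join, so the result does not depend on how a given dplyr version handles a factor/character join.

```r
library(dplyr)
library(tidyr)

# Lookup table with every possible type (copied from the question)
df1 <- data.frame(
  comment = c("used", "used", "used", "used", "not_used"),
  type = c("enter", "open", "close", "update", "delete"),
  stringsAsFactors = FALSE
)

# Data as it comes from the database (copied from the question)
df2 <- data.frame(
  ID = c("a1", "a1", "a1", "a2", "b1"),
  date = c("2020-09-01", "2020-09-01", "2020-09-02", "2020-09-01", "2020-09-02"),
  type = c("enter", "close", "enter", "close", "update"),
  value = c(18, 15, 4, 10, 10),
  stringsAsFactors = FALSE
)

new <- df2 %>%
  # A factor with all levels lets complete() add the types that are missing
  mutate(type = factor(type, levels = df1$type)) %>%
  group_by(ID, date) %>%
  complete(type, fill = list(value = 0)) %>%
  ungroup() %>%
  # Back to character so the join key has the same class in both tables
  mutate(type = as.character(type)) %>%
  left_join(df1, by = "type")

print(new)
```

Each of the four ID/date groups is expanded to the five types in df1, which gives the 20 rows shown above.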
I think a simple solution is to do a full_join afterwards:
df2 %>%
  mutate(type = factor(type, levels = df1$type)) %>%
  group_by(ID, date) %>%
  complete(type, fill = list(value = 0)) %>%
  full_join(df1, by = "type")
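As a possible variation on the answers above, the factor conversion and grouping can be avoided by passing the full set of types straight to complete() and wrapping the ID/date pairs in nesting(). This is only a sketch under the same assumptions (tables named df1 and df2, rebuilt here from the question's data). The arrange() and select() steps are my additions to restore the type order of df1 and to put comment first, as in the desired output.

```r
library(dplyr)
library(tidyr)

# Same tables as in the question, rebuilt here so the example runs on its own
df1 <- data.frame(
  comment = c("used", "used", "used", "used", "not_used"),
  type = c("enter", "open", "close", "update", "delete"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  ID = c("a1", "a1", "a1", "a2", "b1"),
  date = c("2020-09-01", "2020-09-01", "2020-09-02", "2020-09-01", "2020-09-02"),
  type = c("enter", "close", "enter", "close", "update"),
  value = c(18, 15, 4, 10, 10),
  stringsAsFactors = FALSE
)

result <- df2 %>%
  # nesting() keeps only the ID/date pairs that occur in df2;
  # type = df1$type supplies every possible type, present in df2 or not
  complete(nesting(ID, date), type = df1$type, fill = list(value = 0)) %>%
  # Bring the comment column back from the lookup table
  left_join(df1, by = "type") %>%
  # Order the types within each ID/date as they appear in df1
  arrange(ID, date, match(type, df1$type)) %>%
  select(comment, ID, date, type, value)

print(result)
```

Because complete() already adds a row for every type in df1, left_join and full_join should return the same rows here. A full_join would only differ if df1 contained a type that the completed table lacked.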