How to make a complicated binding of my two tables?
I have a single-column data frame whose column type holds all possible "types":
type
enter
open
close
update
delete
I get a data frame from my database, but not every "type" is present in it. Here is an example of that table:
ID date type value
a1 2020-09-01 enter 18
a1 2020-09-01 close 15
a1 2020-09-02 enter 4
a2 2020-09-01 close 10
b1 2020-09-02 update 10
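As you can see, ID a1 has only two types: enter and close. a2 has only close, and b1 has only update.
I want to bind these two tables so that any "type" missing from my table gets a value of zero for each ID and date. So how can I bind these two tables to get this: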
ID date type value
a1 2020-09-01 enter 18
a1 2020-09-01 open 0
a1 2020-09-01 close 15
a1 2020-09-01 update 0
a1 2020-09-01 delete 0
a1 2020-09-02 enter 4
a1 2020-09-02 open 0
a1 2020-09-02 close 0
a1 2020-09-02 update 0
a1 2020-09-02 delete 0
a2 2020-09-01 enter 0
a2 2020-09-01 open 0
a2 2020-09-01 close 10
a2 2020-09-01 update 0
a2 2020-09-01 delete 0
b1 2020-09-02 enter 0
b1 2020-09-02 open 0
b1 2020-09-02 close 0
b1 2020-09-02 update 10
b1 2020-09-02 delete 0
How could I do that?
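For anyone who wants to reproduce this, the two tables can be recreated roughly as follows. The question does not name the data frames, so df1 (the list of types) and df2 (the database extract) are names chosen here for illustration, and date is kept as a character string because the question does not say how it is stored.

```r
# Assumed name: df1 holds every possible type, in the order shown in the question
df1 <- data.frame(
  type = c("enter", "open", "close", "update", "delete"),
  stringsAsFactors = FALSE
)

# Assumed name: df2 is the sample returned from the database
df2 <- data.frame(
  ID    = c("a1", "a1", "a1", "a2", "b1"),
  date  = c("2020-09-01", "2020-09-01", "2020-09-02", "2020-09-01", "2020-09-02"),
  type  = c("enter", "close", "enter", "close", "update"),
  value = c(18, 15, 4, 10, 10),
  stringsAsFactors = FALSE
)
```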
You can try using complete (here df1 is the table of types and df2 is the data from the database):
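library(dplyr)
library(tidyr)

df2 %>%
  # make type a factor so that every level in df1$type is known, even if unused
  mutate(type = factor(type, levels = df1$type)) %>%
  group_by(ID, date) %>%
  # add a row for each missing level per ID/date and set its value to 0
  complete(type, fill = list(value = 0))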
# ID date type value
# <chr> <chr> <fct> <dbl>
# 1 a1 2020-09-01 enter 18
# 2 a1 2020-09-01 open 0
# 3 a1 2020-09-01 close 15
# 4 a1 2020-09-01 update 0
# 5 a1 2020-09-01 delete 0
# 6 a1 2020-09-02 enter 4
# 7 a1 2020-09-02 open 0
# 8 a1 2020-09-02 close 0
# 9 a1 2020-09-02 update 0
#10 a1 2020-09-02 delete 0
#11 a2 2020-09-01 enter 0
#12 a2 2020-09-01 open 0
#13 a2 2020-09-01 close 10
#14 a2 2020-09-01 update 0
#15 a2 2020-09-01 delete 0
#16 b1 2020-09-02 enter 0
#17 b1 2020-09-02 open 0
#18 b1 2020-09-02 close 0
#19 b1 2020-09-02 update 10
#20 b1 2020-09-02 delete 0
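If you would rather not go through a factor, the same table can probably be built with plain joins in base R: generate every ID/date/type combination, left-join the data onto it, and replace the missing values with 0. This is a different technique from complete, offered only as a sketch. df1 and df2 are recreated from the question's sample so that the block runs on its own, and all_combos and result are illustrative names.

```r
# Sample data copied from the question (df1/df2 are assumed names)
df1 <- data.frame(
  type = c("enter", "open", "close", "update", "delete"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  ID    = c("a1", "a1", "a1", "a2", "b1"),
  date  = c("2020-09-01", "2020-09-01", "2020-09-02", "2020-09-01", "2020-09-02"),
  type  = c("enter", "close", "enter", "close", "update"),
  value = c(18, 15, 4, 10, 10),
  stringsAsFactors = FALSE
)

# merge() on two data frames with no shared column names returns their
# Cartesian product: every observed ID/date pair combined with every type
all_combos <- merge(unique(df2[c("ID", "date")]), df1)

# Left join on the shared columns (ID, date, type); combinations that are
# absent from df2 get NA in value
result <- merge(all_combos, df2, all.x = TRUE)

# Replace the NAs introduced by the join with 0
result$value[is.na(result$value)] <- 0
```

Note that merge sorts by the key columns, so within each ID and date the types come out alphabetically rather than in the order of df1$type.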