Unfilter data frame with conditional statement in R
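I have two different data frames, DF1 and DF2. I want to exclude the rows of DF1 that match data frame DF2, so that the resulting data frame looks like DF3.
In addition, I want the match to be conditional: if the Room number is All Rooms, the columns Code, Description and Company from DF2 should be matched against DF1; if the Room number is not All Rooms, the columns Code, Description, Company and Room number should all be matched.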
Code=c("A","B","C","E","D")
Desciption=c("Color is not Good","Odour is not good","Astetic Issue","Odour is not good","Lighting issue")
Company=c("Asian Paints","Burger","Asian Paints","Burger","Burger")
`Room number`=c("Room_1","Room_1","Room_2","Room_3","Room_2")
Rating=c("2","3","5","4","3")
DF1=data.frame(Code,Desciption,Company,`Room number`,Rating)
  Code        Desciption      Company Room.number Rating
1    A Color is not Good Asian Paints      Room_1      2
2    B Odour is not good       Burger      Room_1      3
3    C     Astetic Issue Asian Paints      Room_2      5
4    E Odour is not good       Burger      Room_3      4
5    D    Lighting issue       Burger      Room_2      3
Code=c("A","B")
Desciption=c("Color is not Good","Odour is not good")
Company=c("Asian Paints","Burger")
`Room number`=c("Room_1","All Rooms")
DF2=data.frame(Code,Desciption,Company,`Room number`)
> DF2
  Code        Desciption      Company Room.number
1    A Color is not Good Asian Paints      Room_1
2    B Odour is not good       Burger   All Rooms
Code=c("C","D")
Desciption=c("Astetic Issue","Lighting issue")
Company=c("Asian Paints","Burger")
`Room number`=c("Room_2","Room_2")
Rating=c("5","3")
DF3=data.frame(Code,Desciption,Company,`Room number`,Rating)
> DF3
  Code     Desciption      Company Room.number Rating
1    C  Astetic Issue Asian Paints      Room_2      5
2    D Lighting issue       Burger      Room_2      3
Can anyone help me solve this?
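You mentioned:
"Additionally I want Pass the condition as If my Room number is All Rooms then I would be able to match columns Code, Description and Company from DF2 to DF1,.."
It is not clear whether, in this specific case (All Rooms), the description & company are to be checked against all codes in DF1. If yes, the syntax below will do.
Otherwise, if every combination of all the columns (i.e. code, description and company) has to be checked in DF1, the number of filtered rows will be 0. Please re-check your logic and modify the question accordingly.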
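library(dplyr)  # anti_join(), filter(), mutate(), %>%
library(tidyr)  # unnest_longer()

DF1 %>%
  # drop rows that match DF2 on all four columns
  anti_join(DF2, by = c("Code", "Desciption", "Company", "Room.number")) %>%
  # expand each "All Rooms" row over every Code in DF1, then drop those matches
  anti_join(DF2 %>% filter(Room.number == "All Rooms") %>%
              mutate(Code = list(unique(DF1$Code))) %>%
              unnest_longer(Code),
            by = c("Code", "Desciption", "Company"))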
  Code     Desciption      Company Room.number Rating
1    C  Astetic Issue Asian Paints      Room_2      5
2    D Lighting issue       Burger      Room_2      3
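Because the "All Rooms" row is expanded over every Code in DF1 before the second anti_join, that join appears to be equivalent to matching on Desciption and Company alone. A minimal sketch under that reading is shown here; the names wildcard and result are illustrative only, and the assumption that "All Rooms" rows should ignore Code comes from the expected DF3 (which drops code E), not from an explicit statement in the question.

```r
library(dplyr)

# Data copied from the question (the column name typo "Desciption" is kept as-is).
DF1 <- data.frame(
  Code = c("A", "B", "C", "E", "D"),
  Desciption = c("Color is not Good", "Odour is not good", "Astetic Issue",
                 "Odour is not good", "Lighting issue"),
  Company = c("Asian Paints", "Burger", "Asian Paints", "Burger", "Burger"),
  Room.number = c("Room_1", "Room_1", "Room_2", "Room_3", "Room_2"),
  Rating = c("2", "3", "5", "4", "3"),
  stringsAsFactors = FALSE
)
DF2 <- data.frame(
  Code = c("A", "B"),
  Desciption = c("Color is not Good", "Odour is not good"),
  Company = c("Asian Paints", "Burger"),
  Room.number = c("Room_1", "All Rooms"),
  stringsAsFactors = FALSE
)

# Assumption: "All Rooms" rows act as wildcards matched on Desciption and Company only.
wildcard <- filter(DF2, Room.number == "All Rooms")

result <- DF1 %>%
  # exact match on all four columns
  anti_join(DF2, by = c("Code", "Desciption", "Company", "Room.number")) %>%
  # wildcard match: any Code, any room
  anti_join(wildcard, by = c("Desciption", "Company"))

print(result)
```

On the sample data this should leave only the rows with codes C and D. If Code is meant to take part in the "All Rooms" match as well, adding "Code" to the second by vector would keep row E.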
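Here is a vectorized base R way of filtering out rows that meet multiple conditions. It creates logical indices and then subsets DF1 based on those indices. The only difference between DF3b and the expected result DF3 is the row names, so I reset them to consecutive values.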
i_all_rooms <- DF1[["Room.number"]] == "All Rooms"
i1 <- !DF1[["Code"]] %in% DF2[["Code"]]
i2 <- !DF1[["Desciption"]] %in% DF2[["Desciption"]]
i3 <- !DF1[["Company"]] %in% DF2[["Company"]]
i4 <- !DF1[["Room.number"]] %in% DF2[["Room.number"]]
j1 <- i_all_rooms & i1 & (i2 | i3)
j2 <- !i_all_rooms & i1 & (i2 | i3) & i4
DF3b <- DF1[j1 | j2, ]
row.names(DF3b) <- NULL
identical(DF3, DF3b)
#[1] TRUE
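Note that each %in% test above looks at one column independently, so it does not check whether the values occur together in the same DF2 row. A hedged base R sketch that compares whole-row combinations instead is given here. The helper row_key and the names wild_cols, hit_exact, hit_wild and DF3c are hypothetical, and treating "All Rooms" rows as matching on Desciption and Company only is an assumption inferred from the expected DF3.

```r
# Data copied from the question (the column name typo "Desciption" is kept as-is).
DF1 <- data.frame(
  Code = c("A", "B", "C", "E", "D"),
  Desciption = c("Color is not Good", "Odour is not good", "Astetic Issue",
                 "Odour is not good", "Lighting issue"),
  Company = c("Asian Paints", "Burger", "Asian Paints", "Burger", "Burger"),
  Room.number = c("Room_1", "Room_1", "Room_2", "Room_3", "Room_2"),
  Rating = c("2", "3", "5", "4", "3"),
  stringsAsFactors = FALSE
)
DF2 <- data.frame(
  Code = c("A", "B"),
  Desciption = c("Color is not Good", "Odour is not good"),
  Company = c("Asian Paints", "Burger"),
  Room.number = c("Room_1", "All Rooms"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one string key per row built from the chosen columns.
# Assumes the values never contain the "|" separator.
row_key <- function(df, cols) do.call(paste, c(unname(df[cols]), sep = "|"))

all_cols  <- c("Code", "Desciption", "Company", "Room.number")
wild_cols <- c("Desciption", "Company")  # assumption: "All Rooms" ignores Code and room

is_wild <- DF2[["Room.number"]] == "All Rooms"

# Rows of DF1 matching a room-specific DF2 row on every column.
hit_exact <- row_key(DF1, all_cols) %in%
  row_key(DF2[!is_wild, , drop = FALSE], all_cols)
# Rows of DF1 matching an "All Rooms" DF2 row on the wildcard columns.
hit_wild <- row_key(DF1, wild_cols) %in%
  row_key(DF2[is_wild, , drop = FALSE], wild_cols)

DF3c <- DF1[!(hit_exact | hit_wild), ]
row.names(DF3c) <- NULL
print(DF3c)
```

On the sample data this should keep the rows with codes C and D, the same rows as DF3.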