Add rows intermittently throughout data frame
I have data like this:
df<-structure(list(username = c("dan.amy", "dan.amy", "dan.amy",
"stupidski", "stupidski", "stupidski", "cbum", "cbum"), Department = c("Cancer Institute",
"Cancer Institute", "Cancer Institute", "Cancer Institute Pediatric Hematology Oncology",
"Cancer Institute Pediatric Hematology Oncology", "Cancer Institute Pediatric Hematology Oncology",
"Cancer Institute GynOnc", "Cancer Institute GynOnc"), `Access Control` = c("Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes"), `Organizational Unit` = c("Cancer Institute",
"Cancer Institute", "Cancer Institute", "Cancer Institute", "Cancer Institute",
"Cancer Institute", "Cancer Institute", "Cancer Institute"),
Management_Group.y = c("Cancer Institute - Pediatric Hematology/Oncology-LCI",
"Cancer Institute - Cancer Institute", "Cancer Institute - Pediatric Hemophilia/Thrombosis Center - LCI",
"Cancer Institute - Cancer Institute", "Cancer Institute - Pediatric Hemophilia/Thrombosis Center - LCI",
"Cancer Institute - Pediatric Hematology/Oncology-LCI", "Cancer Institute - Cancer Institute",
"Cancer Institute - Pediatric Hematology/Oncology-LCI")), row.names = c(NA,
-8L), class = c("tbl_df", "tbl", "data.frame"))
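As you can see, each username has a few rows that are almost identical: the same username, usually the same Department for that username, and (in this sample, though not necessarily in the real data) the same Access Control and Organizational Unit for everyone. Only the Management Group is unique. I want to add one more row for each person that matches the others in a few respects but carries a fixed set of new values.
That is, for every username there would be a new row with "Generic Research Department" under Department, "General Research" under Organizational Unit, and "General Research - Generic Research Department" under Management Group. Access Control would always be "Yes".
I have thought about ways to do this and wondered whether I could create a new one-row data frame with these values and then "join" it on. But I think there must be a simpler way.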
Here is a dplyr approach:
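library(dplyr)

df %>%
  group_by(username) %>%
  # collapse to one row per username
  summarise(username = last(username)) %>%
  # fill in the fixed values for the new row
  mutate(Department = "Generic Research Department",
         `Access Control` = "Yes",
         `Organizational Unit` = "General Research",
         Management_Group.y = paste(`Organizational Unit`, Department, sep = ' - ')) %>%
  # append the new rows to the original data
  bind_rows(df, .) %>%
  # sort so each new row follows that user's existing rows
  arrange(username, .by_group = TRUE)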
    username   Department                                      `Access Control`  `Organizational Unit`  Management_Group.y
    <chr>      <chr>                                           <chr>             <chr>                  <chr>
 1  cbum       Cancer Institute GynOnc                         Yes               Cancer Institute       Cancer Institute - Cancer Institute
 2  cbum       Cancer Institute GynOnc                         Yes               Cancer Institute       Cancer Institute - Pediatric Hematology/Oncology-LCI
 3  cbum       Generic Research Department                     Yes               General Research       General Research - Generic Research Department
 4  dan.amy    Cancer Institute                                Yes               Cancer Institute       Cancer Institute - Pediatric Hematology/Oncology-LCI
 5  dan.amy    Cancer Institute                                Yes               Cancer Institute       Cancer Institute - Cancer Institute
 6  dan.amy    Cancer Institute                                Yes               Cancer Institute       Cancer Institute - Pediatric Hemophilia/Thrombosis Center - LCI
 7  dan.amy    Generic Research Department                     Yes               General Research       General Research - Generic Research Department
 8  stupidski  Cancer Institute Pediatric Hematology Oncology  Yes               Cancer Institute       Cancer Institute - Cancer Institute
 9  stupidski  Cancer Institute Pediatric Hematology Oncology  Yes               Cancer Institute       Cancer Institute - Pediatric Hemophilia/Thrombosis Center - LCI
10  stupidski  Cancer Institute Pediatric Hematology Oncology  Yes               Cancer Institute       Cancer Institute - Pediatric Hematology/Oncology-LCI
11  stupidski  Generic Research Department                     Yes               General Research       General Research - Generic Research Department
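The idea raised in the question, a one-row data frame that gets "joined" on, also works and needs only base R. The following is a minimal sketch under a few assumptions: the small `df` here is a shortened stand-in for the real data, and `template`, `new_rows`, and `result` are illustrative names that do not appear in the original post. It relies on `merge()` with `by = NULL`, which returns the Cartesian product of its two inputs.

```r
# Shortened stand-in for the question's data (assumption: same column names).
df <- data.frame(
  username = c("dan.amy", "dan.amy", "cbum"),
  Department = c("Cancer Institute", "Cancer Institute", "Cancer Institute GynOnc"),
  `Access Control` = "Yes",
  `Organizational Unit` = "Cancer Institute",
  Management_Group.y = c("Cancer Institute - Cancer Institute",
                         "Cancer Institute - Pediatric Hematology/Oncology-LCI",
                         "Cancer Institute - Cancer Institute"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# One-row template holding the constant values of the row to add (hypothetical name).
template <- data.frame(
  Department = "Generic Research Department",
  `Access Control` = "Yes",
  `Organizational Unit` = "General Research",
  Management_Group.y = "General Research - Generic Research Department",
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Cross join: by = NULL pairs every distinct username with the template row.
new_rows <- merge(unique(df["username"]), template, by = NULL)

# Append the new rows in the original column order, then sort by username.
# The row index is used as a tie-breaker so each new row lands last for its user.
result <- rbind(df, new_rows[names(df)])
result <- result[order(result$username, seq_len(nrow(result))), ]
rownames(result) <- NULL
result
```

Compared with the dplyr pipeline, this keeps the constant values in one place (`template`), which may be convenient if the same row has to be appended to several data frames.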