调整代码以删除 R 中的重复过滤器
Adjust code to remove repeated filters in R
我想调整下面的代码。注意 (Id == idd, date2 ==........)
重复了几次,这是一种代码浪费。那么,你能帮我调整一下,让它只使用一次吗?这个想法是我可以生成相同的输出 table (datas
).
library(tidyverse)
library(lubridate)
library(data.table)
df1 <- structure(
list(Id=c("1","1","1","1","1","1","1","1"),
date1= c("2021-06-28","2021-06-28","2021-06-28","2021-06-28","2021-06-28",
"2021-06-28","2021-06-28","2021-06-28"),
date2 = c("2021-06-30","2021-06-30","2021-07-02","2021-07-04","2021-07-04","2021-07-09","2021-07-09","2021-07-09"),
Category = c("FDE","ABC","ABC","ABC","CDE","FGE","ABC","CDE"),
Week= c("Wednesday","Wednesday","Friday","Wednesday","Wednesday","Friday","Friday","Friday"),
DR1 = c(4,1,4,3,3,4,3,5),
DRM01 = c(4,1,4,3,3,4,3,6), DRM02= c(4,2,6,7,3,2,7,4),DRM03= c(9,5,4,3,3,2,1,5),
DRM04 = c(5,4,3,3,6,2,1,9),DRM05 = c(5,4,5,3,6,2,1,9),
DRM06 = c(2,4,3,3,5,6,7,8),DRM07 = c(2,5,4,4,9,4,7,8),
DRM08 = c(0,0,0,1,2,0,0,0),DRM09 = c(0,0,0,0,0,0,0,0),DRM010 = c(0,0,0,0,0,0,0,0),DRM011 = c(4,0,0,0,0,0,0,0),
DRM012 = c(0,0,0,3,0,0,0,5),DRM013 = c(0,0,1,0,0,0,2,0),DRM014 = c(0,0,0,0,0,2,0,0)),
class = "data.frame", row.names = c(NA, -8L))
idd<-"1"
dmda<-"2021-07-04"
CategoryChosse<-"CDE"
dt1 <- as.data.table(df1)
cols <- grep("^DRM0", colnames(dt1), value = TRUE)
med <-
dt1[, (paste0(cols, "_PV")) := DR1 - .SD, .SDcols = cols
][, lapply(.SD, median), by = .(Id, Category, Week), .SDcols = paste0(cols, "_PV") ]
f2 <- function(nm, pat) grep(pat, nm, value = TRUE)
df2<-df1
nm1 <- f2(names(df2), "^DRM0\d+$")
nm2 <- f2(names(med), "_PV")
nm3 <- paste0("i.", nm2)
setDT(df2)[med,(nm2) := Map(`+`, mget(nm1), mget(nm3)), on = .(Id, Category, Week)]
SPV <- df2[, c('Id','date1', 'date2', 'Week','Category', nm2), with = FALSE]
mat1 <- df1 %>%
dplyr::filter(Id==idd, date2 == ymd(dmda), Category == CategoryChosse) %>%
select(starts_with("DRM0")) %>%
pivot_longer(cols = everything()) %>%
arrange(desc(row_number())) %>%
mutate(cs = cumsum(value)) %>%
dplyr::filter(cs == 0) %>%
pull(name)
(dropnames <- paste0(mat1, "_PV"))
SPV <- SPV %>%
filter(Id==idd,date2 == ymd(dmda), Category == CategoryChosse) %>%
select(-any_of(dropnames)) %>% data.frame()
if(length(grep("DRM0", names(SPV))) == 0) {
SPV[head(mat1,10)] <- NA_real_
}
SPV<-as.data.table(SPV)
datas<-melt(SPV[Id == idd & date2 == ymd(dmda)][,
lapply(.SD, sum, na.rm = TRUE), by = Category,
.SDcols = patterns("^DRM0")],
id.var = "Category", variable.name = "name", value.name = "val")[,
name := readr::parse_number(as.character(name))][]
colnames(datas)[-1]<-c("var1","var2")
datas<-datas[na.omit(datas[, .I[(as.Date(dmda) - min(as.Date(df1$date1) [
df1$Category == first(Category)])):max(var1)+1], .(Category)]$V1)]
datas
> datas
Category var1 var2
1: CDE 7 3
2: CDE 8 3
如果我们从过滤器数据创建副本,我们可以将过滤器的 3 个步骤替换为一次
idd<-"1"
dmda<-"2021-07-04"
CategoryChosse<-"CDE"
df1_filt <- df1 %>%
dplyr::filter(Id==idd, date2 == ymd(dmda), Category == CategoryChosse)
dt1 <- as.data.table(df1)
cols <- grep("^DRM0", colnames(dt1), value = TRUE)
med <-
dt1[, (paste0(cols, "_PV")) := DR1 - .SD, .SDcols = cols
][, lapply(.SD, median), by = .(Id, Category, Week), .SDcols = paste0(cols, "_PV") ]
f2 <- function(nm, pat) grep(pat, nm, value = TRUE)
df2_filt <- copy(df1_filt)
nm1 <- f2(names(df2_filt), "^DRM0\d+$")
nm2 <- f2(names(med), "_PV")
nm3 <- paste0("i.", nm2)
setDT(df2_filt)[med,(nm2) := Map(`+`, mget(nm1), mget(nm3)), on = .(Id, Category, Week)]
SPV <- df2_filt[, c('Id','date1', 'date2', 'Week','Category', nm2), with = FALSE]
mat1 <- df1_filt %>%
select(starts_with("DRM0")) %>%
pivot_longer(cols = everything()) %>%
arrange(desc(row_number())) %>%
mutate(cs = cumsum(value)) %>%
dplyr::filter(cs == 0) %>%
pull(name)
(dropnames <- paste0(mat1, "_PV"))
SPV <- SPV %>%
select(-any_of(dropnames)) %>% data.frame()
if(length(grep("DRM0", names(SPV))) == 0) {
SPV[head(mat1,10)] <- NA_real_
}
SPV<-as.data.table(SPV)
datas<-melt(SPV[,
lapply(.SD, sum, na.rm = TRUE), by = Category,
.SDcols = patterns("^DRM0")],
id.var = "Category", variable.name = "name", value.name = "val")[,
name := readr::parse_number(as.character(name))][]
colnames(datas)[-1]<-c("var1","var2")
datas<-datas[na.omit(datas[, .I[(as.Date(dmda) - min(as.Date(df1$date1) [
df1$Category == first(Category)])):max(var1)+1], .(Category)]$V1)]
-输出
> datas
Category var1 var2
<char> <num> <num>
1: CDE 7 3
2: CDE 8 3
我想调整下面的代码。注意 (Id == idd, date2 ==........)
重复了几次,这是一种代码浪费。那么,你能帮我调整一下,让它只使用一次吗?这个想法是我可以生成相同的输出 table (datas
).
library(tidyverse)
library(lubridate)
library(data.table)
df1 <- structure(
list(Id=c("1","1","1","1","1","1","1","1"),
date1= c("2021-06-28","2021-06-28","2021-06-28","2021-06-28","2021-06-28",
"2021-06-28","2021-06-28","2021-06-28"),
date2 = c("2021-06-30","2021-06-30","2021-07-02","2021-07-04","2021-07-04","2021-07-09","2021-07-09","2021-07-09"),
Category = c("FDE","ABC","ABC","ABC","CDE","FGE","ABC","CDE"),
Week= c("Wednesday","Wednesday","Friday","Wednesday","Wednesday","Friday","Friday","Friday"),
DR1 = c(4,1,4,3,3,4,3,5),
DRM01 = c(4,1,4,3,3,4,3,6), DRM02= c(4,2,6,7,3,2,7,4),DRM03= c(9,5,4,3,3,2,1,5),
DRM04 = c(5,4,3,3,6,2,1,9),DRM05 = c(5,4,5,3,6,2,1,9),
DRM06 = c(2,4,3,3,5,6,7,8),DRM07 = c(2,5,4,4,9,4,7,8),
DRM08 = c(0,0,0,1,2,0,0,0),DRM09 = c(0,0,0,0,0,0,0,0),DRM010 = c(0,0,0,0,0,0,0,0),DRM011 = c(4,0,0,0,0,0,0,0),
DRM012 = c(0,0,0,3,0,0,0,5),DRM013 = c(0,0,1,0,0,0,2,0),DRM014 = c(0,0,0,0,0,2,0,0)),
class = "data.frame", row.names = c(NA, -8L))
idd<-"1"
dmda<-"2021-07-04"
CategoryChosse<-"CDE"
dt1 <- as.data.table(df1)
cols <- grep("^DRM0", colnames(dt1), value = TRUE)
med <-
dt1[, (paste0(cols, "_PV")) := DR1 - .SD, .SDcols = cols
][, lapply(.SD, median), by = .(Id, Category, Week), .SDcols = paste0(cols, "_PV") ]
f2 <- function(nm, pat) grep(pat, nm, value = TRUE)
df2<-df1
nm1 <- f2(names(df2), "^DRM0\d+$")
nm2 <- f2(names(med), "_PV")
nm3 <- paste0("i.", nm2)
setDT(df2)[med,(nm2) := Map(`+`, mget(nm1), mget(nm3)), on = .(Id, Category, Week)]
SPV <- df2[, c('Id','date1', 'date2', 'Week','Category', nm2), with = FALSE]
mat1 <- df1 %>%
dplyr::filter(Id==idd, date2 == ymd(dmda), Category == CategoryChosse) %>%
select(starts_with("DRM0")) %>%
pivot_longer(cols = everything()) %>%
arrange(desc(row_number())) %>%
mutate(cs = cumsum(value)) %>%
dplyr::filter(cs == 0) %>%
pull(name)
(dropnames <- paste0(mat1, "_PV"))
SPV <- SPV %>%
filter(Id==idd,date2 == ymd(dmda), Category == CategoryChosse) %>%
select(-any_of(dropnames)) %>% data.frame()
if(length(grep("DRM0", names(SPV))) == 0) {
SPV[head(mat1,10)] <- NA_real_
}
SPV<-as.data.table(SPV)
datas<-melt(SPV[Id == idd & date2 == ymd(dmda)][,
lapply(.SD, sum, na.rm = TRUE), by = Category,
.SDcols = patterns("^DRM0")],
id.var = "Category", variable.name = "name", value.name = "val")[,
name := readr::parse_number(as.character(name))][]
colnames(datas)[-1]<-c("var1","var2")
datas<-datas[na.omit(datas[, .I[(as.Date(dmda) - min(as.Date(df1$date1) [
df1$Category == first(Category)])):max(var1)+1], .(Category)]$V1)]
datas
> datas
Category var1 var2
1: CDE 7 3
2: CDE 8 3
如果我们从过滤器数据创建副本,我们可以将过滤器的 3 个步骤替换为一次
idd<-"1"
dmda<-"2021-07-04"
CategoryChosse<-"CDE"
df1_filt <- df1 %>%
dplyr::filter(Id==idd, date2 == ymd(dmda), Category == CategoryChosse)
dt1 <- as.data.table(df1)
cols <- grep("^DRM0", colnames(dt1), value = TRUE)
med <-
dt1[, (paste0(cols, "_PV")) := DR1 - .SD, .SDcols = cols
][, lapply(.SD, median), by = .(Id, Category, Week), .SDcols = paste0(cols, "_PV") ]
f2 <- function(nm, pat) grep(pat, nm, value = TRUE)
df2_filt <- copy(df1_filt)
nm1 <- f2(names(df2_filt), "^DRM0\d+$")
nm2 <- f2(names(med), "_PV")
nm3 <- paste0("i.", nm2)
setDT(df2_filt)[med,(nm2) := Map(`+`, mget(nm1), mget(nm3)), on = .(Id, Category, Week)]
SPV <- df2_filt[, c('Id','date1', 'date2', 'Week','Category', nm2), with = FALSE]
mat1 <- df1_filt %>%
select(starts_with("DRM0")) %>%
pivot_longer(cols = everything()) %>%
arrange(desc(row_number())) %>%
mutate(cs = cumsum(value)) %>%
dplyr::filter(cs == 0) %>%
pull(name)
(dropnames <- paste0(mat1, "_PV"))
SPV <- SPV %>%
select(-any_of(dropnames)) %>% data.frame()
if(length(grep("DRM0", names(SPV))) == 0) {
SPV[head(mat1,10)] <- NA_real_
}
SPV<-as.data.table(SPV)
datas<-melt(SPV[,
lapply(.SD, sum, na.rm = TRUE), by = Category,
.SDcols = patterns("^DRM0")],
id.var = "Category", variable.name = "name", value.name = "val")[,
name := readr::parse_number(as.character(name))][]
colnames(datas)[-1]<-c("var1","var2")
datas<-datas[na.omit(datas[, .I[(as.Date(dmda) - min(as.Date(df1$date1) [
df1$Category == first(Category)])):max(var1)+1], .(Category)]$V1)]
-输出
> datas
Category var1 var2
<char> <num> <num>
1: CDE 7 3
2: CDE 8 3