Create function in R to apply to multiple datasets
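I have this code, recommended by a Stack Overflow user, and it works well. I have several datasets that I want to apply it to.
Do I have to keep running the code on each dataset separately, or is there something else I can do (such as storing it in some kind of function)?
I have the datasets df1, df2, df3, df4. I do not wish to rbind these datasets.
Input for each dataset: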
structure(list(Date = structure(1:6, .Label = c("1/2/2020 5:00:00 PM",
"1/2/2020 5:30:01 PM", "1/2/2020 6:00:00 PM", "1/5/2020 7:00:01 AM",
"1/6/2020 8:00:00 AM", "1/6/2020 9:00:00 AM"), class = "factor"),
Duration = c(20L, 30L, 10L, 5L, 2L, 8L)), class = "data.frame", row.names = c(NA,
-6L))
Code:
df %>%
  group_by(Date = as.Date(dmy_hms(Date))) %>%
  summarise(Total_Duration = sum(Duration), Count = n())
This is what I am doing for each dataset (and so on):
df1 %>%
  group_by(Date = as.Date(dmy_hms(Date))) %>%
  summarise(Total_Duration = sum(Duration), Count = n())
df2 %>%
  group_by(Date = as.Date(dmy_hms(Date))) %>%
  summarise(Total_Duration = sum(Duration), Count = n())
df3 %>%
  group_by(Date = as.Date(dmy_hms(Date))) %>%
  summarise(Total_Duration = sum(Duration), Count = n())
Is there a way to do something like:
Store_code <-
  df %>%
  group_by(Date = as.Date(dmy_hms(Date))) %>%
  summarise(Total_Duration = sum(Duration), Count = n())
and then easily apply each dataset to this code?
df1(Store_code)
df2(Store_code)
Any suggestion is appreciated.
We can use mget to return all the objects into a list, then use map to loop over the list and apply the function:
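library(dplyr)
library(lubridate)
library(purrr)
# Wrap the pipeline in a function that takes the dataset as its argument
f1 <- function(dat) {
  dat %>%
    # Parse the date-time strings and group by calendar day
    group_by(Date = as.Date(dmy_hms(Date))) %>%
    summarise(Total_Duration = sum(Duration), Count = n())
}
# Collect every object in the current environment named df<digits> and apply f1 to each.
# Note the doubled backslash: "\d" on its own is not a valid escape in an R string.
lst1 <- map(mget(ls(pattern = "^df\\d+$")), f1)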
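If collecting objects by name pattern feels too implicit, the list can also be built by hand. The following is a minimal sketch, not part of the original answer: the two small data frames are hypothetical stand-ins shaped like the question's input, and `dfs` and `res` are illustrative names.

```r
library(dplyr)
library(lubridate)
library(purrr)

# Hypothetical stand-ins for df1 and df2, shaped like the question's data
df1 <- data.frame(
  Date = c("1/2/2020 5:00:00 PM", "1/2/2020 5:30:01 PM", "1/5/2020 7:00:01 AM"),
  Duration = c(20L, 30L, 5L)
)
df2 <- data.frame(
  Date = c("1/6/2020 8:00:00 AM", "1/6/2020 9:00:00 AM"),
  Duration = c(2L, 8L)
)

# Same summary function as in the answer
f1 <- function(dat) {
  dat %>%
    group_by(Date = as.Date(dmy_hms(Date))) %>%
    summarise(Total_Duration = sum(Duration), Count = n())
}

# Build the list explicitly; the element names carry over to the result
dfs <- list(df1 = df1, df2 = df2)
res <- map(dfs, f1)

# Each summary stays separate and is accessed by name
res$df1
res$df2
```

The datasets are never bound together: `res` holds one summary tibble per input, under the same name as the input.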
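Here, we assume that the column names are the same in all the datasets, i.e. 'Date' and 'Duration'. If they differ, the column names can be passed as additional arguments to the function: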
f2 <- function(dat, datecol, durationcol) {
  dat %>%
    group_by(Date = as.Date(dmy_hms({{datecol}}))) %>%
    summarise(Total_Duration = sum({{durationcol}}), Count = n())
}
and apply the function as
f2(df1, Date, Duration)
or in a loop
lst1 <- map(mget(ls(pattern = "^df\\d+$")), f2,
            datecol = Date, durationcol = Duration)
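The loop above passes the same column names to every dataset. When the names differ from one dataset to the next, one possible variation (not from the original answer) is to pass them as strings and look them up with dplyr's `.data` pronoun, iterating with `pmap`. In this sketch, `f3`, `df_a`, `df_b`, `Timestamp`, and `Minutes` are all hypothetical names chosen for illustration.

```r
library(dplyr)
library(lubridate)
library(purrr)

# Hypothetical datasets whose columns are named differently
df_a <- data.frame(
  Date = c("1/2/2020 5:00:00 PM", "1/2/2020 5:30:01 PM"),
  Duration = c(20L, 30L)
)
df_b <- data.frame(
  Timestamp = c("1/6/2020 8:00:00 AM", "1/6/2020 9:00:00 AM"),
  Minutes = c(2L, 8L)
)

# Variant of f2 that takes column names as character strings
f3 <- function(dat, datecol, durationcol) {
  dat %>%
    group_by(Date = as.Date(dmy_hms(.data[[datecol]]))) %>%
    summarise(Total_Duration = sum(.data[[durationcol]]), Count = n())
}

# One entry per dataset, in matching order
dfs <- list(df_a = df_a, df_b = df_b)
date_cols <- c("Date", "Timestamp")
dur_cols <- c("Duration", "Minutes")

# pmap calls f3 once per position: f3(dfs[[i]], date_cols[[i]], dur_cols[[i]])
res <- pmap(list(dfs, date_cols, dur_cols), f3)

res[[1]]
res[[2]]
```

Strings are used here because `{{ }}` expects a bare column name written at the call site, which is awkward to vary per element inside a loop. Every result still uses the output column name `Date`, so the summaries share one layout.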