What is the most elegant way to loop through the rows of a data frame with an impure function?
Suppose I have the following code:
my_func <- function(var1, var2, var3, var4) {
... (side effect included)
}
df <- crossing(
nesting(var1=...,var2=...),
nesting(var3=...,var4=...)
)
What is the most elegant way to apply my_func to each row of df?
Note that my_func is not a pure function: it is designed to perform side effects (IO, plotting, ...).
Method 1
my_func_wrapper <- function(row) {
my_func(row['var1'], row['var2'], row['var3'], row['var4'])
}
# Problem: apply() coerces each row to a single type, which breaks if the columns differ in type.
apply(df, 1, my_func_wrapper)
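To make the coercion problem in Method 1 concrete, here is a minimal sketch. The two-column data frame and its column names are made up for illustration. It compares what a function sees per row under apply() with what it sees when the columns are iterated in parallel with base R's Map().

```r
# Hypothetical data frame with one integer and one character column
df <- data.frame(id = 1:2, label = c("a", "b"), stringsAsFactors = FALSE)

# apply() converts the data frame to a matrix first, so each row arrives
# as a character vector and the integer column loses its type
classes_apply <- apply(df, 1, function(row) class(row[["id"]]))

# Map() walks the columns in parallel and passes each value with its own type;
# do.call() spreads the columns of df as named arguments to Map()
classes_map <- unlist(do.call(Map, c(list(f = function(id, label) class(id)), df)))

print(classes_apply)
print(classes_map)
```

In this sketch the function called through apply() receives id as character, while the one called through Map() receives it as integer.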
Method 2
df %>%
rowwise() %>%
do(result=invoke(my_func, .)) %>% # If it could end here, I would be pretty happy.
.$result # Relying on auto-printing to plot or trigger some side effect
Method 3
#This looks pretty good on its own but it does not play well with the pipe %>%
foreach(row=iter(df, by='row')) %do% invoke(my_func, row)
#Method 3.1 (With Pipe)
df %>%
(function(df) foreach(row=iter(df, by='row')) %do% invoke(my_func, row))
#Method 3.2 this does not work
# df %>%
# foreach(row=iter(., by='row')) %do% invoke(my_func, row)
#Method 3.3 this does not work
#I am trying to get this to work with purrr's simplified anonymous function syntax, but it does not work.
# df %>%
# as_function(~ foreach(row=iter(., by='row')) %do% invoke(my_func, row))
Is there a better way that plays well with %>%?
Honestly, I would use purrr's pmap (purrr::pmap):
library(tidyverse)
df = data.frame(
x = rnorm(10),
y = runif(10)
)
df %>%
pmap_dbl(function(x, y) {
min(x,y)
})
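Since the question is about a function called for its side effects, purrr's pwalk() may be a closer fit than pmap_dbl(): it calls the function once per row and invisibly returns its input, so the pipe is not interrupted. The sketch below is only an illustration. The data frame and the body of my_func are made up, and it assumes the column names match the function's argument names.

```r
library(purrr)

# Hypothetical data frame whose column names match my_func's arguments
df <- data.frame(var1 = 1:2, var2 = c("a", "b"), stringsAsFactors = FALSE)

# Hypothetical impure function: its only job is to print one line per row
my_func <- function(var1, var2) {
  cat(var1, var2, "\n")
}

# pwalk() matches columns to arguments by name, calls my_func for each row,
# and returns df invisibly so further pipe steps can follow
result <- df %>% pwalk(my_func)
```

Because pwalk() hands back the original data frame, result here is the same object as df, and nothing is auto-printed.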
I find that many of the tidyverse's tools for this kind of operation are still worse than plyr's. Example:
> library(plyr)
> library(tidyverse)
> #dummy function
> your_function = function(...) {do.call(args = list(..., sep = " + "), what = str_c)}
> alply(mpg[1:5, ], .margins = 1, .fun = function(row) {
+ your_function(row$manufacturer, row$cyl, row$trans)
+ }) %>% unlist()
1 2 3 4 5
"audi + 4 + auto(l5)" "audi + 4 + manual(m5)" "audi + 4 + manual(m6)" "audi + 4 + auto(av)" "audi + 6 + auto(l5)"
I don't know what you want to collect from the function, but you probably want alply() or adply().
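If nothing needs to be collected at all, plyr also provides a_ply(), which applies a function to each slice and discards the results. A minimal sketch, assuming a made-up data frame and a hypothetical record_row() function standing in for the real side effect:

```r
library(plyr)

# Hypothetical data frame with mixed column types
df <- data.frame(id = 1:3, label = c("a", "b", "c"), stringsAsFactors = FALSE)

# Accumulator that makes the side effect visible in this sketch
log_lines <- character(0)

# Hypothetical impure function: appends one entry per row to log_lines
record_row <- function(row) {
  log_lines <<- c(log_lines, paste(row$id, row$label))
}

# .margins = 1 splits df by row; each piece is a one-row data frame,
# so column types are preserved and the return values are thrown away
a_ply(df, .margins = 1, .fun = record_row)

print(log_lines)
```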