map() 的两个输入，但只迭代其中一个

Question

我正在尝试编写一个使用 map2() 迭代向量的函数，但它也需要另一个在每次调用时固定的输入。

例如，这段代码只需要一个输入，vars:

library(carData)
library(purrr)
library(tidyverse)
library(Matching)

vars <- c("lfp", "lwg", "inc")
names(vars) <- vars

matching_fcn <- function(.x){
  
  matching_df <- Mroz %>% 
  mutate(wc = case_when(wc == "yes" ~ "TRUE", 
                        wc == "no" ~ "FALSE")) %>% 
  drop_na(k5, k618, age, wc, hc, .x)

matching_df$wc <- as.logical(matching_df$wc)
  ps1 <- glm(wc ~ k5 + k618 + age + hc, 
             family = binomial, data = matching_df)
  
  pscore <- ps1$fitted.values
  matching_df <- cbind(matching_df, pscore)
  
  Y <- matching_df[[.x]]
  Tr <- as.logical(matching_df$wc)
  
  psm1 <- Matching::Match(
    Y = Y, 
    Tr = Tr, 
    X = pscore, 
    estimand = "ATT", 
    M = 1, 
    replace = TRUE, 
    caliper = 0.05, 
    version = "fast")
  p <- 1 - pnorm(abs(psm1$est.noadj/psm1$se.standard))
  with(psm1, tibble(dv=.x, est=est.noadj, se=se.standard, p=p, ndrops=ndrops))
}

purrr::map_df(
  .x =  tidyselect::all_of(vars),
  .f = matching_fcn)

但是如果我想运行具有相同变量名的不同 df 上的相同模型，我将需要再次复制整个函数并更改子行 matching_df <- .... -最优。

我尝试使用 map2() 来解决这个问题，但它返回一个错误，即 .x 和 .y 的维度不相同（自然），因为它试图迭代.y 以及 .x.

我想要的是能够像这样设置函数：

matching_fcn <- function(.x, .y){
  
  matching_df <- .y %>% ...

并这样称呼它：

purrr::map2_df(
  .x = tidyselect::all_of(vars),
  .y = df1,
  .f = matching_fcn)

或

purrr::map2_df(
  .x = tidyselect::all_of(vars),
  .y = df2,
  .f = matching_fcn)

等这可能吗？

Answer 1

如果我没有理解错的话，我认为你不应该在这里使用 map2。正如@Limey 提到的，您可以在函数中包含数据作为参数。

library(carData)
library(purrr)
library(tidyverse)
library(Matching)

vars <- c("lfp", "lwg", "inc")
names(vars) <- vars

matching_fcn <- function(data, .x){
  
  matching_df <- data %>% 
    mutate(wc = case_when(wc == "yes" ~ "TRUE", 
                          wc == "no" ~ "FALSE")) %>% 
    drop_na(k5, k618, age, wc, hc, .x)
  
  matching_df$wc <- as.logical(matching_df$wc)
  ps1 <- glm(wc ~ k5 + k618 + age + hc, 
             family = binomial, data = matching_df)
  
  pscore <- ps1$fitted.values
  matching_df <- cbind(matching_df, pscore)
  
  Y <- matching_df[[.x]]
  Tr <- as.logical(matching_df$wc)
  
  psm1 <- Matching::Match(
    Y = Y, 
    Tr = Tr, 
    X = pscore, 
    estimand = "ATT", 
    M = 1, 
    replace = TRUE, 
    caliper = 0.05, 
    version = "fast")
  p <- 1 - pnorm(abs(psm1$est.noadj/psm1$se.standard))
  with(psm1, tibble(dv=.x, est=est.noadj, se=se.standard, p=p, ndrops=ndrops))
}

称其为 -

purrr::map_df(
  .x =  tidyselect::all_of(vars),
  .f = ~matching_fcn(Mroz, .x))

#  dv      est     se        p ndrops
#  <chr> <dbl>  <dbl>    <dbl>  <dbl>
#1 lfp   0.175 0.0450 5.07e- 5      5
#2 lwg   0.375 0.0571 2.49e-11      5
#3 inc   4.84  1.09   4.82e- 6      5

map() 的两个输入，但只迭代其中一个

Two inputs to map() but only iterate over one of them

r

dplyr

purrr