Use lapply by group in a function with 2 arguments in R

Question

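I want to compute ROC curves for several variables within different groups, keeping only the response variable fixed. Here is what I have been trying: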

library(pROC)
data(aSAH)
lapply(dplyr::select(aSAH,c(s100b,ndka)),roc,response = aSAH$outcome)

which outputs:

$s100b

Call:
roc.default(response = ..1, predictor = X[[i]])

Data: X[[i]] in 72 controls (..1 Good) < 41 cases (..1 Poor).
Area under the curve: 0.7314

$ndka

Call:
roc.default(response = ..1, predictor = X[[i]])

Data: X[[i]] in 72 controls (..1 Good) < 41 cases (..1 Poor).
Area under the curve: 0.612

But I need to apply it within each gender and for each of the selected variables, something like group_by(gender) %>% roc().

Thanks!

Answer 1

Use by, which splits the data frame by gender and applies the function to each subset:

by(aSAH, aSAH$gender, function(x) 
  lapply(x[c("s100b", "ndka")], function(y) roc(y, response=x$outcome)))

# aSAH$gender: Male
# $s100b
# 
# Call:
#   roc.default(response = x$outcome, predictor = y)
# 
# Data: y in 22 controls (x$outcome Good) < 20 cases (x$outcome Poor).
# Area under the curve: 0.7727
# 
# $ndka
# 
# Call:
#   roc.default(response = x$outcome, predictor = y)
# 
# Data: y in 22 controls (x$outcome Good) < 20 cases (x$outcome Poor).
# Area under the curve: 0.5523
# 
# --------------------------------------------------- 
#   aSAH$gender: Female
# $s100b
# 
# Call:
#   roc.default(response = x$outcome, predictor = y)
# 
# Data: y in 50 controls (x$outcome Good) < 21 cases (x$outcome Poor).
# Area under the curve: 0.72
# 
# $ndka
# 
# Call:
#   roc.default(response = x$outcome, predictor = y)
# 
# Data: y in 50 controls (x$outcome Good) < 21 cases (x$outcome Poor).
# Area under the curve: 0.6671
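
If only the AUC values are needed rather than the full roc objects, the same split-then-apply idea can be sketched as below. This is a minimal sketch, not part of the original answer. The names predictors and auc_by_gender are made up for illustration, and quiet = TRUE assumes a pROC version that supports that argument (drop it if yours does not).

```r
library(pROC)
data(aSAH)

# Hypothetical helper name: the predictor columns to evaluate.
predictors <- c("s100b", "ndka")

# Split the data by gender, then compute one AUC per predictor in each subset.
# The outer sapply iterates over groups, the inner one over predictor columns.
auc_by_gender <- sapply(split(aSAH, aSAH$gender), function(d) {
  sapply(d[predictors], function(p) {
    r <- roc(response = d$outcome, predictor = p, quiet = TRUE)
    as.numeric(auc(r))
  })
})

auc_by_gender
```

The result is a numeric matrix with one row per predictor and one column per gender. Its values should agree with the "Area under the curve" lines in the by() output above.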
