Rcpp Function slower than Rf_eval

Question

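I have been working on a package that uses Rcpp to apply arbitrary R code over a group of large medical imaging files. I noticed that my Rcpp implementation is considerably slower than the original pure C version. I traced the difference to calling the function via Function versus the original Rf_eval. My question is: why is there a close to 4x performance degradation, and is there a way to speed up the function call so that its performance is closer to Rf_eval?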

Example:

library(Rcpp)
library(inline)
library(microbenchmark)

cpp_fun1 <-
  '
Rcpp::List lots_of_calls(Function fun, NumericVector vec){
  Rcpp::List output(1000);
  for(int i = 0; i < 1000; ++i){
    output[i] = fun(NumericVector(vec));
  }
  return output;
}
'

cpp_fun2 <-
  '
Rcpp::List lots_of_calls2(SEXP fun, SEXP env){
  Rcpp::List output(1000);
  for(int i = 0; i < 1000; ++i){
    output[i] = Rf_eval(fun, env);
  }
  return output;
}
'

lots_of_calls <- cppFunction(cpp_fun1)
lots_of_calls2 <- cppFunction(cpp_fun2)

microbenchmark(lots_of_calls(mean, 1:1000),
               lots_of_calls2(quote(mean(1:1000)), .GlobalEnv))

Results:

Unit: milliseconds
                                            expr      min       lq     mean   median       uq      max neval
                     lots_of_calls(mean, 1:1000) 38.23032 38.80177 40.84901 39.29197 41.62786 54.07380   100
 lots_of_calls2(quote(mean(1:1000)), .GlobalEnv) 10.53133 10.71938 11.08735 10.83436 11.03759 18.08466   100

Answer 1

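Rcpp is great because it makes things look absurdly clean to the programmer. That cleanliness has a cost, in the form of templated responses and a set of assumptions that weigh down the execution time. But such is the case with a generalized versus a specific code setup.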

Take, for instance, the call route for an Rcpp::Function. The initial construction and then the outside call to a modified version of Rf_eval require a special Rcpp-specific eval function given in Rcpp_eval.h. In turn, this function is wrapped in protections, via an associated Shield, to guard against a function error when calling into R. And so on...

In comparison, Rf_eval has neither. If it fails, you will be up the creek without a paddle. (Unless, of course, you implement error catching for it via R_tryEval.)

With that being said, the best way to speed up the calculation is simply to write everything the computation needs in C++.
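As a rough sketch of what that advice means for the benchmark in the question, the mean can be computed directly in C++ so that none of the 1000 iterations calls back into the R interpreter. The names mean_cpp and lots_of_means are hypothetical, and std::vector<double> stands in for Rcpp::NumericVector purely so the snippet compiles without R. It uses a plain single-pass sum, whereas R's own mean() takes extra care with floating-point accuracy, so treat it as an illustration rather than a drop-in replacement.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: arithmetic mean computed entirely in C++.
// std::vector<double> stands in for Rcpp::NumericVector so the sketch
// compiles without R. The caller must pass a non-empty vector.
double mean_cpp(const std::vector<double>& x) {
  double total = std::accumulate(x.begin(), x.end(), 0.0);
  return total / static_cast<double>(x.size());
}

// Hypothetical counterpart of the benchmark loop: the same 1000
// evaluations, but no iteration crosses back into the R interpreter.
std::vector<double> lots_of_means(const std::vector<double>& x) {
  std::vector<double> output(1000);
  for (std::size_t i = 0; i < output.size(); ++i) {
    output[i] = mean_cpp(x);
  }
  return output;
}
```

Inside an Rcpp function the same loop body would operate on a NumericVector instead. The trade-off is that an arbitrary user-supplied R function can no longer be plugged in, which is the generality that Function pays for.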

Answer 2

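Besides the points made by @coatless, you aren't even comparing apples to apples. Your Rf_eval example does not pass the vector to the function and, more importantly, it plays games with the function via quote().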

In short, it is all a little silly.

Below is a more complete example that also uses the sugar function mean().

#include <Rcpp.h>
using namespace Rcpp;

// Calls the R function through Rcpp::Function, i.e. via Rcpp's protected evaluation.
// [[Rcpp::export]]
List callFun(Function fun, NumericVector vec) {
  List output(1000);
  for(int i = 0; i < 1000; ++i){
    output[i] = fun(NumericVector(vec));
  }
  return output;
}

// Evaluates a pre-built call directly with Rf_eval, with no error protection.
// [[Rcpp::export]]
List callRfEval(SEXP fun, SEXP env){
  List output(1000);
  for(int i = 0; i < 1000; ++i){
    output[i] = Rf_eval(fun, env);
  }
  return output;
}

// Computes the mean with Rcpp sugar, without calling back into R.
// [[Rcpp::export]]
List callSugar(NumericVector vec) {
  List output(1000);
  for(int i = 0; i < 1000; ++i){
    double d = mean(vec);
    output[i] = d;
  }
  return output;
}

/*** R
library(microbenchmark)
microbenchmark(callFun(mean, 1:1000),   
               callRfEval(quote(mean(1:1000)), .GlobalEnv),
               callSugar(1:1000))
*/

You can sourceCpp() this:

R> sourceCpp("/tmp/ch.cpp")

R> library(microbenchmark)

R> microbenchmark(callFun(mean, 1:1000), 
+                callRfEval(quote(mean(1:1000)), .GlobalEnv),
+                callSugar(1:1000))
Unit: milliseconds
                                        expr      min       lq     mean   median       uq       max neval
                       callFun(mean, 1:1000) 14.87451 15.54385 18.57635 17.78990 18.29127 114.77153   100
 callRfEval(quote(mean(1:1000)), .GlobalEnv)  3.35954  3.57554  3.97380  3.75122  4.16450   6.29339   100
                           callSugar(1:1000)  1.50061  1.50827  1.62204  1.51518  1.76683   1.84513   100
R>
