Bad performance of C++ for arithmetic operations
I wrote a very simple function that uses only arithmetic operations, once in R and once in C++ (via Rcpp). Comparing the two shows that my C++ implementation is much slower than my R code, which puzzles me.
C++ version:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector dn_cpp(NumericVector x, NumericVector sigma, NumericVector mu) {
  return 1/(sqrt(2*M_PI)*sigma) * exp(pow((x-mu),2)/(-2*pow(sigma, 2)) );
}
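The function is marked with // [[Rcpp::export]], so it can be compiled and made callable from R with Rcpp::sourceCpp(); a minimal sketch, assuming the code above is saved in a file named dn_cpp.cpp (the file name is just an example), looks like this:
library(Rcpp)
# Compile the C++ code above and export dn_cpp() into the R session.
# "dn_cpp.cpp" is only an example name for the file holding that code.
sourceCpp("dn_cpp.cpp")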
R version:
dn_r <- function(x, sigma, mu) {
  1/(sqrt(2*pi)*sigma) * exp((x-mu)^2/(-2*sigma^2) )
}
Comparing the two:
library(microbenchmark)
microbenchmark(
  dn_r(1,1,1),
  dn_cpp(1,1,1),
  times = 10000
)
# Unit: nanoseconds
# expr min lq mean median uq max neval
# dn_r(1, 1, 1) 509 567 667.1547 627 715.5 12690 10000
# dn_cpp(1, 1, 1) 1094 1242 1713.8351 1335 1479.0 3192711 10000
Can anyone explain why my C++ function performs so poorly?
As usual, 李哲源 and Dirk Eddelbuettel are completely right; for these sorts of operations there is absolutely no reason to expect the C++ version to be faster than the R version for the data you called the functions with. I am adding this answer only to demonstrate the suggestion made by 李哲源:
microbenchmark(
  dn_r(1,1,1),
  dn_cpp(1,1,1),
  times = 10000
)
Unit: microseconds
expr min lq mean median uq max neval
dn_r(1, 1, 1) 4.061 4.390 7.569112 4.869 5.175 26308.271 10000
dn_cpp(1, 1, 1) 8.362 9.025 12.148559 9.265 9.653 5834.242 10000
microbenchmark(
  dn_r(rnorm(1e3), 1, 1),
  dn_cpp(rnorm(1e3), 1, 1),
  times = 10000
)
Unit: microseconds
expr min lq mean median uq max
dn_r(rnorm(1000), 1, 1) 298.134 303.631 313.9681 305.453 308.7080 4111.497
dn_cpp(rnorm(1000), 1, 1) 199.949 205.571 214.6522 207.414 210.5015 3859.939
microbenchmark(
  dn_r(rnorm(1e5), 1, 1),
  dn_cpp(rnorm(1e5), 1, 1),
  times = 10000
)
Unit: milliseconds
expr min lq mean median uq max
dn_r(rnorm(1e+05), 1, 1) 28.60395 29.28238 30.85371 29.46879 29.95939 160.0769
dn_cpp(rnorm(1e+05), 1, 1) 18.89528 19.44148 20.10618 19.60433 19.75410 143.7250
For short vectors, the extra per-call overhead means the R version is faster, while with longer vectors you do get some performance gain from the C++ version.
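As a side note, both dn_r() and dn_cpp() simply evaluate the normal density, so a quick sanity check against R's built-in dnorm() might look like the following sketch (the values are only illustrative, all three arguments are given the same length, and the argument order differs: dn_r()/dn_cpp() take x, sigma, mu, while dnorm() takes x, mean, sd):
set.seed(1)                      # illustrative seed
x <- rnorm(10)
sigma <- rep(2, length(x))       # same length as x for the element-wise C++ version
mu    <- rep(1, length(x))
all.equal(dn_r(x, sigma, mu), dnorm(x, mean = 1, sd = 2))
all.equal(dn_cpp(x, sigma, mu), dnorm(x, mean = 1, sd = 2))
Both comparisons should return TRUE up to the usual floating-point tolerance.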