Random generation code translated from R fails in C++
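I am writing code that implements the random generation algorithm proposed by Christian Robert for sampling from the tails of a normal distribution. The problem is that the code works fine in R but fails after being translated to C++. I can't see any reason for this, and I would be grateful if someone could explain what is wrong and why.
Note that the code below is far from elegant or efficient; it has been simplified to make a reproducible example.
Here is the function in R: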
rtnormR <- function(mean = 0, sd = 1, lower = -Inf, upper = Inf) {
  lower <- (lower - mean) / sd
  upper <- (upper - mean) / sd
  if (lower < upper && lower >= 0) {
    while (TRUE) {
      astar <- (lower + sqrt(lower^2 + 4)) / 2
      z <- rexp(1, astar) + lower
      u <- runif(1)
      if ((u <= exp(-(z - astar)^2 / 2)) && (z <= upper)) break
    }
  } else {
    z <- NaN
  }
  z*sd + mean
}
And here is the C++ version:
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
double rtnormCpp(double mean, double sd, double lower, double upper) {
  double z_lower = (lower - mean) / sd;
  double z_upper = (upper - mean) / sd;
  bool stop = false;
  double astar, z, u;
  if (z_lower < z_upper && z_lower >= 0) {
    while (!stop) {
      astar = (z_lower + std::sqrt(std::pow(z_lower, 2) + 4)) / 2;
      z = R::exp_rand() * astar + z_lower;
      u = R::unif_rand();
      if ((u <= std::exp(-std::pow(z-astar, 2) / 2)) && (z <= z_upper))
        stop = true;
    }
  } else {
    z = NAN;
  }
  return z*sd + mean;
}
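Now compare the samples obtained with the two functions (each histogram is overlaid with the density from the dtnorm function in the msm package):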
xx = seq(-6, 6, by = 0.001)
hist(replicate(5000, rtnormR(mean = 0, sd = 1, lower = 3, upper = 5)), freq= FALSE, ylab = "", xlab = "", main = "rtnormR")
lines(xx, msm::dtnorm(xx, mean = 0, sd = 1, lower = 3, upper = 5), col = "red")
hist(replicate(5000, rtnormCpp(mean = 0, sd = 1, lower = 3, upper = 5)), freq= FALSE, ylab = "", xlab = "", main = "rtnormCpp")
lines(xx, msm::dtnorm(xx, mean = 0, sd = 1, lower = 3, upper = 5), col = "red")
As you can see, rtnormCpp returns biased samples. Do you know why?
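An exponential distribution can be parameterized by either scale or rate, but rexp() is parameterized by rate, so rexp(1, astar) has mean 1/astar, not astar. R::exp_rand() draws a standard exponential (rate 1), so multiplying it by astar gives a variate with mean astar, which is not what the R code does.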
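To make the rate versus scale point concrete, here is a minimal sketch in plain C++ that uses the standard <random> header rather than R's generator. The helper names exp_with_rate and mean_exp_with_rate are hypothetical and exist only for this illustration. The sketch divides a standard exponential draw by the rate, which should give a sample mean close to 1/rate.

```cpp
#include <cassert>
#include <random>

// Hypothetical helper: convert a standard exponential draw (rate 1, mean 1)
// into a draw with the given rate (mean 1/rate) by dividing by the rate.
double exp_with_rate(double standard_draw, double rate) {
  assert(rate > 0.0);
  return standard_draw / rate;
}

// Sample mean of n rescaled draws, using a fixed seed for repeatability.
// Uses std::mt19937, not R's RNG, so the stream will not match R's.
double mean_exp_with_rate(double rate, int n, unsigned seed) {
  std::mt19937 gen(seed);
  std::exponential_distribution<double> std_exp(1.0);  // rate 1
  double sum = 0.0;
  for (int i = 0; i < n; ++i) {
    sum += exp_with_rate(std_exp(gen), rate);
  }
  return sum / n;
}
```

Calling mean_exp_with_rate(3.0, 200000, 42) should return a value near 1/3. Multiplying by the rate instead of dividing would give a mean near 3, which is the mismatch in the C++ translation.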
If you change the relevant line of the C++ code to
z = R::exp_rand() / astar + z_lower;
everything seems to work.
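For checking the corrected sampler outside of R, the following is a sketch of the same accept/reject loop on the standardized scale (mean 0, sd 1), assuming 0 <= lower < upper. It swaps R's generators for the standard <random> header, and the function name rtnorm_tail is an illustrative assumption, not part of Rcpp or of the original code.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Sketch: draw from a standard normal truncated to [lower, upper],
// assuming 0 <= lower < upper. Proposal is lower + Exp(rate = astar).
double rtnorm_tail(double lower, double upper, std::mt19937& gen) {
  assert(lower >= 0.0 && lower < upper);
  std::exponential_distribution<double> std_exp(1.0);      // rate 1
  std::uniform_real_distribution<double> unif(0.0, 1.0);
  const double astar = (lower + std::sqrt(lower * lower + 4.0)) / 2.0;
  while (true) {
    // Divide by astar so the proposal has rate astar (mean 1/astar).
    const double z = std_exp(gen) / astar + lower;
    const double u = unif(gen);
    const double d = z - astar;
    if (u <= std::exp(-d * d / 2.0) && z <= upper) {
      return z;
    }
  }
}
```

With lower = 3 and upper = 5, the theoretical mean of the truncated standard normal is about 3.28, so the average of a large number of draws should land close to that value.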