Rcpp function not returning desired NumericVector

Question

I am trying to rewrite an R function (Fourier smoothing) in Rcpp to speed up the computation. My Rcpp function does not return the desired values.

I have a vector:

    x = c(6262, 5862.5, 5463, 5408, 5353, 5687, 5901, 6245, 5864, 5483, 5692, 5708.5, 5054.75, 5072.375, 5090, 5462, 4939, 5248.5, 5558, 5226, 5125, 5006, 4887, 5334.5, 5782, 5501, 5524.5, 5548)

My Rcpp function:

cppFunction("
     NumericVector smo(NumericVector x){
        int n = x.size();
        NumericVector realpart1(5);
        NumericVector imagpart1(5);
        NumericVector sm1(n);
        for (int i = 0; i<5; i++){
            double realpart = 0;
            double imagpart = 0;
            for (int j = 0; j<n; j++) {
                realpart = realpart + 0.07142857*x[j]*cos(2 * 3.142857 * (i+1-1) * (j+2)/28);
                imagpart = imagpart + 0.07142857 * x[j] * sin(2 * 3.142857 * (i+1 - 1) * (j+2) /28);
            }
            realpart1[i]=realpart;
            imagpart1[i] = imagpart;
        }

        for (int j = 0; j<n; j++){
         double sm = realpart1[0]/2;

        for (int i=0; i<5; i++){
            sm = sm + realpart1[i]*cos(2 * 3.142857 * (i+1 - 1) * (j+2) / 28) + imagpart1[i]*sin(2 * 3.142857 * (i+1-1) * (j+2) / 28);
        }
         sm1[j] = sm;

         }
        return sm1; 
}
")

The output of the function smo is as follows:

16804.81 16674.97 16518.58 16425.55 16453.36 16594.95 16780.77 16914.47
16922.49 16789.76 16563.30 16324.47 16147.96 16070.53 16083.19 16145.65
16210.29 16241.81 16226.19 16170.64 16099.70 16049.52 16058.45 16152.36
16328.20 16545.64 16736.58 16833.36

If I subtract the value 10949.12 from the output of smo, I get the desired result shown below.

Desired output:

5855.689 5725.846 5569.459 5476.428 5504.237 5645.833 5831.647 5965.351
5973.369 5840.640 5614.181 5375.346 5198.844 5121.412 5134.069 5196.534
5261.174 5292.694 5277.066 5221.517 5150.584 5100.398 5109.330 5203.243
5379.080 5596.524 5787.462 5884.235

The value 10949.12 is the first value of the NumericVector realpart1.

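I have not been able to solve this, as this is my first attempt at Rcpp. I have checked the loops many times. The calculation works correctly up to and including the loop that fills realpart1 and imagpart1. Something is wrong in the second loop, but I cannot figure out why the value 10949.12 is being added to the output.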

I would greatly appreciate any help with this.

Equivalent R code:

har = 4
pi = 22/7
realpart1 = c()
imagpart1 = c()
for (p in 1:(har+1)){
    realpart = 0
    imagpart = 0
    for (i in 1:length(x)){
        realpart = realpart + (2 /length(x)) * x[i] * cos(2 * pi * (p - 1) * (i+1) / length(x))
        imagpart = imagpart + (2 / length(x)) * x[i] * sin(2 * pi * (p - 1) * (i+1) / length(x))
    }
    realpart1 = c(realpart1,realpart)
    imagpart1 = c(imagpart1,imagpart)
    #print(realpart)
    #print(imagpart)
}   
sm1 = c()
for (i in 1:length(x)){

    sm = realpart1[1]/2

    for (p in 2:(har+1)){
        sm = sm + realpart1[p]*cos(2 * pi * (p - 1) * (i+1) / length(x))+ imagpart1[p]*sin(2 * pi * (p - 1) * (i+1) / length(x))
    }
    sm1 = c(sm1,sm)
}

Answer 1

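The limits of the nested for loop inside the second for loop differ. In R it runs from 2 to 5, whereas in C++ it runs from 0 to 4. In C++ it should run from 1 to 4 to be equivalent to the R code. With i = 0 the cosine is 1 and the sine is 0, so the extra iteration adds realpart1[0] (the 10949.12 you observed) to every element.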
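As a minimal sketch of that fix, here is the smoother with the inner reconstruction loop starting at 1. To keep it compilable without R, it uses std::vector in place of NumericVector. The name smo_fixed and the har and pi parameters are illustrative assumptions, and pi defaults to 22/7 only to mirror the question's code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Standalone sketch of the corrected smoother. std::vector stands in for
// Rcpp::NumericVector; smo_fixed, har and pi are hypothetical names.
// pi defaults to 22/7 only because the question's code uses that value.
std::vector<double> smo_fixed(const std::vector<double>& x, int har = 4,
                              double pi = 22.0 / 7.0) {
    const std::size_t n = x.size();
    std::vector<double> re(har + 1, 0.0), im(har + 1, 0.0);

    // Fourier coefficients for harmonics p = 0..har (R's p - 1).
    for (int p = 0; p <= har; ++p) {
        for (std::size_t j = 0; j < n; ++j) {
            // (j + 2) matches R's (i + 1) with one-based i.
            const double phase = 2.0 * pi * p * (j + 2) / n;
            re[p] += 2.0 / n * x[j] * std::cos(phase);
            im[p] += 2.0 / n * x[j] * std::sin(phase);
        }
    }

    std::vector<double> out(n);
    for (std::size_t j = 0; j < n; ++j) {
        double sm = re[0] / 2.0;          // constant term, counted once
        for (int p = 1; p <= har; ++p) {  // start at 1, not 0
            const double phase = 2.0 * pi * p * (j + 2) / n;
            sm += re[p] * std::cos(phase) + im[p] * std::sin(phase);
        }
        out[j] = sm;
    }
    return out;
}
```

Called on the vector from the question, this should agree with the expected output up to rounding. In the Rcpp version the only change needed is the lower bound of the inner loop in the second block.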

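However, you can probably make the R code faster by not growing vectors dynamically inside a loop. In a for loop this is almost never necessary, because you know the size of the result vector in advance and can use, e.g., realpart <- numeric(length = har + 1) and realpart[p] <- ....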

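In this case, though, one can go a step further and formulate the problem in terms of matrices and linear algebra, avoiding (explicit) loops altogether: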

x <- c(6262, 5862.5, 5463, 5408, 5353, 5687, 5901, 6245, 5864, 5483, 5692, 5708.5,
       5054.75, 5072.375, 5090, 5462, 4939, 5248.5, 5558, 5226, 5125, 5006, 4887,
       5334.5, 5782, 5501, 5524.5, 5548)
fourier_smooth <- function(x, har) {
    pi <- 22 / 7 # this should be removed!
    phase <- 2 * pi * outer(seq_len(har + 1) - 1, seq_along(x) + 1) / length(x) 
    real <- 2 / length(x) * cos(phase) %*% x
    imag <- 2 / length(x) * sin(phase) %*% x
    y <- t(cos(phase)) %*% real + t(sin(phase)) %*% imag
    as.numeric(y - real[1]/2)
}
fourier_smooth(x, 4)
#>  [1] 5855.695 5725.852 5569.463 5476.432 5504.240 5645.837 5831.651
#>  [8] 5965.355 5973.373 5840.644 5614.185 5375.350 5198.848 5121.417
#> [15] 5134.073 5196.538 5261.177 5292.697 5277.070 5221.522 5150.588
#> [22] 5100.402 5109.334 5203.247 5379.084 5596.528 5787.468 5884.242

Created on 2019-08-13 by the reprex package (v0.3.0)

Note that I included the redefinition of pi only to reproduce your desired result. To get correct results, the true value of pi should be used.
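To give a rough sense of how much the 22/7 approximation matters, the following sketch measures its error against the library value of pi and the resulting phase error for a given harmonic and sample offset. The helper names pi_approx_error and phase_error are hypothetical and used only for illustration.

```cpp
#include <cmath>

// Absolute error of 22/7 relative to the library value of pi.
double pi_approx_error() {
    const double pi = std::acos(-1.0);
    return std::fabs(22.0 / 7.0 - pi);
}

// Phase error (in radians) that 22/7 introduces in 2 * pi * p * k / n,
// for harmonic p, sample offset k and series length n.
double phase_error(int p, int k, int n) {
    const double pi = std::acos(-1.0);
    return 2.0 * (22.0 / 7.0 - pi) * p * k / n;
}
```

The error in pi itself is a little over 0.00126, and the phase error grows linearly with both the harmonic and the sample offset.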

However, using R's built-in FFT is even faster:

x <- c(6262, 5862.5, 5463, 5408, 5353, 5687, 5901, 6245, 5864, 5483, 5692, 5708.5,
       5054.75, 5072.375, 5090, 5462, 4939, 5248.5, 5558, 5226, 5125, 5006, 4887,
       5334.5, 5782, 5501, 5524.5, 5548)
fourier_smooth <- function(x, har) {
    phase <- 2 * pi * outer(seq_len(har + 1) - 1, seq_along(x) - 1) / length(x) 
    real <- 2 / length(x) * cos(phase) %*% x
    imag <- 2 / length(x) * sin(phase) %*% x
    y <- t(cos(phase)) %*% real + t(sin(phase)) %*% imag
    as.numeric(y - real[1]/2)
}

fourier_smooth2 <- function(x, har) {
    y <- fft(x, inverse = TRUE) / length(x)
    y[(har+2):(length(x)-har)] <- 0 # filter higher harmonics while keeping the symmetry for real input
    Re(fft(y)) # result is already real
}

bench::mark(fourier_smooth(x, 4), fourier_smooth2(x, 4))[1:5]
#> # A tibble: 2 x 5
#>   expression                 min   median `itr/sec` mem_alloc
#>   <bch:expr>            <bch:tm> <bch:tm>     <dbl> <bch:byt>
#> 1 fourier_smooth(x, 4)   31.66µs  34.97µs    26342.    4.13MB
#> 2 fourier_smooth2(x, 4)   4.82µs   5.49µs   152845.    3.98KB

Created on 2019-08-13 by the reprex package (v0.3.0)

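I removed the redefinition of pi to make sure the two results are equal.
The filtering is a bit tricky, but I am not aware of a function tailored to real-valued sequences.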
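The tricky part of the filtering is the bin selection: for real input, each harmonic k has a mirror bin n - k, and both must be kept. The sketch below illustrates this with a naive O(n^2) DFT in standalone C++, so it is not an FFT and not meant to be fast. The name dft_smooth is hypothetical. It applies a forward transform and then an inverse one, the reverse of the order in fourier_smooth2, which should give the same result here because the mask is symmetric.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive DFT low-pass sketch: keep harmonics 0..har and their mirror bins
// n-har..n-1, zero everything in between, then transform back.
std::vector<double> dft_smooth(const std::vector<double>& x, std::size_t har) {
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<std::complex<double>> spec(n);  // zero-initialised

    for (std::size_t k = 0; k < n; ++k) {
        // Zero-based bins har+1..n-har-1 correspond to R's
        // y[(har+2):(length(x)-har)], which are set to 0.
        const bool keep = (k <= har) || (k + har >= n);
        if (!keep) continue;  // spec[k] stays 0
        for (std::size_t j = 0; j < n; ++j) {
            const double ang = -2.0 * pi * static_cast<double>(k * j) / n;
            spec[k] += x[j] * std::polar(1.0, ang);
        }
    }

    std::vector<double> out(n);
    for (std::size_t j = 0; j < n; ++j) {
        std::complex<double> acc(0.0, 0.0);
        for (std::size_t k = 0; k < n; ++k) {
            const double ang = 2.0 * pi * static_cast<double>(k * j) / n;
            acc += spec[k] * std::polar(1.0, ang);
        }
        out[j] = acc.real() / n;  // imaginary part cancels for real input
    }
    return out;
}
```

For a signal made of a constant, a first harmonic and a third harmonic, calling dft_smooth with har = 1 should return only the constant plus the first harmonic.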
