C++ Armadillo and OpenMP: Parallelizing a summation of outer products by defining a reduction for Armadillo matrices
I am trying to use OpenMP to parallelize a for loop that sums Armadillo matrices. I have the following code:
#include <armadillo>
#include <omp.h>
int main()
{
    arma::mat A = arma::randu<arma::mat>(1000,700);
    arma::mat X = arma::zeros(700,700);
    arma::rowvec point = A.row(0);
    # pragma omp parallel for shared(A) reduction(+:X)
    for(unsigned int i = 0; i < A.n_rows; i++){
        arma::rowvec diff = point - A.row(i);
        X += diff.t() * diff; // Adding the matrices to X here
    }
}
I get this error:
[Legendre@localhost ~]$ g++ test2.cpp -o test2 -O2 -larmadillo -fopenmp
test2.cpp: In function ‘int main()’:
test2.cpp:11:52: error: user defined reduction not found for ‘X’
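I have read about defining reductions, but I could not find an example that uses Armadillo matrices. What is the best way to define a reduction for Armadillo matrices in my case?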
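The built-in reductions only work for built-in types (double, int, etc.), so you have to do the reduction yourself, which is simple. Accumulate each thread's result in a thread-local variable, then add that to the global result inside a critical section.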
#include <armadillo>
#include <omp.h>
int main()
{
    arma::mat A = arma::randu<arma::mat>(1000,700);
    arma::mat X = arma::zeros(700,700);
    arma::rowvec point = A.row(0);
    #pragma omp parallel shared(A)
    {
        // Each thread gets its own accumulator
        arma::mat X_local = arma::zeros(700,700);
        #pragma omp for
        for(unsigned int i = 0; i < A.n_rows; i++)
        {
            arma::rowvec diff = point - A.row(i);
            X_local += diff.t() * diff; // Accumulate into the thread-local matrix
        }
        // Only one thread at a time adds its partial sum to the shared X
        #pragma omp critical
        X += X_local;
    }
}
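For readers who want to try the thread-local accumulation pattern without installing Armadillo, here is a minimal sketch of the same idea. It assumes a row-major `std::vector<double>` as a stand-in for `arma::mat`, and the helper name `outer_sum` is made up for illustration; it is not part of Armadillo or OpenMP.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from Armadillo): sums the outer products
// d_i^T * d_i, where d_i = row 0 - row i of a row-major n_rows x n_cols
// matrix stored in `a`. Returns a row-major n_cols x n_cols matrix.
std::vector<double> outer_sum(const std::vector<double>& a,
                              std::size_t n_rows, std::size_t n_cols)
{
    assert(a.size() == n_rows * n_cols);
    std::vector<double> x(n_cols * n_cols, 0.0);  // shared result

    #pragma omp parallel
    {
        // Each thread accumulates into its own private buffer.
        std::vector<double> x_local(n_cols * n_cols, 0.0);

        #pragma omp for
        for (std::size_t i = 0; i < n_rows; ++i) {
            for (std::size_t r = 0; r < n_cols; ++r) {
                const double dr = a[r] - a[i * n_cols + r];
                for (std::size_t c = 0; c < n_cols; ++c) {
                    const double dc = a[c] - a[i * n_cols + c];
                    x_local[r * n_cols + c] += dr * dc;
                }
            }
        }

        // Merge the per-thread partial sums, one thread at a time.
        #pragma omp critical
        {
            for (std::size_t k = 0; k < x.size(); ++k)
                x[k] += x_local[k];
        }
    }
    return x;
}
```

If the file is compiled without `-fopenmp`, the pragmas are ignored and the function simply runs serially, which makes it easy to compare the two results.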
With a more recent OpenMP (user-defined reductions were added in version 4.0), you can also declare a user-defined reduction for your type.
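#include <armadillo>
#include <omp.h>

// Each thread's private copy starts as a zero matrix with the same
// dimensions as the original variable; partial results are combined with +=.
#pragma omp declare reduction( + : arma::mat : omp_out += omp_in ) \
    initializer( omp_priv = arma::zeros<arma::mat>(omp_orig.n_rows, omp_orig.n_cols) )

int main()
{
    arma::mat A = arma::randu<arma::mat>(1000,700);
    arma::mat X = arma::zeros(700,700);
    arma::rowvec point = A.row(0);
    // "parallel for" splits the loop iterations across the threads
    #pragma omp parallel for shared(A) reduction(+:X)
    for(unsigned int i = 0; i < A.n_rows; i++)
    {
        arma::rowvec diff = point - A.row(i);
        X += diff.t() * diff; // Each thread adds into its private copy of X
    }
}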
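The same `declare reduction` mechanism can be tried without Armadillo. The sketch below assumes a row-major `std::vector<double>` in place of `arma::mat`; the reduction identifier `vec_plus` and the helper `outer_sum_reduction` are hypothetical names chosen for this example. The initializer creates a zero-filled private copy, so the result does not depend on what the output vector held before the loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// User-defined reduction (hypothetical name vec_plus): element-wise sum of
// two std::vector<double>. Private copies start as zero-filled vectors of
// the same length as the original variable.
#pragma omp declare reduction(vec_plus : std::vector<double> : \
        std::transform(omp_out.begin(), omp_out.end(), omp_in.begin(), \
                       omp_out.begin(), std::plus<double>())) \
    initializer(omp_priv = std::vector<double>(omp_orig.size(), 0.0))

// Hypothetical helper: sums the outer products d_i^T * d_i, where
// d_i = row 0 - row i of a row-major n_rows x n_cols matrix `a`.
std::vector<double> outer_sum_reduction(const std::vector<double>& a,
                                        std::size_t n_rows, std::size_t n_cols)
{
    assert(a.size() == n_rows * n_cols);
    std::vector<double> x(n_cols * n_cols, 0.0);

    // Each thread updates its private copy of x; OpenMP combines the
    // copies with vec_plus when the loop finishes.
    #pragma omp parallel for reduction(vec_plus : x)
    for (std::size_t i = 0; i < n_rows; ++i) {
        for (std::size_t r = 0; r < n_cols; ++r) {
            const double dr = a[r] - a[i * n_cols + r];
            for (std::size_t c = 0; c < n_cols; ++c) {
                const double dc = a[c] - a[i * n_cols + c];
                x[r * n_cols + c] += dr * dc;
            }
        }
    }
    return x;
}
```

Compared with the critical-section version, the loop body writes to `x` directly and the merging is left to the OpenMP runtime, at the cost of requiring a compiler that supports OpenMP 4.0 or later.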