Using MPI_Bcast for
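I want to do the following in code written for execution under MPI:
- Each child process creates an instance of a class containing a matrix
- Only the matrix belonging to the first child process is filled
- The matrix is then sent to all child processes, so that they all hold the same matrix again, as modified by the first process
So I tried using MPI_Bcast in my code: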
#include <iostream>
#include <mpi.h>
#include <stdio.h>
#include <armadillo>

class arma_matrix_container{
public:
    arma::mat test_matrix;

    arma_matrix_container(const int size, const bool first_matrix)
    {
        if(first_matrix)
            test_matrix = arma::mat(size, size, arma::fill::ones) * 1000;
        else
            test_matrix = arma::mat(size, size, arma::fill::zeros);
    }

    void update_matrix(const arma::mat &update_matrix)
    {
        this->test_matrix = update_matrix;
    }

    double get_first_matrix_value(void)
    {
        double first_element = this->test_matrix(0, 0);
        return first_element;
    }
};

int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);
    const int matrix_size = 4;

    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    arma_matrix_container local_matrix(matrix_size, (world_rank == 0)?true:false);
    double *matrix_pointer = local_matrix.test_matrix.memptr();
    double first_element = local_matrix.get_first_matrix_value();

    // Get the name of the processor
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    // Print off a hello world message
    printf("Hello world from processor %s, rank %d"
           " out of %d processors,\nthe first element of the matrix is %f\n",
           processor_name, world_rank, world_size, first_element);

    if(world_rank == 0)
        std::cout << "\nBroadcasting data: \n";
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Bcast(matrix_pointer, matrix_size * matrix_size * sizeof(double), MPI_DOUBLE, 0, MPI_COMM_WORLD);
    if(world_rank == 0)
        std::cout << "Data was broadcasted\n\n";

    printf("Hello world from processor %s, rank %d"
           " out of %d processors,\nthe first element of the matrix is %f\n",
           processor_name, world_rank, world_size, first_element);

    // Finalize the MPI environment.
    MPI_Finalize();
}
Now, if I remove the MPI_Bcast call, I get (as expected):
Hello world from processor MPI-PC, rank 0 out of 4 processors,
the first element of the matrix is 1000.000000
Broadcasting data:
Hello world from processor MPI-PC, rank 3 out of 4 processors,
the first element of the matrix is 0.000000
Hello world from processor MPI-PC, rank 1 out of 4 processors,
the first element of the matrix is 0.000000
Hello world from processor MPI-PC, rank 2 out of 4 processors,
the first element of the matrix is 0.000000
Hello world from processor MPI-PC, rank 3 out of 4 processors,
the first element of the matrix is 0.000000
Hello world from processor MPI-PC, rank 1 out of 4 processors,
the first element of the matrix is 0.000000
Data was broadcasted
Hello world from processor MPI-PC, rank 0 out of 4 processors,
the first element of the matrix is 1000.000000
Hello world from processor MPI-PC, rank 2 out of 4 processors,
the first element of the matrix is 0.000000
With the MPI_Bcast call included, I get a segmentation fault and the following output:
Hello world from processor MPI-PC, rank 0 out of 4 processors,
the first element of the matrix is 1000.000000
Broadcasting data:
Hello world from processor MPI-PC, rank 1 out of 4 processors,
the first element of the matrix is 0.000000
Hello world from processor MPI-PC, rank 2 out of 4 processors,
the first element of the matrix is 0.000000
Hello world from processor MPI-PC, rank 3 out of 4 processors,
the first element of the matrix is 0.000000
Data was broadcasted
Hello world from processor MPI-PC, rank 0 out of 4 processors,
the first element of the matrix is 1000.000000
Data was broadcasted
Hello world from processor MPI-PC, rank 0 out of 4 processors,
the first element of the matrix is 1000.000000
Data was broadcasted
Hello world from processor MPI-PC, rank 0 out of 4 processors,
the first element of the matrix is 1000.000000
Data was broadcasted
Hello world from processor MPI-PC, rank 0 out of 4 processors,
the first element of the matrix is 1000.000000
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 3 with PID 0 on node MPI-PC exited on signal 11 (Segmentation fault).
Where did I forget to initialize the data correctly?
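The second argument of MPI_Bcast (int count) is a number of elements, so it must not include the size of a data element. The element size is derived from the third argument (MPI_Datatype datatype). Multiplying by sizeof(double) makes MPI transfer sizeof(double) times as many elements as the matrix holds, which overruns the buffer and explains the segmentation fault.
So you should call it like this: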
MPI_Bcast(matrix_pointer, matrix_size * matrix_size, MPI_DOUBLE, 0, MPI_COMM_WORLD);
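To make the size mismatch concrete, here is a small sketch of the arithmetic, deliberately written without MPI so it compiles anywhere. The helper names (element_count, bytes_touched, fits_in_matrix) are hypothetical and only illustrate the counting; they are not part of MPI or Armadillo.

```cpp
#include <cstddef>

// Number of elements in an n x n matrix. This is the value that
// belongs in the `count` argument when the datatype is MPI_DOUBLE.
constexpr std::size_t element_count(std::size_t n) {
    return n * n;
}

// Bytes covered by a transfer of `count` elements of `elem_size` bytes each.
constexpr std::size_t bytes_touched(std::size_t count, std::size_t elem_size) {
    return count * elem_size;
}

// True when a transfer of `count` doubles fits into an n x n matrix of doubles.
constexpr bool fits_in_matrix(std::size_t count, std::size_t n) {
    return count <= element_count(n);
}

// Correct call: count = 4 * 4 = 16 elements, exactly the buffer size.
static_assert(fits_in_matrix(element_count(4), 4),
              "element count fits the buffer");

// Call from the question: count = 4 * 4 * sizeof(double) elements,
// which is more than the 16 doubles the matrix actually holds.
static_assert(!fits_in_matrix(element_count(4) * sizeof(double), 4),
              "byte count used as element count overruns the buffer");
```

On a platform where sizeof(double) is 8 (an assumption, though a common one), the call from the question asks for 128 doubles, i.e. 1024 bytes, while the 4 x 4 matrix only provides 128 bytes. Note also that the program reads first_element before the broadcast and prints that same variable afterwards, so ranks 1 to 3 will keep printing 0.000000 unless you call get_first_matrix_value() again after MPI_Bcast.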