Splitting and Passing Array Blocks in MPI
I'm new to MPI, and I'm trying to understand what it's about by writing a simple C program. All I want to do is split an array and send the blocks to N processors. Each processor then finds the local minimum in its block, and the program (at the root rank or somewhere else) finds the global minimum.

I looked into the MPI_Send, MPI_Isend, and MPI_Bcast functions, but I'm a bit confused about where to use one over another. I need some hints about the general structure of my program:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define N 9 // array size
int A[N] = {0,2,1,5,4,3,7,6,8}; // this is a dummy array

int main(int argc, char *argv[]) {
    int i, k = 0, size, rank, source = 0, dest = 1, count;
    int tag = 1234;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    count = N/(size-1); // think size = 4 for this example
    int *tempArray = malloc(count * sizeof(int));
    int *localMins = malloc((size-1) * sizeof(int));

    if (rank == 0) {
        for (i = 0; i < size; i += count)
        {
            // Is it better to use MPI_Isend or MPI_Bcast here?
            MPI_Send(&A[i], count, MPI_INT, dest, tag, MPI_COMM_WORLD);
            printf("P0 sent a %d elements to P%d.\n", count, dest);
            dest++;
        }
    }
    else {
        for (i = 0; i < size; i += count)
        {
            MPI_Recv(tempArray, count, MPI_INT, 0, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            localMins[k] = findMin(tempArray, count);
            printf("Min for P%d is %d.\n", rank, localMins[k]);
            k++;
        }
    }

    MPI_Finalize();

    int gMin = findMin(localMins, (size-1)); // where should I assign this
    printf("Global min: %d\n", gMin); // and where should I print the results?
    return 0;
}
There are probably several errors in my code; sorry that I can't pinpoint the exact problem. Thanks for any advice.
There are several problems with your code (as you've already noted), and, as some commenters have mentioned, there are other ways to do what you're trying to do with MPI calls.

However, I'm going to repurpose your code, changing as little as possible, in order to show you what's going on.
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define N 9 // array size
int A[N] = {0,2,1,5,4,3,7,6,8}; // this is a dummy array that should only be initialized on rank == ROOT

// findMin as assumed in your code: smallest of the first len elements
int findMin(const int *array, int len) {
    int min = array[0];
    for (int i = 1; i < len; ++i)
        if (array[i] < min)
            min = array[i];
    return min;
}

int main(int argc, char *argv[]) {
    int size;
    int rank;
    const int VERY_LARGE_INT = 999999;
    const int ROOT = 0; // the master rank that holds A to begin with
    int tag = 1234;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size); // think size = 4 for this example
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /*
       How many numbers you send from ROOT to each other rank.
       Note that for this implementation to work, (size-1) must divide N.
    */
    int count = N/(size-1);
    int *localArray = malloc(count * sizeof(int));
    int localMin;  // minimum computed on rank i
    int globalMin; // will only be valid on rank == ROOT

    /* rank == ROOT sends a portion of A to every other rank */
    if (rank == ROOT) {
        for (int dest = 1; dest < size; ++dest)
        {
            // If you are sending information from one rank to another, you use MPI_Send or MPI_Isend.
            // If you are sending information from one rank to ALL others, then every rank must call MPI_Bcast (similar to MPI_Reduce below)
            MPI_Send(&A[(dest-1)*count], count, MPI_INT, dest, tag, MPI_COMM_WORLD);
            printf("P0 sent %d elements to P%d.\n", count, dest);
        }
        localMin = VERY_LARGE_INT; // needed for MPI_Reduce below
    }
    /* Every other rank receives exactly one message: from ROOT into its local array */
    else {
        MPI_Recv(localArray, count, MPI_INT, ROOT, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        localMin = findMin(localArray, count);
        printf("Min for P%d is %d.\n", rank, localMin);
    }

    /*
       At this point, every rank in the communicator has valid information stored in localMin.
       Use MPI_Reduce to find the global min among all ranks.
       Store this single globalMin on rank == ROOT.
    */
    MPI_Reduce(&localMin, &globalMin, 1, MPI_INT, MPI_MIN, ROOT, MPI_COMM_WORLD);

    if (rank == ROOT)
        printf("Global min: %d\n", globalMin);

    free(localArray);

    /* The last thing you do is finalize MPI. Nothing should come after. */
    MPI_Finalize();
    return 0;
}
Full disclosure: I have not tested this code, but aside from minor typos it should work.

Look at this code and see if you can understand why I moved your MPI_Send and MPI_Recv calls around. The key is that every rank executes every line of code you supply. So inside your else branch, there should not be a for loop of receives: each non-root rank receives exactly one message.
Additionally, MPI collectives (such as MPI_Reduce and MPI_Bcast) must be called by every rank in the communicator. The "source" and "destination" ranks for these calls are either part of the function's input parameters or implied by the collective itself.
Finally, a little homework for you: can you see why this is not a good implementation for finding the global minimum of array A? Hint: what is rank == ROOT doing after it finishes its MPI_Sends? How could you split this problem up better so that every rank performs a more even share of the work?