MPI calloc Causes Segmentation Fault

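I wrote a program that computes the sum of an array's elements with MPI. The root and the workers each sum one portion of the array, and the workers then send their partial sums to the root. With a statically sized array there is no problem, but when I allocate the arrays with calloc I get a segmentation fault. The source code is below: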

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#define tag1 1 /* send from root to workers */
#define tag2 2 /* send from workers to root */
#define root 0
#define n_data 12

int main(int argc, char *argv[]) 
{ 
    int total_sum, partial_sum;
    int my_id, i, n_procs, n_portion;

    MPI_Init(&argc, &argv);
    MPI_Status status;
    MPI_Comm_rank(MPI_COMM_WORLD, &my_id);
    MPI_Comm_size(MPI_COMM_WORLD, &n_procs);
    n_portion=n_data/n_procs;

    int *array = (int *)calloc(n_data, sizeof(int));
    int *local  = (int *)calloc(n_portion, sizeof(int));

    if(my_id == root) { 

        /* initialize array */
        for(i = 0; i < n_data; i++) 
            array[i]=i;

        /* send a portion of the array to each worker */
        for(i= 1; i < n_procs; i++) 
            MPI_Send( &array[i*n_portion], n_portion, MPI_INT,i, tag1, MPI_COMM_WORLD); 

        /* calculate the sum of my portion */
        for(i = 0; i < n_portion; i++)
            total_sum += array[i];

        /* collect the partial sums from workers */
        for(i= 1; i < n_procs; i++) {
            MPI_Recv( &partial_sum, 1, MPI_INT, MPI_ANY_SOURCE,tag2, MPI_COMM_WORLD, &status);
            total_sum += partial_sum; 
        }

        printf("The total sum is: %d\n", total_sum);
    }
    else { /* I am a worker, receive data from root */

        MPI_Recv( &local, n_portion, MPI_INT, root, tag1, MPI_COMM_WORLD, &status);

        /* Calculate the sum of my portion of the array */
        partial_sum = 0;
        for(i = 0; i < n_portion; i++)
            partial_sum += local[i];

        /* send my partial sum to the root */
        MPI_Send( &partial_sum, 1, MPI_INT, root, tag2, MPI_COMM_WORLD);
    }

    MPI_Finalize(); 
    return 0;
}

The error I get is:

-bash-4.1$ mpirun -np 3 distApprox
--------------------------------------------------------------------------
mpirun noticed that process rank 2 with PID 110834 on node levrek1 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

Thanks for your help.

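I would say the problem is the MPI_Recv on the worker side. You should pass 'local' instead of '&local' as the buffer. MPI expects the "initial address of the receive buffer" (see the MPI standard), and for a dynamically allocated array that is the pointer variable itself. '&local' is the address of the pointer variable, so the received integers overwrite the pointer (and whatever sits next to it on the stack), and the later read of local[i] dereferences garbage. With a statically sized array, 'local' and '&local' refer to the same address, which is why that version worked.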

MPI_Recv( local, n_portion, MPI_INT, root, tag1, MPI_COMM_WORLD, &status);

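You may also want to initialize 'total_sum' to 0 on the root. It is an uninitialized local variable, so the '+=' starts from an indeterminate value. After that your code should run.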

Edit: I just saw that Martin Zabel already pointed this out in the comments.