Openmp c++: error: collapsed loops not perfectly nested

Question

我有以下串行代码，我想将其并行化。我了解在嵌套循环中使用折叠子句时，重要的是在 for(i) 循环之前和之后没有代码，因为这是不允许的。那么我如何将嵌套的 for 循环与 if 语句并行，如下所示：

void foo2D(double Lx, double Ly, double KX[], double KY[], double ksqu[], double ninvksqu[], int FourierMeshType){
    
// Make a variable to account for way wavenumbers are set in FFTW3. 
    
int k_counter = 0;      
    
// kx corresponds to modes [0:n/2-1 , -n/2:-1]. This is why there is an additional step, just due to FFTW3's structuring
    #pragma omp parallel for collapse(2) 
    for(int i = 0; i < nx ; i++){
        for(int j = 0; j < nyk; j++){           
            if (i < nx/2){ // i<nx/2 --> 0 to 127
                KX[j + nyk*i] =2.*M_PI*i/Lx; //K = 2pi/L* [0:nx/2-1 , -n/2:-1]' : creates a column vector with nx/2 elements starting from 0 ( from 0 to 127 then -128 to -1) 256/2 is 128      
            }
            if( i >= nx/2){ // i >= nx/2 --> from -128 to -1
                KX[j + nyk*i] =  2.* M_PI * (-i + 2.*k_counter) / Lx;           
            }
        }
        if( i >= nx/2){ // error here 
            k_counter++;
        }
    }
}

其中 nx, ny, nyk 定义为：

static const int nx = 256;
static const int ny = 256;
static const int nyk = ny/2 + 1;

有没有更好的方法重写这个？

Answer 1

正如 1201ProgramAlarm 在评论中指出的那样，您可以通过消除两个循环之间存在的 if 分支来消除错误：

#pragma omp parallel for collapse(2) 
for(int i = 0; i < nx ; i++){
    for(int j = 0; j < nyk; j++){    
       ...       
    }
    if( i >= nx/2){ <== remove this
        k_counter++;
    }
}

这个：

if( i >= nx/2){ <== remove this
    k_counter++;
}

可以预先静态计算。所以您的代码将如下所示：

void makeFourierMesh2D(double Lx, double Ly, double KX[], double KY[], double ksqu[], double ninvksqu[], int FourierMeshType){
    
    #pragma omp parallel for collapse(2) 
    for(int i = 0; i < nx ; i++){
        for(int j = 0; j < nyk; j++){           
            if (i < nx/2){
                KX[j + nyk*i] =2.*M_PI*i/Lx; 
            }
            if( i >= nx/2){
                int k_counter = i - nx/2;
                KX[j + nyk*i] =  2.* M_PI * (-i + 2.*k_counter) / Lx;           
            }
        }
    }
}

然后可以将(-i + 2.*k_counter)简化为(-i + 2 * ( i - nx/2))，再简化为(-i + 2*i - nx)，即i - nx.

要进一步删除其他循环，您只需在一对单独的循环中计算每个分支，如下所示：

    #pragma omp parallel for collapse(2) 
    for(int i = 0; i < nx/2 ; i++){
        for(int j = 0; j < nyk; j++){           
             KX[j + nyk*i] =2.*M_PI*i/Lx; 
        }
    }

    #pragma omp parallel for collapse(2) 
    for(int i = nx/2; i < nx ; i++){
        for(int j = 0; j < nyk; j++){           
             KX[j + nyk*i] =  2.* M_PI * ((i - nx) / Lx);    
        }
    }

如前所述，您还可以将 (-i + 2.*k_counter) 简化为 i - nx。

Openmp c++: error: collapsed loops not perfectly nested

Openmp c++: error: collapsed loops not perfectly nested

c++

parallel-processing

multithreading

openmp

nested-loops