Why can't an OpenMP do-loop that actually runs in parallel be detected by OMP_GET_THREAD_NUM()?
I don't understand why !$OMP DO actually distributes the work to different threads, yet this cannot be detected with the OpenMP runtime function OMP_GET_THREAD_NUM().
program test
implicit none
integer :: i,su
double precision a(10), b(10),c
INTEGER OMP_GET_THREAD_NUM
su=0
!$OMP DO
do i=1,10
b(i) = 10*i;
c = b(i);
write(*,*)'in the loop, rank =',c,OMP_GET_THREAD_NUM()
enddo
!$OMP END DO
!$OMP PARALLEL
write(*,*) 'Rank = ',OMP_GET_THREAD_NUM()
!$OMP END PARALLEL
end
The result is:
in the loop, rank = 10.000000000000000 0
in the loop, rank = 20.000000000000000 0
in the loop, rank = 30.000000000000000 0
in the loop, rank = 40.000000000000000 0
in the loop, rank = 50.000000000000000 0
in the loop, rank = 60.000000000000000 0
in the loop, rank = 70.000000000000000 0
in the loop, rank = 80.000000000000000 0
in the loop, rank = 90.000000000000000 0
in the loop, rank = 100.00000000000000 0
Rank = 0
Rank = 6
Rank = 1
Rank = 7
Rank = 2
Rank = 5
Rank = 4
Rank = 3
See? Inside the DO loop, only the master thread seems to be visible. That is not fair, because the master thread is not the only one contributing.
Your do loop is not inside a parallel region, so it is not parallelized: every loop index is processed by thread 0.
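One way to check this is to ask the OpenMP runtime whether the code is currently executing inside an active parallel region. The following is a minimal sketch, assuming a compiler with OpenMP enabled (for example `gfortran -fopenmp`) and the standard `omp_lib` module; the program name `check_region` is made up for illustration.

```fortran
program check_region
  use omp_lib          ! interfaces for the OpenMP runtime routines
  implicit none
  integer :: i

  ! Orphaned worksharing loop: there is no enclosing PARALLEL region,
  ! so the loop is executed by the initial thread alone.
  !$omp do
  do i = 1, 3
     write(*,*) 'orphaned do: in_parallel =', omp_in_parallel(), &
                ' team size =', omp_get_num_threads(), &
                ' thread =', omp_get_thread_num()
  end do
  !$omp end do

  ! Inside a PARALLEL region a team of threads exists.
  ! SINGLE makes exactly one thread of the team print the line.
  !$omp parallel
  !$omp single
  write(*,*) 'parallel region: in_parallel =', omp_in_parallel(), &
             ' team size =', omp_get_num_threads()
  !$omp end single
  !$omp end parallel
end program check_region
```

For the orphaned loop, `omp_in_parallel()` is false, the team size is 1, and the thread number is 0. The values reported inside the parallel region depend on how many threads the runtime starts (for example via `OMP_NUM_THREADS`).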
If I change your program to include a parallel region
...
!$OMP PARALLEL
!$OMP DO
do i=1,10
b(i) = 10*i;
c = b(i);
write(*,*)'in the loop, rank =',c,OMP_GET_THREAD_NUM()
enddo
!$OMP END DO
!$OMP END PARALLEL
...
then OMP_GET_THREAD_NUM() reports the different thread numbers in the output:
in the loop, rank = 50.000000000000000 8
in the loop, rank = 20.000000000000000 3
in the loop, rank = 20.000000000000000 7
in the loop, rank = 20.000000000000000 4
in the loop, rank = 20.000000000000000 5
in the loop, rank = 20.000000000000000 9
in the loop, rank = 20.000000000000000 6
in the loop, rank = 20.000000000000000 0
in the loop, rank = 20.000000000000000 1
in the loop, rank = 30.000000000000000 2
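This particular output also exposes a flaw in your code: c is shared, so its value gets clobbered by the other threads. In addition, if the do loop is the only thing in the parallel region, you can combine the OMP directives. Finally, if we change your code to: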
!$OMP PARALLEL DO private(c)
do i=1,10
b(i) = 10*i;
c = b(i);
write(*,*)'in the loop, rank =',c,OMP_GET_THREAD_NUM()
enddo
!$OMP END PARALLEL DO
then the output is correct:
in the loop, rank = 100.00000000000000 9
in the loop, rank = 30.000000000000000 2
in the loop, rank = 20.000000000000000 1
in the loop, rank = 70.000000000000000 6
in the loop, rank = 60.000000000000000 5
in the loop, rank = 50.000000000000000 4
in the loop, rank = 80.000000000000000 7
in the loop, rank = 90.000000000000000 8
in the loop, rank = 10.000000000000000 0
in the loop, rank = 40.000000000000000 3
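Putting the pieces together, a complete version of the corrected program might look like the sketch below. It assumes a compiler with OpenMP enabled (for example `gfortran -fopenmp`) and uses the `omp_lib` module instead of declaring OMP_GET_THREAD_NUM by hand. The program name `test_fixed` and the `owner` array are hypothetical additions for illustration: each iteration records its thread number, and the printing is moved after the loop so the lines come out in index order.

```fortran
program test_fixed
  use omp_lib                   ! interfaces for the OpenMP runtime routines
  implicit none
  integer :: i
  integer :: owner(10)          ! hypothetical helper: thread that ran iteration i
  double precision :: b(10), c

  ! Combined directive: creates the team and shares the loop among its threads.
  ! c is private, so each thread works on its own copy.
  ! b and owner are shared, but every iteration writes a different element.
  !$omp parallel do private(c)
  do i = 1, 10
     c = 10*i
     b(i) = c
     owner(i) = omp_get_thread_num()
  end do
  !$omp end parallel do

  ! Serial part: print the results in index order.
  do i = 1, 10
     write(*,*) 'i =', i, ' b(i) =', b(i), ' handled by thread', owner(i)
  end do
end program test_fixed
```

Which thread handles which iteration depends on the number of threads and the schedule, so the thread numbers can differ between runs and machines; the values in b do not.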