如何在 LAPACK 中找到最佳块大小和 LWORK
How to find optimal block size and LWORK in LAPACK
我正在尝试使用带有 lapack 的 Fortran 来查找 nxn Hermitian 矩阵的逆函数和特征函数。
如何为 lda
、lwork
、liwork
和 lrwork
等参数选择最佳值。我浏览了一些示例并找到了这些选项
integer,parameter::lda=nh
integer,parameter::lwork=2*nh+nh*nh
integer,parameter::liwork=3+5*nh
integer,parameter::lrwork=1 + 5*nh + 2*nh*nh
其中 nh
是矩阵的维数。我还找到了另一个 lwork=16*nh
的例子。我如何确定最佳选择?在这一点上,我正在处理 500x500 Hermitian 矩阵(最大)。
我发现 this documentation 这表明
WORK
(workspace) REAL array, dimension (LWORK)
On exit, if INFO = 0, then WORK(1) returns the optimal LWORK.
LWORK
(input) INTEGER
The dimension of the array WORK. LWORK  max(1,N).
For optimal performance LWORK  N*NB, where NB is the optimal block size returned by ILAENV.
对于给定的矩阵维度,是否可以使用 WORK
或 ILAENV
找出最佳块大小?
我将 gfortran 和 ifort 与 mkl 一起使用。
编辑
根据@percusse 和 的评论,这里有一个示例代码
character,parameter::jobz="v",uplo="u"
integer, parameter::nh=15
complex*16::m(nh,nh),m1(nh,nh)
integer,parameter::lda=nh
integer::ipiv(nh),info
complex*16::work(1)
real*8::rwork(1), w(nh)
integer::iwork(1)
real*8::x1(nh,nh),x2(nh,nh)
call random_seed()
call random_number(x1)
call random_number(x2)
m=cmplx(x1,x2)
m1=conjg(m)
m1=transpose(m1)
m=(m+m1)/2.0
call zheevd(jobz,uplo,nh,m,lda,w,work,-1,rwork,-1,iwork, -1,info)
print*,"info : ", info
print*,"lwork: ", int(work(1)) , 2*nh+nh*nh
print*,"lrwork:", int(rwork(1)) , 1 + 5*nh + 2*nh*nh
print*,"liwork:", int(iwork(1)) , 3+5*nh
end
info : 0
lwork: 255 255
lrwork: 526 526
liwork: 78 78
我不确定你用 "Is it possible to find out the optimal block size using WORK
or ILAENV
for a particular machine architecture?" 暗示什么。但是,您可以找到特定问题的最佳值。
例如。如果您想使用 cheev
找到复杂的 Hermitian 矩阵的特征值,您可以要求例程 return 您的值 :
subroutine CHEEV( JOBZ, UPLO, N, A, LDA, W, WORK, LWORK, RWORK, INFO )
character , intent(in) :: JOBZ
character , intent(in) :: UPLO
integer , intent(in) :: N
complex, dimension(lda,*), intent(inout) :: A
integer , intent(in) :: LDA
real , dimension(*) , intent(out) :: W
complex, dimension(*) , intent(out) :: WORK
integer , intent(in) :: LWORK
real , dimension(*) , intent(out) :: RWORK
integer , intent(out) :: INFO
然后 documentation 明确指出(请注意,过去这更容易阅读):
WORK
is COMPLEX
array, dimension (MAX(1,LWORK))
On exit, if INFO = 0
, WORK(1)
returns the optimal LWORK
.
LWORK
is INTEGER
The length of the array WORK
. LWORK >= max(1,2*N-1)
.
For optimal efficiency, LWORK >= (NB+1)*N
,
where NB
is the blocksize for CHETRD
returned by ILAENV
. If LWORK = -1
, then a workspace query is assumed; the routine
only calculates the optimal size of the WORK
array, returns
this value as the first entry of the WORK
array, and no error
message related to LWORK
is issued by XERBLA
.
所以你需要做的就是
call cheev(jobz, uplo, n, a, lda, w, work, -1, rwork, info)
lwork=int(work(1))
dallocate(work)
allocate(work(lwork))
call cheev(jobz, uplo, n, a, lda, w, work, lwork, rwork, info)
我正在尝试使用带有 lapack 的 Fortran 来查找 nxn Hermitian 矩阵的逆函数和特征函数。
如何为 lda
、lwork
、liwork
和 lrwork
等参数选择最佳值。我浏览了一些示例并找到了这些选项
integer,parameter::lda=nh
integer,parameter::lwork=2*nh+nh*nh
integer,parameter::liwork=3+5*nh
integer,parameter::lrwork=1 + 5*nh + 2*nh*nh
其中 nh
是矩阵的维数。我还找到了另一个 lwork=16*nh
的例子。我如何确定最佳选择?在这一点上,我正在处理 500x500 Hermitian 矩阵(最大)。
我发现 this documentation 这表明
WORK
(workspace) REAL array, dimension (LWORK)
On exit, if INFO = 0, then WORK(1) returns the optimal LWORK.
LWORK
(input) INTEGER
The dimension of the array WORK. LWORK  max(1,N).
For optimal performance LWORK  N*NB, where NB is the optimal block size returned by ILAENV.
对于给定的矩阵维度,是否可以使用 WORK
或 ILAENV
找出最佳块大小?
我将 gfortran 和 ifort 与 mkl 一起使用。
编辑
根据@percusse 和
character,parameter::jobz="v",uplo="u"
integer, parameter::nh=15
complex*16::m(nh,nh),m1(nh,nh)
integer,parameter::lda=nh
integer::ipiv(nh),info
complex*16::work(1)
real*8::rwork(1), w(nh)
integer::iwork(1)
real*8::x1(nh,nh),x2(nh,nh)
call random_seed()
call random_number(x1)
call random_number(x2)
m=cmplx(x1,x2)
m1=conjg(m)
m1=transpose(m1)
m=(m+m1)/2.0
call zheevd(jobz,uplo,nh,m,lda,w,work,-1,rwork,-1,iwork, -1,info)
print*,"info : ", info
print*,"lwork: ", int(work(1)) , 2*nh+nh*nh
print*,"lrwork:", int(rwork(1)) , 1 + 5*nh + 2*nh*nh
print*,"liwork:", int(iwork(1)) , 3+5*nh
end
info : 0
lwork: 255 255
lrwork: 526 526
liwork: 78 78
我不确定你用 "Is it possible to find out the optimal block size using WORK
or ILAENV
for a particular machine architecture?" 暗示什么。但是,您可以找到特定问题的最佳值。
例如。如果您想使用 cheev
找到复杂的 Hermitian 矩阵的特征值,您可以要求例程 return 您的值 :
subroutine CHEEV( JOBZ, UPLO, N, A, LDA, W, WORK, LWORK, RWORK, INFO )
character , intent(in) :: JOBZ
character , intent(in) :: UPLO
integer , intent(in) :: N
complex, dimension(lda,*), intent(inout) :: A
integer , intent(in) :: LDA
real , dimension(*) , intent(out) :: W
complex, dimension(*) , intent(out) :: WORK
integer , intent(in) :: LWORK
real , dimension(*) , intent(out) :: RWORK
integer , intent(out) :: INFO
然后 documentation 明确指出(请注意,过去这更容易阅读):
WORK
isCOMPLEX
array, dimension(MAX(1,LWORK))
On exit, ifINFO = 0
,WORK(1)
returns the optimalLWORK
.
LWORK
isINTEGER
The length of the arrayWORK
.LWORK >= max(1,2*N-1)
. For optimal efficiency,LWORK >= (NB+1)*N
, whereNB
is the blocksize forCHETRD
returned byILAENV
. IfLWORK = -1
, then a workspace query is assumed; the routine only calculates the optimal size of theWORK
array, returns this value as the first entry of theWORK
array, and no error message related toLWORK
is issued byXERBLA
.
所以你需要做的就是
call cheev(jobz, uplo, n, a, lda, w, work, -1, rwork, info)
lwork=int(work(1))
dallocate(work)
allocate(work(lwork))
call cheev(jobz, uplo, n, a, lda, w, work, lwork, rwork, info)