Provided by: scalapack-doc_1.5-10_all 

NAME
PSSTEIN - compute the eigenvectors of a symmetric tridiagonal matrix in parallel, using inverse iteration
SYNOPSIS
SUBROUTINE PSSTEIN( N, D, E, M, W, IBLOCK, ISPLIT, ORFAC, Z, IZ, JZ, DESCZ, WORK, LWORK, IWORK, LIWORK,
IFAIL, ICLUSTR, GAP, INFO )
INTEGER INFO, IZ, JZ, LIWORK, LWORK, M, N
REAL ORFAC
INTEGER DESCZ( * ), IBLOCK( * ), ICLUSTR( * ), IFAIL( * ), ISPLIT( * ), IWORK( * )
REAL D( * ), E( * ), GAP( * ), W( * ), WORK( * ), Z( * )
PURPOSE
PSSTEIN computes the eigenvectors of a symmetric tridiagonal matrix in parallel, using inverse iteration.
The eigenvectors found correspond to user specified eigenvalues. PSSTEIN does not orthogonalize vectors
that are on different processes. The extent of orthogonalization is controlled by the input parameter
LWORK. Eigenvectors that are to be orthogonalized are computed by the same process. PSSTEIN decides on
the allocation of work among the processes and then calls SSTEIN2 (modified LAPACK routine) on each
individual process. If insufficient workspace is allocated, the expected orthogonalization may not be
done.
Note : If the eigenvectors obtained are not orthogonal, increase
LWORK and run the code again.
Notes
=====
Each global data object is described by an associated description vector. This vector stores the
information required to establish the mapping between an object element and its corresponding process and
memory location.
Let A be a generic term for any 2D block cyclicly distributed array. Such a global array has an
associated description vector DESCA. In the following comments, the character _ should be read as "of
the global array".
NOTATION STORED IN EXPLANATION
--------------- -------------- -------------------------------------- DTYPE_A(global) DESCA( DTYPE_ )The
descriptor type. In this case,
DTYPE_A = 1.
CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
the BLACS process grid A is distribu-
ted over. The context itself is glo-
bal, but the handle (the integer
value) may vary.
M_A (global) DESCA( M_ ) The number of rows in the global
array A.
N_A (global) DESCA( N_ ) The number of columns in the global
array A.
MB_A (global) DESCA( MB_ ) The blocking factor used to distribute
the rows of the array.
NB_A (global) DESCA( NB_ ) The blocking factor used to distribute
the columns of the array.
RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
row of the array A is distributed. CSRC_A (global) DESCA( CSRC_ ) The
process column over which the
first column of the array A is
distributed.
LLD_A (local) DESCA( LLD_ ) The leading dimension of the local
array. LLD_A >= MAX(1,LOCr(M_A)).
Let K be the number of rows or columns of a distributed matrix, and assume that its process grid has
dimension p x q.
LOCr( K ) denotes the number of elements of K that a process would receive if K were distributed over the
p processes of its process column.
Similarly, LOCc( K ) denotes the number of elements of K that a process would receive if K were
distributed over the q processes of its process row.
The values of LOCr() and LOCc() may be determined via a call to the ScaLAPACK tool function, NUMROC:
LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). An upper bound for these quantities may be
computed by:
LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
ARGUMENTS
P = NPROW * NPCOL is the total number of processes
N (global input) INTEGER
The order of the tridiagonal matrix T. N >= 0.
D (global input) REAL array, dimension (N)
The n diagonal elements of the tridiagonal matrix T.
E (global input) REAL array, dimension (N-1)
The (n-1) off-diagonal elements of the tridiagonal matrix T.
M (global input) INTEGER
The total number of eigenvectors to be found. 0 <= M <= N.
W (global input/global output) REAL array, dim (M)
On input, the first M elements of W contain all the eigenvalues for which eigenvectors are to be
computed. The eigenvalues should be grouped by split-off block and ordered from smallest to
largest within the block (The output array W from PSSTEBZ with ORDER='b' is expected here). This
array should be replicated on all processes. On output, the first M elements contain the input
eigenvalues in ascending order.
Note : To obtain orthogonal vectors, it is best if eigenvalues are computed to highest accuracy (
this can be done by setting ABSTOL to the underflow threshold = SLAMCH('U') --- ABSTOL is an
input parameter to PSSTEBZ )
IBLOCK (global input) INTEGER array, dimension (N)
The submatrix indices associated with the corresponding eigenvalues in W -- 1 for eigenvalues
belonging to the first submatrix from the top, 2 for those belonging to the second submatrix,
etc. (The output array IBLOCK from PSSTEBZ is expected here).
ISPLIT (global input) INTEGER array, dimension (N)
The splitting points, at which T breaks up into submatrices. The first submatrix consists of
rows/columns 1 to ISPLIT(1), the second of rows/columns ISPLIT(1)+1 through ISPLIT(2), etc., and
the NSPLIT-th consists of rows/columns ISPLIT(NSPLIT-1)+1 through ISPLIT(NSPLIT)=N (The output
array ISPLIT from PSSTEBZ is expected here.)
ORFAC (global input) REAL
ORFAC specifies which eigenvectors should be orthogonalized. Eigenvectors that correspond to
eigenvalues which are within ORFAC*||T|| of each other are to be orthogonalized. However, if the
workspace is insufficient (see LWORK), this tolerance may be decreased until all eigenvectors to
be orthogonalized can be stored in one process. No orthogonalization will be done if ORFAC
equals zero. A default value of 10^-3 is used if ORFAC is negative. ORFAC should be identical
on all processes.
Z (local output) REAL array,
dimension (DESCZ(DLEN_), N/npcol + NB) Z contains the computed eigenvectors associated with the
specified eigenvalues. Any vector which fails to converge is set to its current iterate after
MAXITS iterations ( See SSTEIN2 ). On output, Z is distributed across the P processes in block
cyclic format.
IZ (global input) INTEGER
Z's global row index, which points to the beginning of the submatrix which is to be operated on.
JZ (global input) INTEGER
Z's global column index, which points to the beginning of the submatrix which is to be operated
on.
DESCZ (global and local input) INTEGER array of dimension DLEN_.
The array descriptor for the distributed matrix Z.
WORK (local workspace/global output) REAL array,
dimension ( LWORK ) On output, WORK(1) gives a lower bound on the workspace ( LWORK ) that
guarantees the user desired orthogonalization (see ORFAC). Note that this may overestimate the
minimum workspace needed.
LWORK (local input) integer
LWORK controls the extent of orthogonalization which can be done. The number of eigenvectors for
which storage is allocated on each process is NVEC = floor(( LWORK- max(5*N,NP00*MQ00) )/N).
Eigenvectors corresponding to eigenvalue clusters of size NVEC - ceil(M/P) + 1 are guaranteed to
be orthogonal ( the orthogonality is similar to that obtained from SSTEIN2). Note : LWORK must
be no smaller than: max(5*N,NP00*MQ00) + ceil(M/P)*N, and should have the same input value on all
processes. It is the minimum value of LWORK input on different processes that is significant.
If LWORK = -1, then LWORK is global input and a workspace query is assumed; the routine only
calculates the minimum and optimal size for all work arrays. Each of these values is returned in
the first entry of the corresponding work array, and no error message is issued by PXERBLA.
IWORK (local workspace/global output) INTEGER array,
dimension ( 3*N+P+1 ) On return, IWORK(1) contains the amount of integer workspace required. On
return, the IWORK(2) through IWORK(P+2) indicate the eigenvectors computed by each process.
Process I computes eigenvectors indexed IWORK(I+2)+1 thru' IWORK(I+3).
LIWORK (local input) INTEGER
Size of array IWORK. Must be >= 3*N + P + 1
If LIWORK = -1, then LIWORK is global input and a workspace query is assumed; the routine only
calculates the minimum and optimal size for all work arrays. Each of these values is returned in
the first entry of the corresponding work array, and no error message is issued by PXERBLA.
IFAIL (global output) integer array, dimension (M)
On normal exit, all elements of IFAIL are zero. If one or more eigenvectors fail to converge
after MAXITS iterations (as in SSTEIN), then INFO > 0 is returned. If mod(INFO,M+1)>0, then for
I=1 to mod(INFO,M+1), the eigenvector corresponding to the eigenvalue W(IFAIL(I)) failed to
converge ( W refers to the array of eigenvalues on output ).
ICLUSTR (global output) integer array, dimension (2*P) This output array contains indices of
eigenvectors corresponding to a cluster of eigenvalues that could not be orthogonalized due to
insufficient workspace (see LWORK, ORFAC and INFO). Eigenvectors corresponding to clusters of
eigenvalues indexed ICLUSTR(2*I-1) to ICLUSTR(2*I), I = 1 to INFO/(M+1), could not be
orthogonalized due to lack of workspace. Hence the eigenvectors corresponding to these clusters
may not be orthogonal. ICLUSTR is a zero terminated array --- ( ICLUSTR(2*K).NE.0 .AND.
ICLUSTR(2*K+1).EQ.0 ) if and only if K is the number of clusters.
GAP (global output) REAL array, dimension (P)
This output array contains the gap between eigenvalues whose eigenvectors could not be
orthogonalized. The INFO/M output values in this array correspond to the INFO/(M+1) clusters
indicated by the array ICLUSTR. As a result, the dot product between eigenvectors corresponding
to the I^th cluster may be as high as ( O(n)*macheps ) / GAP(I).
INFO (global output) INTEGER
= 0: successful exit
< 0: If the i-th argument is an array and the j-entry had an illegal value, then INFO =
-(i*100+j), if the i-th argument is a scalar and had an illegal value, then INFO = -i. < 0 : if
INFO = -I, the I-th argument had an illegal value
> 0 : if mod(INFO,M+1) = I, then I eigenvectors failed to converge in MAXITS iterations. Their
indices are stored in the array IFAIL. if INFO/(M+1) = I, then eigenvectors corresponding to I
clusters of eigenvalues could not be orthogonalized due to insufficient workspace. The indices of
the clusters are stored in the array ICLUSTR.
LAPACK version 1.5 12 May 1997 PSSTEIN(l)