| shadow-wait-directive | is SHADOW_WAIT shadow-group-name |
Constraints:
- The SHADOW_START directive must be executed after the SHADOW_GROUP directive.
- The SHADOW_WAIT directive must be executed after the SHADOW_START directive.
- Updated values of the shadow edges may be used only after the SHADOW_WAIT directive.
- These directives may not be used within a parallel loop.
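The ordering constraints above can be illustrated with a small sketch. The Python fragment below is not DVM code: it mimics two neighbouring processors with plain lists, uses a worker thread as a stand-in for the asynchronous edge exchange, and every name in it (`left`, `right`, `exchange`, `handle`) is hypothetical. It only shows why values that depend on shadow edges are computed after the wait, while interior work can proceed in between.

```python
from concurrent.futures import ThreadPoolExecutor

# Two blocks of the global array 1..6. Each block keeps one extra
# "shadow" cell that mirrors the neighbour's border element.
left = [1.0, 2.0, 3.0, 0.0]    # owns 1, 2, 3; left[3] is the shadow cell
right = [0.0, 4.0, 5.0, 6.0]   # owns 4, 5, 6; right[0] is the shadow cell


def exchange():
    """Copy border elements into the neighbour's shadow cell."""
    left[3] = right[1]
    right[0] = left[2]


with ThreadPoolExecutor(max_workers=1) as pool:
    handle = pool.submit(exchange)       # plays the role of SHADOW_START

    # Work that does not touch shadow cells may overlap the exchange.
    inner_left = left[0] + left[2]       # neighbours of left[1]
    inner_right = right[1] + right[3]    # neighbours of right[2]

    handle.result()                      # plays the role of SHADOW_WAIT

    # Only now are the shadow cells guaranteed to be up to date.
    edge_left = left[1] + left[3]        # neighbours of left[2]
    edge_right = right[0] + right[2]     # neighbours of right[1]
```

Swapping the last two statements ahead of `handle.result()` would read shadow cells that may still hold stale values, which is the situation the third constraint rules out.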
6.3.1. REMOTE_ACCESS Directive
| remote-access-directive | is REMOTE_ACCESS ( [ remote-group-name : ] regular-reference-list) |
| regular-reference | is dist-array-name [( regular-subscript-list )] |
| regular-subscript | is int-expr |
| | or do-variable-use |
| | or : |
| remote-access-clause | is remote-access-directive |
6.3.3. Asynchronous specification of REMOTE type references
| remote-group-directive | is REMOTE_GROUP remote-group-name-list |
Constraint:
- The identifier specified in the directive can be used only in REMOTE_ACCESS, PREFETCH, and RESET directives.
| prefetch-directive | is PREFETCH remote-group-name |
| reset-directive | is RESET remote-group-name |
Constraints:
- Repeated execution of the PREFETCH directive is correct if the characteristics of the remote reference group (the loop parameters, the distribution of arrays, and the values of index expressions in remote references) have not changed.
- The PREFETCH directive can be executed for several loops (several REMOTE_ACCESS directives) if there are no data dependencies between the loops for the distributed arrays specified in the REMOTE_ACCESS directives.
6.4. Asynchronous group reduction
| reduction-group-directive | is REDUCTION_GROUP reduction-group-name-list |
| reduction-start-directive | is REDUCTION_START reduction-group-name |
| reduction-wait-directive | is REDUCTION_WAIT reduction-group-name |
Constraints:
- Before the REDUCTION_START directive is executed, the reduction variables of the group may be used only in reduction statements of parallel loops.
- The REDUCTION_START and REDUCTION_WAIT directives must be executed after completion of the loop (or loops) in which the values of the reduction variables are calculated. Only statements that do not use the values of the reduction variables are allowed between these directives.
- The REDUCTION_WAIT directive deletes the group of reduction operations.
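The start/wait pattern described by these constraints can be sketched as follows. This is an illustrative Python analogue, not DVM code: the per-processor partial results are a plain list, a worker thread stands in for the asynchronous combination, and the helper names `reduction_start` and `reduction_wait` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def reduction_start(pool, partials, op):
    """Begin combining per-processor partial results in the background."""
    return pool.submit(op, partials)


def reduction_wait(handle):
    """Block until the combined value is available and return it."""
    return handle.result()


# Partial maxima, as if each processor had finished its part of the loop.
partials = [0.25, 0.75, 0.5]

with ThreadPoolExecutor(max_workers=1) as pool:
    handle = reduction_start(pool, partials, max)

    # Statements placed here must not read the reduction variable.
    unrelated = sum(range(10))

    eps = reduction_wait(handle)   # the global maximum is valid from here on
```

The work between the two calls overlaps the combination step, which is the purpose of splitting the reduction into a start and a wait.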
7.1. Description of task array
| task-directive | is TASK task-list |
| task | is task-name ( max-task ) |
7.2. Mapping tasks on processors. MAP directive
| map-directive | is MAP task-name ( task-index ) |
| | ONTO processors-name ( section-subscript-list ) |
7.4. Distribution of computations. TASK_REGION directive
| block-task-region | is task-region-directive |
| | on-block |
| | [ on-block ]... |
| | end-task-region-directive |
| task-region-directive | is TASK_REGION task-name [ , reduction-clause ] |
| end-task-region-directive | is END TASK_REGION |
| on-block | is on-directive |
| | block |
| | end-on-directive |
| on-directive | is ON task-name ( task-index ) [ , new-clause ] |
| end-on-directive | is END ON |
| loop-task-region | is task-region-directive |
| | parallel-task-loop |
| | end-task-region-directive |
| parallel-task-loop | is parallel-task-loop-directive |
| | do-loop |
| parallel-task-loop-directive | is PARALLEL ( do-variable ) ON task-name ( do-variable ) [ , new-clause ] |
9. Procedures
| inherit-directive | is INHERIT dummy-array-name-list |
Annex 2. Code examples
Seven small scientific programs are presented to illustrate the features of the Fortran DVM language. They solve a system of linear equations:
A x = b
where A is the matrix of coefficients,
b is the right-hand-side vector,
x is the vector of unknowns.
The following basic methods are used to solve this system.
Direct methods. The well-known Gaussian elimination method is the most commonly used algorithm of this class. Its main idea is to reduce the matrix A to upper triangular form and then to use backward substitution to compute the unknowns.
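As a minimal serial sketch of these two phases (not the DVM program itself), the Python function below works on an augmented matrix whose last column holds b, the same storage layout as Example 1. The function name `gauss_solve` is hypothetical, and, as in Example 1, no pivoting is performed, so the sketch assumes that all diagonal elements stay nonzero.

```python
def gauss_solve(a):
    """Solve A x = b for an augmented n x (n+1) matrix given as a list of rows.

    The matrix is modified in place. No pivoting is done.
    """
    n = len(a)

    # Forward elimination: reduce A to upper triangular form.
    for i in range(n):
        for j in range(i + 1, n):
            factor = a[j][i] / a[i][i]
            for k in range(i, n + 1):
                a[j][k] -= factor * a[i][k]

    # Backward substitution: solve for x[n-1], x[n-2], ..., x[0].
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(a[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (a[i][n] - known) / a[i][i]
    return x


# 2x + y = 5 and x + 3y = 10 have the solution x = 1, y = 3.
solution = gauss_solve([[2.0, 1.0, 5.0],
                        [1.0, 3.0, 10.0]])
print(solution)  # [1.0, 3.0]
```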
Explicit iteration methods. Jacobi relaxation is the best-known algorithm of this class. The algorithm performs the following computation iteratively:
$$ x_{i,j}^{new} = \left( x_{i-1,j}^{old} + x_{i,j-1}^{old} + x_{i+1,j}^{old} + x_{i,j+1}^{old} \right) / 4 $$
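One Jacobi sweep can be sketched in a few lines of serial Python. The function name `jacobi_step` is hypothetical, and keeping the boundary values fixed is an assumption that mirrors the loop bounds of Example 2. The point of the sketch is that every new value is computed only from old values, so all interior points can be updated independently.

```python
def jacobi_step(old):
    """Return a new grid in which each interior point is the average
    of its four neighbours in the old grid. Boundary points are copied."""
    rows, cols = len(old), len(old[0])
    new = [row[:] for row in old]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = (old[i - 1][j] + old[i][j - 1]
                         + old[i + 1][j] + old[i][j + 1]) / 4
    return new


grid = [[1.0, 1.0, 1.0],
        [1.0, 0.0, 1.0],
        [1.0, 1.0, 1.0]]
updated = jacobi_step(grid)
print(updated[1][1])  # 1.0
```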
Implicit iteration methods. Successive over-relaxation (SOR) belongs to this class. The algorithm performs the following calculation iteratively:
$$ x_{i,j}^{new} = ( w / 4 ) \left( x_{i-1,j}^{new} + x_{i,j-1}^{new} + x_{i+1,j}^{old} + x_{i,j+1}^{old} \right) + ( 1 - w ) \, x_{i,j}^{old} $$
With «red-black» coloring of the variables, each SOR step consists of two Jacobi half-steps: one processes the «red» variables and the other processes the «black» variables. Coloring the variables makes it possible to overlap calculation and communication.
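The two half-steps can be sketched in serial Python as follows. The function name `sor_red_black_step` and the convention that points with even i + j are «red» are assumptions made for illustration. Within one half-step, every updated point reads only neighbours of the other color, so the points of one color can be updated in any order or in parallel.

```python
def sor_red_black_step(x, w):
    """Perform one red-black SOR step in place on the interior of grid x."""
    rows, cols = len(x), len(x[0])
    for color in (0, 1):          # 0: "red" half-step, 1: "black" half-step
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                if (i + j) % 2 == color:
                    neighbours = (x[i - 1][j] + x[i][j - 1]
                                  + x[i + 1][j] + x[i][j + 1])
                    x[i][j] = (w / 4) * neighbours + (1 - w) * x[i][j]


grid = [[1.0, 1.0, 1.0, 1.0],
        [1.0, 0.0, 0.0, 1.0],
        [1.0, 0.0, 0.0, 1.0],
        [1.0, 1.0, 1.0, 1.0]]
sor_red_black_step(grid, 1.0)
print(grid[1][1], grid[1][2])  # 0.5 0.75
```

In the example, the red points (1,1) and (2,2) see only old black values, while the black points (1,2) and (2,1) already see the new red values, which is why they end up closer to the boundary value after a single step.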
Example 1. Gauss elimination algorithm
      PROGRAM GAUSS
C     Solving linear equation system A x = b
      PARAMETER ( N = 100 )
      REAL A( N, N+1 ), X( N )
C     A : Coefficient matrix with dimension (N,N+1).
C         The right-hand-side vector of the linear equations is
C         stored in the last, (N+1)-th, column of matrix A
C     X : Unknown vector
C     N : Number of linear equations
CDVM$ DISTRIBUTE A ( BLOCK, *)
CDVM$ ALIGN X(I) WITH A(I, N+1)
C
C     Initialization
C
*DVM$ PARALLEL ( I ) ON A( I , * )
      DO 100 I = 1, N
      DO 100 J = 1, N+1
         IF ( I .EQ. J ) THEN
            A( I, J ) = 2.0
         ELSE
            IF ( J .EQ. N+1 ) THEN
               A( I, J ) = 0.0
            ENDIF
         ENDIF
  100 CONTINUE
C
C     Elimination
C
      DO 1 I = 1, N
C     The I-th row of array A will be buffered before
C     execution of the I-th iteration, and references A(I,K), A(I,I)
C     will be replaced with corresponding references to the buffer
*DVM$ PARALLEL ( J ) ON A( J, * ), REMOTE_ACCESS ( A( I, : ))
         DO 5 J = I+1, N
         DO 5 K = I+1, N+1
            A( J, K ) = A( J, K ) - A( J, I ) * A( I, K ) / A( I, I )
    5    CONTINUE
    1 CONTINUE
C     First calculate X(N)
      X( N ) = A( N, N+1 ) / A( N, N )
C
C     Solve X(N-1), X(N-2), ..., X(1) by backward substitution
C
      DO 6 J = N-1, 1, -1
C     The (J+1)-th element of array X will be buffered before
C     execution of the J-th iteration, and the reference X(J+1)
C     will be replaced with a reference to a temporary variable
*DVM$ PARALLEL ( I ) ON A( I , * ), REMOTE_ACCESS ( X( J+1 ))
         DO 7 I = 1, J
            A( I, N+1 ) = A( I, N+1 ) - A( I, J+1 ) * X( J+1 )
    7    CONTINUE
         X( J ) = A( J, N+1 ) / A( J, J )
    6 CONTINUE
      PRINT *, X
      END
Example 2. Jacobi algorithm
      PROGRAM JACOB
      PARAMETER ( K = 8, ITMAX = 20 )
      REAL A(K,K), B(K,K), EPS, MAXEPS
CDVM$ DISTRIBUTE A ( BLOCK, BLOCK )
CDVM$ ALIGN B( I, J ) WITH A( I, J )
C     arrays A and B with block distribution
      PRINT *, '********** TEST_JACOBI **********'
      MAXEPS = 0.5E-7
CDVM$ PARALLEL (J, I) ON A(I, J)
C     nest of two parallel loops; iteration (i,j) will be executed on
C     the processor that owns element A(i,j)
      DO 1 J = 1, K
      DO 1 I = 1, K
         A(I, J) = 0.
         IF (I.EQ.1 .OR. J.EQ.1 .OR. I.EQ.K .OR. J.EQ.K) THEN
            B(I, J) = 0.
         ELSE
            B(I, J) = 1. + I + J
         ENDIF
    1 CONTINUE
      DO 2 IT = 1, ITMAX
         EPS = 0.
CDVM$ PARALLEL (J, I) ON A(I, J), REDUCTION ( MAX( EPS ))
C     variable EPS is used to calculate the maximum value
         DO 21 J = 2, K-1
         DO 21 I = 2, K-1
            EPS = MAX ( EPS, ABS( B( I, J ) - A( I, J )))
            A(I, J) = B(I, J)
   21    CONTINUE
CDVM$ PARALLEL (J, I) ON B(I, J), SHADOW_RENEW (A)
C     copying shadow elements of array A from
C     neighboring processors before loop execution
         DO 22 J = 2, K-1
         DO 22 I = 2, K-1
            B(I, J) = (A( I-1, J ) + A( I, J-1 ) + A( I+1, J ) +
     &                 A( I, J+1 )) / 4
   22    CONTINUE
         PRINT *, 'IT = ', IT, ' EPS = ', EPS
         IF ( EPS .LT. MAXEPS ) GO TO 3















