Example 5.8. Specification of own computations.
      PARAMETER (N = 100)
      REAL A(N, N+1), X(N)
CHPF$ ALIGN X( I ) WITH A( I, N+1)
CHPF$ DISTRIBUTE ( BLOCK, * ) :: A
. . .
C back substitution of Gauss algorithm
C own computations outside the loops
CDVM$ OWN
      X(N) = A(N,N+1) / A(N,N)
      DO 10 J = N-1, 1, -1
CDVM$ PARALLEL ( I ) ON A ( I, * )
      DO 20 I = 1, J
         A(I,N+1) = A(I,N+1) - A(I,J+1) * X(J+1)
20    CONTINUE
C own computations in sequential loop,
C nesting the distributed loop
CDVM$ OWN
      X(J) = A(J,N+1) / A(J,J)
10    CONTINUE
Note that A(J,N+1) and A(J,J) are located on the same processor where X(J) is allocated.
6. Access to Remote Data
Remote data can be used in a distributed loop or in own computation statements. Each iteration of a distributed loop is executed on exactly one processor. If an iteration uses the values of array elements allocated on other processors, these values are called remote data, and the references to such elements are called remote references. The same notions are defined for own computation statements. Two kinds of remote references must be distinguished:
- remote regular references: distributed dimensions are indexed by expressions of the form a*I + b;
- remote irregular references: distributed dimensions are indexed by elements of matrix ME(I,J).
Note that the indexing of local (non-distributed) array dimensions can be arbitrary.
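The notion of a remote reference can be made concrete with a small serial sketch. It counts how many iterations of the loop A(I) = B(I+2) read an element of B that is owned by a different processor than A(I). The sizes N and NP, the helper function IOWNER, and the block rule it uses (contiguous blocks of ceiling(N/NP) elements) are assumptions made for illustration only; they are not part of FDVM.

```fortran
      PROGRAM REMREF
C     Serial sketch: count remote references B(I+2) in the loop
C     A(I) = B(I+2), I = 1..N-2, with A and B aligned and
C     distributed by blocks over NP processors.
C     N, NP and the helper IOWNER are illustrative assumptions.
      INTEGER N, NP
      PARAMETER (N = 100, NP = 4)
      INTEGER I, NREM, IOWNER
      NREM = 0
      DO 10 I = 1, N-2
C        iteration I runs on the owner of A(I); B(I+2) is remote
C        when it is owned by a different processor
         IF (IOWNER(I+2, N, NP) .NE. IOWNER(I, N, NP)) THEN
            NREM = NREM + 1
            PRINT *, 'iteration', I, ' reads remote B(', I+2, ')'
         END IF
   10 CONTINUE
      PRINT *, 'remote references:', NREM
      END

      INTEGER FUNCTION IOWNER(I, N, NP)
C     Owner (0..NP-1) of element I under the assumed block rule:
C     contiguous blocks of size ceiling(N/NP).
      INTEGER I, N, NP, NB
      NB = (N + NP - 1) / NP
      IOWNER = (I - 1) / NB
      END
```

With these sizes only the last two iterations of each block except the final one touch a neighbor's data, which is why a narrow buffer at the block boundary is enough for references of this kind.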
Access to remote data requires the following operations:
- buffer allocation,
- replacement of references to the array by references to the buffer,
- organization of data exchange between processors to fill the buffers.
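The three operations listed above can be sketched serially for a single regular reference, B(N+1-I), as seen by one processor. This is a model under stated assumptions, not what the FDVM compiler generates: the block bounds LO and HI, the buffer name BUF, and the use of a global array B as a stand-in for other processors' memory are all hypothetical.

```fortran
      PROGRAM RBUF
C     Serial sketch of the three operations for the remote
C     reference B(N+1-I) in the statement A(I) = B(N+1-I).
C     One processor owning elements LO..HI is modelled; the
C     global array B stands in for the other processors' memory.
C     All names and sizes here are illustrative assumptions.
      INTEGER N, LO, HI
      PARAMETER (N = 100, LO = 26, HI = 50)
      REAL A(N), B(N)
C     buffer allocation: one slot per local iteration
      REAL BUF(LO:HI)
      INTEGER I
      DO 5 I = 1, N
         B(I) = REAL(I)
    5 CONTINUE
C     data exchange: fill the buffer before the loop
C     (message passing in a real parallel program)
      DO 10 I = LO, HI
         BUF(I) = B(N+1-I)
   10 CONTINUE
C     reference replacement: B(N+1-I) becomes BUF(I)
      DO 20 I = LO, HI
         A(I) = BUF(I)
   20 CONTINUE
      PRINT *, 'A(LO) =', A(LO), '  A(HI) =', A(HI)
      END
```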
Low-level message-passing models (such as MPI) require all these operations to be programmed manually. High-level parallel languages (such as HPF) assume that the operations are performed automatically. FDVM takes a compromise approach: it provides high-level specifications of remote data and group exchange operations, and it lets the user optimize remote data access.
6.1. Regular Remote References
6.1.1. Shadow Edges Group. SHADOW_RENEW Clause
Remote data access can be organized using shadow edges of the local array section if the indexes of the remote references have the form I ± d, where d is a constant.
Consider the following example:
      REAL A(100), B(100)
      DO 10 I = 2, 98
         A(I) = (B(I-1) + B(I+1) + B(I+2) ) / 3
10    CONTINUE
In this example it is impossible to align the arrays so that all elements used in the I-th iteration are allocated on the same processor. Consider the distribution of array B with shadow edges over three neighboring processors.
[Figure: local sections of array B on the neighboring processors P-1, P, and P+1]
Fig.6.1. Distribution of array with shadow edges.
Two buffers, which are contiguous extensions of the local section of the array, are allocated on each processor. The width of the shadow edge at the low end is 1 (for B(I-1)), and the width of the shadow edge at the high end is 2 (for B(I+1) and B(I+2)). If the processors exchange data according to the scheme in Fig. 6.1 before the loop is entered, the loop can be executed on each processor without replacing references to the array by references to a buffer.
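A serial sketch of one processor's view may help here. The local section of B is declared with one extra element below and two above, so the loop body can keep its original index expressions. The block bounds LO and HI, the array name BLOC, and the use of a global array B to stand in for the neighboring processors are assumptions made for illustration.

```fortran
      PROGRAM SHDEMO
C     Serial sketch of one processor's view of array B with
C     shadow edges (low width 1, high width 2).  The processor
C     owns elements LO..HI; BLOC is its local section extended
C     by the shadow edges.  Names and sizes are illustrative
C     assumptions; the global array B stands in for neighbours.
      INTEGER N, LO, HI
      PARAMETER (N = 100, LO = 26, HI = 50)
      REAL B(N), A(LO:HI), BLOC(LO-1:HI+2)
      INTEGER I
      DO 5 I = 1, N
         B(I) = REAL(I)
    5 CONTINUE
C     local section
      DO 10 I = LO, HI
         BLOC(I) = B(I)
   10 CONTINUE
C     shadow edge renewal: copies of the neighbours' boundary
C     elements (an exchange between processors in practice)
      BLOC(LO-1) = B(LO-1)
      BLOC(HI+1) = B(HI+1)
      BLOC(HI+2) = B(HI+2)
C     the loop body keeps the original index expressions
      DO 20 I = LO, HI
         A(I) = (BLOC(I-1) + BLOC(I+1) + BLOC(I+2)) / 3
   20 CONTINUE
      PRINT *, 'A(LO) =', A(LO), '  A(HI) =', A(HI)
      END
```

Because the shadow elements sit directly next to the local section in memory, only three values have to be transferred per processor in this model, and no index translation is needed inside the loop.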
To specify remote access through shadow edges, FDVM provides the following directives.
Declaration of the maximal size of shadow edges:
| shadow-directive | is SHADOW shadow-array-list |
| shadow-array | is array-name ( shadow-edge-list ) |
| shadow-edge | is width |
| | or low-width : high-width |
| width | is int-expr |
| low-width | is int-expr |
| high-width | is int-expr |
Constraint. The int-expr representing a width, low-width, or high-width must be a constant specification expression with value greater than or equal to 0.
A shadow-edge specification of width is equivalent to a shadow-edge of width : width.
By default, a distributed array has a shadow width of 1 on both sides of each distributed dimension.
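The following fragment illustrates the shadow-edge forms allowed by the syntax above. It is a sketch derived from the grammar rather than an example from the original text: the arrays C and D and their widths are hypothetical, and all three arrays are assumed to be distributed elsewhere. Lines with C in column 1 are comments in fixed-form Fortran, so the fragment also compiles serially.

```fortran
      REAL B(100), C(100), D(100,100)
C     low width 1, high width 2
CHPF$ SHADOW B( 1:2 )
C     a single width: equivalent to C( 2:2 )
CHPF$ SHADOW C( 2 )
C     one shadow-edge per dimension; a width of 0 is allowed
CHPF$ SHADOW D( 0:1, 1 )
      END
```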
If shadow edges must be updated before a distributed loop is executed, the SHADOW_RENEW clause is specified in the PARALLEL directive.
| shadow-renew-clause | is SHADOW_RENEW ( renewee-list ) |
| renewee | is dist-array-name [ ( shadow-edge-list ) ] [ (CORNER) ] |
Constraints:
- The width of the shadow edges to be renewed must not exceed the maximal width specified in the SHADOW directive.
- If the shadow edge widths are not specified, the maximal widths are used.
Example 6.1. Remote access through shadow edges.
      REAL A(100), B(100)
CHPF$ ALIGN B( I ) WITH A( I )
CHPF$ DISTRIBUTE ( BLOCK) :: A
CHPF$ SHADOW B( 1:2 )
. . .
CDVM$ PARALLEL ( I ) ON A ( I ), SHADOW_RENEW ( B )
      DO 10 I = 2, 98
         A(I) = (B(I-1) + B(I+1) + B(I+2) ) / 3
10    CONTINUE
When the shadow edges are renewed, the maximal widths 1:2 specified in the SHADOW directive are used.
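A renewee may also carry its own shadow-edge list. The variation below is a sketch based on the renewee syntax and the first constraint above, not an example from the original text. It assumes a loop that reads only B(I-1) and B(I+1), so widths 1:1 are requested, which stay within the declared maximum 1:2. The initialization loop and the program name are hypothetical additions that make the sketch compile as ordinary fixed-form Fortran, where the directive lines are comments.

```fortran
      PROGRAM RENEW1
C     Variation of Example 6.1 (a sketch): only B(I-1) and
C     B(I+1) are read, so the renewee B( 1:1 ) requests widths
C     that do not exceed the declared maximum 1:2.
      REAL A(100), B(100)
      INTEGER I
CHPF$ ALIGN B( I ) WITH A( I )
CHPF$ DISTRIBUTE ( BLOCK) :: A
CHPF$ SHADOW B( 1:2 )
CDVM$ PARALLEL ( I ) ON B ( I )
      DO 5 I = 1, 100
         B(I) = REAL(I)
    5 CONTINUE
CDVM$ PARALLEL ( I ) ON A ( I ), SHADOW_RENEW ( B( 1:1 ) )
      DO 10 I = 2, 99
         A(I) = (B(I-1) + B(I+1)) / 2
   10 CONTINUE
      END
```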
Shadow edges of multidimensional distributed arrays can be specified for each dimension. A special case arises when a remote reference points to a "corner" of the shadow edges. In that case the additional parameter CORNER must be specified.
Example 6.2. Remote access through "corners" of shadow edges.
      REAL A(100,100), B(100,100)
CHPF$ ALIGN B( I, J ) WITH A( I, J )
CHPF$ DISTRIBUTE A ( BLOCK,BLOCK)
. . .
CDVM$ PARALLEL ( I, J ) ON A ( I, J ), SHADOW_RENEW ( B (CORNER))
      DO 10 I = 2, 99
      DO 10 J = 2, 99
         A(I,J) = (B(I,J+1) + B(I+1,J) + B(I+1,J+1) ) / 3
10    CONTINUE
The width of the shadow edges of array B is 1 in both dimensions by default. Because the "corner" reference B(I+1,J+1) is present, the CORNER parameter is specified.
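Why the corner needs separate treatment can be seen in a serial sketch that records which elements outside a block are read by the loop of Example 6.2. It assumes a 2 x 2 processor grid, so that one processor owns rows and columns 1..50. The bounds LO and HI, the array USED, and the classification into edge and corner elements are illustrative assumptions.

```fortran
      PROGRAM CRNDEM
C     Serial sketch: elements outside the block LO..HI x LO..HI
C     that are read by the references B(I,J+1), B(I+1,J) and
C     B(I+1,J+1).  An element with one index outside the block
C     lies in a shadow edge; with both indexes outside it is
C     a corner.  Names and sizes are illustrative assumptions.
      INTEGER LO, HI
      PARAMETER (LO = 1, HI = 50)
      INTEGER I, J, NEDGE, NCORN
      LOGICAL USED(LO:HI+1, LO:HI+1)
      DO 6 J = LO, HI+1
         DO 5 I = LO, HI+1
            USED(I,J) = .FALSE.
    5    CONTINUE
    6 CONTINUE
C     iterations of Example 6.2 that fall into this block
      DO 11 I = 2, HI
         DO 10 J = 2, HI
            USED(I,J+1) = .TRUE.
            USED(I+1,J) = .TRUE.
            USED(I+1,J+1) = .TRUE.
   10    CONTINUE
   11 CONTINUE
      NEDGE = 0
      NCORN = 0
      DO 21 I = LO, HI+1
         DO 20 J = LO, HI+1
            IF (USED(I,J)) THEN
               IF (I .GT. HI .AND. J .GT. HI) THEN
                  NCORN = NCORN + 1
               ELSE IF (I .GT. HI .OR. J .GT. HI) THEN
                  NEDGE = NEDGE + 1
               END IF
            END IF
   20    CONTINUE
   21 CONTINUE
      PRINT *, 'edge elements read:  ', NEDGE
      PRINT *, 'corner elements read:', NCORN
      END
```

In this model exactly one corner element, B(51,51), is read. It belongs to neither the row edge nor the column edge of the block, so an exchange that covers only the two edges would leave it unfilled.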
Another way to access data through shadow edges is described in section 6.3.1.
6.1.2. Group of Regular Remote References. REMOTE_ACCESS Directive
If the indexes of remote references have the form a*I ± b, then these references must be specified in the REMOTE_ACCESS directive.
| remote-access-directive | is REMOTE_ACCESS ( [ remote-group-name : ] regular-reference-list) |
| regular-reference | is dist-array-name [( regular-subscript-list )] |
| regular-subscript | is int-expr |
| | or do-variable-use |
| | or : |
| remote-access-clause | is remote-access-directive |
The REMOTE_ACCESS directive can appear as a separate directive before an own computation statement or as a clause in the PARALLEL directive.
If a remote reference is specified as an array name without an index list, then all references to that array in the distributed loop (or in the own computation statement) are regular remote references.
If the symbol : is specified as the regular-subscript of a dimension, then any index along this dimension causes remote access.
A set of references in a REMOTE_ACCESS directive without a group name (remote-group-name) is called an unnamed reference group.
For an unnamed reference group, the directive reads the remote data into a buffer and replaces the remote references by references to the buffer.
Example 6.3. Using unnamed group of regular remote references.
      DIMENSION A(100,100), B(100,100)
CHPF$ DISTRIBUTE (*,BLOCK) :: A
CHPF$ ALIGN B( I, J ) WITH A( I, J )
. . .
*DVM$ REMOTE_ACCESS ( A(50,50) )
      X = A(50,50)
. . .
*DVM$ REMOTE_ACCESS ( B(100,100) )
*DVM$ OWN
      A(1,1) = B(100,100)
. . .
*DVM$ PARALLEL (I,J) ON A(I,J) , REMOTE_ACCESS ( B(:,N) )
      DO 10 I = 1, 100
      DO 10 J = 1, 100















