CDVM2 (1158340), page 6
The execution of the PREFETCH directive depends on the value of the variable declared as a remote reference group. If the value is undefined (the group is empty), execution is postponed until a PARALLEL directive with a REMOTE_ACCESS specification is performed. If the value is defined, the remote data are loaded in advance for the reference group assigned to the variable.
Consider the following sequence of directives:
DVM(REMOTE_GROUP) void * RS;
. . .
DVM(PREFETCH RS );
. . .
DVM(PARALLEL . . . ; REMOTE_ACCESS RS : r1)
. . .
DVM(PARALLEL . . . ; REMOTE_ACCESS RS : rn)
. . .
When the PREFETCH directive is executed for the first time, the value of variable RS is undefined. Reading of remote data is therefore postponed until the remote reference subgroups are specified. Each "REMOTE_ACCESS RS : ri" specification is performed as follows:
- the remote data of subgroup ri are read,
- the references of subgroup ri are included in RS.
When the PREFETCH directive is executed for the second time, the value of variable RS is defined (the union of subgroups r1 ... rn). Remote data are read for all references of the group, and the individual "REMOTE_ACCESS RS : ri" specifications are not performed.
Constraints.
- Repeated use of the PREFETCH directive is correct only if the characteristics of the remote reference group (the loop parameters, the distribution of the arrays and the values of index expressions in remote references) have not changed.
- A PREFETCH directive can serve several loops (several REMOTE_ACCESS directives) only if there are no data dependencies on distributed arrays between the loops (directives).
If the characteristics of the remote reference group change, the variable must be given the undefined value by the RESET directive.
The RESET directive assigns the undefined value to the variable, after which the remote reference group is accumulated anew.
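The interplay of PREFETCH, REMOTE_ACCESS and RESET described above can be sketched as a small state model in plain C. This is only an illustration of the rules in the text, not the DVM run-time system: the type `remote_group`, the functions `group_prefetch`, `group_remote_access` and `group_reset`, the capacity `MAX_REFS` and the `loads` counter are all hypothetical names introduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REFS 16  /* hypothetical capacity, for this sketch only */

/* Hypothetical model of a REMOTE_GROUP variable such as RS. */
typedef struct {
    int refs[MAX_REFS];  /* ids of the accumulated subgroups r1..rn */
    size_t count;        /* 0 means the value is undefined (group empty) */
    bool prefetched;     /* true once PREFETCH has loaded the whole group */
    size_t loads;        /* total number of subgroup loads issued so far */
} remote_group;

/* RESET: the value becomes undefined and accumulation starts over. */
static void group_reset(remote_group *g) {
    g->count = 0;
    g->prefetched = false;
}

/* PREFETCH: if the group is defined, load all of its subgroups at once;
   if it is undefined, do nothing and leave loading to REMOTE_ACCESS. */
static void group_prefetch(remote_group *g) {
    if (g->count == 0) {
        return;  /* postponed */
    }
    g->loads += g->count;
    g->prefetched = true;
}

/* REMOTE_ACCESS RS : ri. If the group was already prefetched, the
   specification is not performed; otherwise subgroup ri is read and
   its references are included in the group. */
static void group_remote_access(remote_group *g, int subgroup) {
    if (g->prefetched) {
        return;
    }
    assert(g->count < MAX_REFS);
    g->refs[g->count++] = subgroup;
    g->loads += 1;
}
```

In this model the first pass through a loop body issues one load per REMOTE_ACCESS, while every later pass issues all loads from the single PREFETCH call, until `group_reset` empties the group again.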
Consider the following fragment of a multi-block problem. The simulation area is split into 3 blocks, as shown in Fig. 6.2.
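Fig. 6.2. Splitting simulation area. (Figure not reproduced: block A1, with extents M and N1, lies on one side of split line D; blocks A2, with extents M1 and N2, and A3, with extents M2 and N2, lie on the other side.)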
Example 6.4. Using named group of regular remote references.
DVM ( DISTRIBUTE [BLOCK][BLOCK] )
float A1[M][N1+1], A2[M1+1][N2+1], A3[M2+1][N2+1];
DVM ( REMOTE_GROUP) void *RS;
DO(ITER,1, MIT,1)
{
. . .
/* edge exchange by split line D */
DVM ( PREFETCH RS);
. . .
DVM ( PARALLEL [i] ON A1[i][N1]; REMOTE_ACCESS RS: A2[i][1])
DO(i,0, M1-1,1) A1[i][N1] = A2[i][1];
DVM ( PARALLEL [i] ON A1[i][N1]; REMOTE_ACCESS RS: A3[i-M1][1] )
DO(i,M1, M-1,1) A1[i][N1] = A3[i-M1][1];
DVM( PARALLEL [i] ON A2[i][0]; REMOTE_ACCESS RS: A1[i][N1-1])
DO(i,0, M1-1,1) A2[i][0] = A1[i][N1-1];
DVM( PARALLEL [i] ON A3[i][0]; REMOTE_ACCESS RS: A1[i+M1][N1-1])
DO (i,0, M2-1,1) A3[i][0] = A1[i+M1][N1-1];
. . .
if (NOBLN) {
/* array redistribution to balance loading */
. . .
DVM (RESET RS);
}
. . .
} /*DO ITER*/
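To make the index mapping of the four edge-exchange loops easier to follow, here is a sequential sketch in plain C with the DVM directives removed. It assumes that DO(i, first, last, step) denotes an inclusive loop from first to last, and it uses small hypothetical extents; the function name `exchange_edges` is introduced only for this illustration.

```c
#include <assert.h>

/* Hypothetical small extents; the split along D requires M = M1 + M2. */
enum { M1 = 2, M2 = 3, M = M1 + M2, N1 = 3, N2 = 4 };

/* Sequential meaning of the edge exchange across split line D. */
static void exchange_edges(float A1[M][N1 + 1],
                           float A2[M1 + 1][N2 + 1],
                           float A3[M2 + 1][N2 + 1]) {
    int i;
    /* Edge column N1 of A1 receives column 1 of A2 (rows 0..M1-1)... */
    for (i = 0; i <= M1 - 1; i++) {
        A1[i][N1] = A2[i][1];
    }
    /* ...and column 1 of A3 (rows M1..M-1, shifted by M1). */
    for (i = M1; i <= M - 1; i++) {
        A1[i][N1] = A3[i - M1][1];
    }
    /* Edge column 0 of A2 and A3 receives column N1-1 of A1. */
    for (i = 0; i <= M1 - 1; i++) {
        A2[i][0] = A1[i][N1 - 1];
    }
    for (i = 0; i <= M2 - 1; i++) {
        A3[i][0] = A1[i + M1][N1 - 1];
    }
}
```

Each right-hand side here corresponds to one reference listed in a REMOTE_ACCESS specification of Example 6.4, which is why all four references can be accumulated in the single group RS.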
6.2. Group of irregular remote references. INDIRECT_ACCESS directive
If a distributed dimension is indexed indirectly through an index matrix, the remote reference must be specified in an INDIRECT_ACCESS directive.
The following directives define a group of irregular remote references.
Group name description.
| indirect-group-directive | ::= INDIRECT_GROUP |
The identifier, specified with this directive, can be used in INDIRECT_ACCESS, PREFETCH and RESET directives only.
Specification of irregular remote references.
| indirect-access-directive | ::= INDIRECT_ACCESS [ indirect-group-name : ] indirect-reference-string |
| indirect-reference | ::= dist-array-name [ indirect-subscript-string ] |
| indirect-subscript | ::= [ int-expr ] |
| | [ array-name [ do-variable ][ int-expr ] ] | |
| |…[] | |
| indirect-access-clause | ::= remote-access-directive |
The semantics of the INDIRECT_ACCESS directive are similar to those of the REMOTE_ACCESS directive.
Examples of using an unnamed group of irregular remote references are given in section 5.3.2 (Example 5.7).
The use of a named group in the INDIRECT_ACCESS directive is subject to the same constraints as in the REMOTE_ACCESS directive.
Example 6.5. Using named group of irregular remote references
DVM(DISTRIBUTE [BLOCK ]) float V1[NV];
DVM(ALIGN [I] WITH V1[ I] ) float V2[NV], X[NV], Y[NV];
DVM(ALIGN [I][] WITH V1[I] ) int ME[NV][NE];
DVM(INDIRECT_GROUP) void * IG;
. . .
DVM(PREFETCH IG );
. . .
DVM(PARALLEL [I] ON V1 [I]; INDIRECT_ACCESS IG: V2[ME[I][J]])
FOR(I, NV)
DO(J,1, ME[I][0],1)
V1[E1[I]] = Y[I] + V2[ME[I][J]];
. . .
DVM(PARALLEL [I] ON V2[I]; INDIRECT_ACCESS IG: Y[ME[I][J]])
FOR( I , NV)
DO(J, 1, ME[I][0],1)
{ N = ME[I][J];
X[I] = X[I] + Y[N];
}
. . .
if(RCF)
{ compute( ME );
DVM(RESET IG);
}
. . .
If the values of the index matrix are updated (here by compute( ME ) when RCF is true), the RESET directive makes the value of variable IG undefined, so the group is accumulated anew.
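Why the group depends on the contents of the index matrix can be seen from a small plain C sketch that counts which references V2[ME[I][J]] leave the local block. The sketch assumes that ME[I][0] holds the number of used entries in row I (as the loops of Example 6.5 suggest) and that BLOCK distribution gives each processor one contiguous block of ceil(NV / NPROC) elements; the names `owner`, `count_remote_refs` and `NPROC`, and the extents, are hypothetical.

```c
#include <assert.h>

/* Hypothetical extents: NV elements, up to NE-1 indices per row, NPROC processors. */
enum { NV = 8, NE = 3, NPROC = 2 };

/* Assumed block distribution: contiguous blocks of ceil(NV / NPROC) elements. */
static int owner(int i) {
    int block = (NV + NPROC - 1) / NPROC;
    return i / block;
}

/* Count the references ME[i][j] made by iterations owned by `proc`
   whose target element is owned by another processor. */
static int count_remote_refs(int ME[NV][NE], int proc) {
    int i, j, n = 0;
    for (i = 0; i < NV; i++) {
        if (owner(i) != proc) {
            continue;  /* iteration i runs on another processor */
        }
        assert(ME[i][0] >= 0 && ME[i][0] < NE);
        for (j = 1; j <= ME[i][0]; j++) {
            if (owner(ME[i][j]) != proc) {
                n++;  /* this element would have to be fetched */
            }
        }
    }
    return n;
}
```

Any change to the entries of ME can change this set of remote elements, which is the situation in which the text requires RESET before the next PREFETCH.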
6.3. Overlapping computations with data exchange between processors.
6.3.1. Asynchronous updating of shadow edges
Updating the values of shadow edges is an indivisible (synchronous) exchange operation for an unnamed group of distributed arrays. The operation can be divided into two operations:
- starting the exchange,
- waiting for the values.
While waiting for the shadow edge values, other computations can be performed, in particular computations on the internal area of the local array section.
The following directives describe asynchronous update of shadow edges for named group of distributed arrays.
Declaration of a group.
| shadow-group-directive | ::= CREATE_SHADOW_GROUP shadow-group-name : renewee-list |
Start of shadow edges updating
| shadow-start-directive | ::= SHADOW_START shadow-group-name |
Waiting for shadow edges values.
| shadow-wait-directive | ::= SHADOW_WAIT shadow-group-name |
The SHADOW_START directive must be executed after the CREATE_SHADOW_GROUP directive. Once CREATE_SHADOW_GROUP has been executed, the SHADOW_START and SHADOW_WAIT directives can be executed many times. The updated values of the shadow edges may be used only after the SHADOW_WAIT directive.
A special case is the use of the SHADOW_START and SHADOW_WAIT directives in the shadow-renew-clause specification of a distributed loop.
| shadow-renew-clause | ::= . . . |
| | shadow-start-directive | |
| | shadow-wait-directive |
If the SHADOW_START directive is specified in a distributed loop, the values to be sent to the shadow edges are computed ahead of the rest. The shadow edge update is then started, and the internal area of the local array section is computed (see Fig. 6.3).
If the SHADOW_WAIT directive is specified in a distributed loop, the internal area of the local array section is computed ahead of the rest. Once the wait for the new shadow edge values has completed, the values to be sent to the shadow edges are computed.
Fig. 6.3. Scheme of array local section with shadow edges. (Figure not reproduced: it marks three regions of a local section, labelled "Shadow edges", "Sent values" and "Internal area".)
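The iteration order implied by SHADOW_START in a distributed loop can be sketched in plain C for two neighbouring one-dimensional sections. This is a sequential illustration under stated assumptions, not DVM code: each section has N local cells plus one shadow cell on each side, the exchange is modelled by immediate copies, and the names `compute` and `step_with_shadow_start` are hypothetical.

```c
#include <assert.h>

enum { N = 4 };  /* hypothetical number of local cells per section */

/* Each section has N + 2 cells: [0] and [N + 1] are shadow cells. */
static void compute(const float *in, float *out, int lo, int hi) {
    int i;
    for (i = lo; i <= hi; i++) {
        out[i] = 2.0f * in[i];  /* stand-in for the real loop body */
    }
}

/* One step for a left and a right section, ordered as the text
   describes for SHADOW_START specified in the loop. */
static void step_with_shadow_start(const float *in_l, float *out_l,
                                   const float *in_r, float *out_r) {
    /* 1. Compute first the values that the neighbour will import. */
    compute(in_l, out_l, N, N);
    compute(in_r, out_r, 1, 1);
    /* 2. Start the shadow edge update (modelled as immediate copies). */
    out_r[0] = out_l[N];
    out_l[N + 1] = out_r[1];
    /* 3. Compute the remaining cells while the exchange would be in flight. */
    compute(in_l, out_l, 1, N - 1);
    compute(in_r, out_r, 2, N);
}
```

Because only two sections are modelled, the outer ends (cell 1 of the left section and cell N of the right one) have no neighbour and are treated as part of the internal area.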
Example 6.6. Overlapping computations and shadow edges updating.
DVM(DISTRIBUTE [BLOCK][BLOCK]) float C[100][100];
DVM(ALIGN [I][J] WITH C[I][J] ) float A[100][100], B[100][100], D[100][100];
DVM(SHADOW_GROUP) void * AB;
. . .
DVM(CREATE_SHADOW_GROUP AB : A B );
. . .
DVM(SHADOW_START AB);
. . .
DVM(PARALLEL [I][J] ON C [I][J] ; SHADOW_WAIT AB )
DO( I , 1, 98, 1)
DO( J , 1, 98, 1)
{ C[I][J] = (A[I-1][J] + A[I+1][J] + A[I][J-1] + A[I][J+1] ) / 4.;
D[I][J] = (B[I-1][J] + B[I+1][J] + B[I][J-1] + B[I][J+1] ) / 4.;
}
The shadow edge width of the distributed arrays is 1 element in each dimension. Because the SHADOW_WAIT directive is specified in the distributed loop directive, the order of the loop iterations is changed. First, the internal area of each local array section is computed. Then the wait for the updated shadow edge values is performed. The loop completes by computing the values that are sent to the shadow edges.
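The reordering described for Example 6.6 can be illustrated with a one-dimensional analogue in plain C. The sketch assumes a local section of N cells with one shadow cell on each side and models SHADOW_WAIT by storing the neighbours' values into the shadow cells; `stencil_range`, `stencil_overlapped` and the parameters `left` and `right` are hypothetical names.

```c
#include <assert.h>

enum { N = 6 };  /* hypothetical local section size */

/* a has N + 2 cells: a[0] and a[N + 1] are shadow cells, a[1..N] are local. */
static void stencil_range(const float *a, float *c, int lo, int hi) {
    int i;
    for (i = lo; i <= hi; i++) {
        c[i] = (a[i - 1] + a[i + 1]) / 2.0f;
    }
}

/* Same loop, reordered as for SHADOW_WAIT in a distributed loop. */
static void stencil_overlapped(float *a, float *c, float left, float right) {
    /* 1. Internal area: cells 2..N-1 read no shadow cells. */
    stencil_range(a, c, 2, N - 1);
    /* 2. Wait for the shadow edges (modelled by storing arrived values). */
    a[0] = left;
    a[N + 1] = right;
    /* 3. Edge cells 1 and N, which read the shadow cells. */
    stencil_range(a, c, 1, 1);
    stencil_range(a, c, N, N);
}
```

The result is the same as running the loop in its natural order after a synchronous update; only the point at which the shadow values are needed has moved to the end.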
6.3.2. Asynchronous group reduction
A REDUCTION specification in a distributed loop is an indivisible (synchronous) operation for an unnamed reduction group. This operation can also be divided into two operations, so that other computations are performed while waiting for the reduction results.
The following directives describe asynchronous execution of named group of reduction operations.
Declaration of a group.
| reduction-group-directive | ::= CREATE_REDUCTION_GROUP reduction-group-name : reduction-op-string |
Starting reduction group execution.
| reduction-start-directive | ::= REDUCTION_START reduction-group-name |
Waiting for results of the group reduction
| reduction-wait-directive | ::= REDUCTION_WAIT reduction-group-name |
Constraints.
- The CREATE_REDUCTION_GROUP directive must be executed after initial values have been assigned to all reduction variables of the group and before the loop (loops) in which the reduction variable values are calculated.
- After the CREATE_REDUCTION_GROUP directive has been executed and before the REDUCTION_START directive is executed, the reduction variables of the group may be used only in reduction statements of distributed loops.
- The REDUCTION_START and REDUCTION_WAIT directives must be executed after completion of the loop (loops) in which the values of the reduction variables are calculated. The only statements allowed between these directives are those that do not use the values of the reduction variables.
- The REDUCTION_WAIT directive deletes the group of reduction operations.
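The ordering constraints above can be summarised as a small state model in plain C. It is a sketch under stated assumptions and is not the manual's Example 6.7: it models a group with a single SUM reduction variable, the processors are simulated by an array of partial sums, and the names `sum_group`, `group_create`, `group_accumulate`, `group_start`, `group_wait` and `NPROC` are hypothetical.

```c
#include <assert.h>

enum { NPROC = 4 };  /* hypothetical number of processors */

typedef enum { GROUP_NONE, GROUP_CREATED, GROUP_STARTED } group_state;

/* Hypothetical model of a named reduction group holding one SUM variable. */
typedef struct {
    group_state state;
    double initial;         /* value assigned before the group is created */
    double partial[NPROC];  /* per-processor contributions from the loop */
    double in_flight;       /* combined value, usable only after the wait */
} sum_group;

/* CREATE_REDUCTION_GROUP: the initial value must already be assigned. */
static void group_create(sum_group *g, double initial) {
    int p;
    g->state = GROUP_CREATED;
    g->initial = initial;
    g->in_flight = 0.0;
    for (p = 0; p < NPROC; p++) {
        g->partial[p] = 0.0;
    }
}

/* Reduction statement inside the distributed loop, on processor `proc`. */
static void group_accumulate(sum_group *g, int proc, double value) {
    assert(g->state == GROUP_CREATED);
    assert(proc >= 0 && proc < NPROC);
    g->partial[proc] += value;
}

/* REDUCTION_START: begin combining; the variable must not be read yet. */
static void group_start(sum_group *g) {
    int p;
    assert(g->state == GROUP_CREATED);
    g->in_flight = g->initial;
    for (p = 0; p < NPROC; p++) {
        g->in_flight += g->partial[p];
    }
    g->state = GROUP_STARTED;
}

/* REDUCTION_WAIT: deliver the result and delete the group. */
static double group_wait(sum_group *g) {
    assert(g->state == GROUP_STARTED);
    g->state = GROUP_NONE;
    return g->in_flight;
}
```

The assertions encode the constraints: accumulation is allowed only between creation and start, and the result becomes available only through `group_wait`, after which the group no longer exists and must be created again before reuse.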
Example 6.7. Asynchronous execution of reduction group.