INTEGER PX, PY, DESC(2)
CDVM$ ALIGN PY(I,J ) WITH PX(I,J)
CDVM$ DISTRIBUTE PX (BLOCK,BLOCK)
. . .
PX = ALLOCATE(DESC, ...)
PY = ALLOCATE(DESC, ...)
. . .
CDVM$ REDISTRIBUTE PX (BLOCK,*)
Suppose a sequence of alignments is specified by ALIGN directives:
P1 f1 P2 f2 . . . fN-1 PN
where fi is an aligning function and
Pi is a pointer to a dynamic array.
Then the dynamic arrays must be allocated (by executing the ALLOCATE function) in the reverse order, i.e.:
PN = ALLOCATE(...)
. . .
P2 = ALLOCATE(...)
P1 = ALLOCATE(...)
If a dynamic array pointer is an element of a pointer array, the dynamic array can be aligned only by a REALIGN directive. Because a REALIGN directive accepts only a reference to a pointer name, the pointer array element must first be assigned to a scalar pointer variable. The array with pointer PT(I) can be aligned with the array with pointer PT(J) by the following sequence of statements:
P1 = PT(I)
P2 = PT(J)
CDVM$ REALIGN P1(I,J) WITH P2(I+1,J)
4.4 DYNAMIC and NEW_VALUE directives
Arrays that are redistributed by REDISTRIBUTE or REALIGN directives must be specified in a DYNAMIC directive.
dynamic-directive | is DYNAMIC alignee-or-distributee-list |
alignee-or-distributee | is alignee |
or distributee |
If the arrays are assigned new values after a REDISTRIBUTE or REALIGN directive is executed, the additional (optimizing) directive NEW_VALUE must precede that directive.
new-value-directive | is NEW_VALUE |
The directive cancels the transfer of the old values of the redistributed arrays, because those values are about to be overwritten.
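The combination of DYNAMIC, NEW_VALUE and REDISTRIBUTE can be sketched as follows. This is a minimal illustration rather than an example from the original text: the program name REDIS, the array size N = 8 and the loop bodies are assumptions, and the placement of the DYNAMIC directive next to DISTRIBUTE in the specification part is assumed as well. The REDISTRIBUTE form follows the fragment shown earlier in this section.

```fortran
      PROGRAM REDIS
      INTEGER N
      PARAMETER (N = 8)
      REAL A(N,N)
      INTEGER I, J
C     A is redistributed later, so it is listed in DYNAMIC.
CDVM$ DISTRIBUTE A (BLOCK,BLOCK)
CDVM$ DYNAMIC A
C     First phase: A is filled under the (BLOCK,BLOCK) mapping.
CDVM$ PARALLEL (I,J) ON A(I,J)
      DO 20 I = 1, N
         DO 10 J = 1, N
            A(I,J) = 0.0
 10      CONTINUE
 20   CONTINUE
C     Every element of A is overwritten after redistribution,
C     so NEW_VALUE is placed before REDISTRIBUTE.
CDVM$ NEW_VALUE
CDVM$ REDISTRIBUTE A (BLOCK,*)
C     Second phase: A receives new values under (BLOCK,*).
CDVM$ PARALLEL (I,J) ON A(I,J)
      DO 40 I = 1, N
         DO 30 J = 1, N
            A(I,J) = REAL(I + J)
 30      CONTINUE
 40   CONTINUE
      END
```

Because every directive line starts with C in column 1, an ordinary fixed-form Fortran compiler treats the directives as comments and runs the program sequentially.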
If an array is specified in a DYNAMIC directive and has no DISTRIBUTE or ALIGN specification, its distribution is postponed until the first REDISTRIBUTE or REALIGN statement. This is necessary in two cases:
- distribution (alignment) of a dynamic array whose pointer is an element of a pointer array;
- distribution of an array on a section of a processor arrangement whose parameters are defined during computation.
4.5 Default distribution
If data are not specified in a DISTRIBUTE or ALIGN directive, they are replicated on every processor (full replication). The same distribution can be defined by a DISTRIBUTE directive with the format * for each dimension, but in that case access to the data will be less efficient.
5 Distribution of computations
5.1 Parallel loops
5.1.1 Parallel loop definition
The execution model of an FDVM program, like that of programs in other data parallel languages, is SPMD (single program, multiple data). The same program is loaded onto all processors, but each processor, following the owner-computes rule, performs only those assignment statements that modify variables located on that processor (its own variables).
Thus computations are distributed in accordance with the data mapping (data parallelism). For a replicated variable, the assignment statement is performed on all processors. For a distributed array, the assignment statement is performed only on the processor (or processors) where the corresponding array element is located.
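The owner-computes rule can be illustrated with a small serial model. This is plain Fortran, not DVM code. The sketch assumes that N elements are split over NP processors in contiguous blocks of size ceiling(N/NP), which is one common block layout and not necessarily the exact rule DVM applies. All names (OWNER, CNT, BS, LO, HI) are hypothetical.

```fortran
      PROGRAM OWNER
C     Serial model of the owner-computes rule (not DVM code).
C     Assumption: contiguous blocks of size CEILING(N/NP).
      INTEGER N, NP
      PARAMETER (N = 10, NP = 4)
      INTEGER A(N), CNT(NP)
      INTEGER I, P, BS, LO, HI
      BS = (N + NP - 1) / NP
      DO 10 I = 1, N
         A(I) = 0
 10   CONTINUE
C     Every "processor" P runs the same loop (SPMD) ...
      DO 30 P = 1, NP
         CNT(P) = 0
         LO = (P - 1) * BS + 1
         HI = MIN(P * BS, N)
         DO 20 I = 1, N
C           ... but assigns only the elements it owns.
            IF (I .GE. LO .AND. I .LE. HI) THEN
               A(I) = A(I) + 1
               CNT(P) = CNT(P) + 1
            END IF
 20      CONTINUE
 30   CONTINUE
      DO 40 P = 1, NP
         PRINT *, 'processor', P, ':', CNT(P), ' assignments'
 40   CONTINUE
      END
```

With N = 10 and NP = 4 the block size is 3, so the four model processors perform 3, 3, 3 and 1 assignments, and every element of A is assigned exactly once. Each processor still scans all N iterations to find its own elements; a real implementation would avoid this kind of filtering cost.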
Identifying "own" statements and skipping the "others" can cause significant overhead when a program executes. Therefore the distribution of computations may be specified only for loops that satisfy the following requirements:
- the loop is a tightly nested loop with a rectangular index space;
- distributed dimensions of arrays are indexed only by regular expressions of the form a*I + b, where I is a loop index;
- the left sides of the assignment statements of one loop iteration are allocated on the same processor, so the loop iteration is executed entirely on that processor;
- there are no data dependencies except reduction dependencies and regular dependencies along distributed dimensions;
- the left side of an assignment statement is a reference to a distributed array, a reduction variable, or a private variable (see section 5.1.3);
- there are no I/O statements or DVM directives inside the parallel loop body.
A loop that satisfies these requirements is called a parallel loop. The iteration variable of a sequential loop that surrounds a parallel loop or is nested in it can index only the local (replicated) dimensions of arrays.
5.1.2 Distribution of loop iterations. PARALLEL directive
A parallel loop is specified by the following directive:
parallel-directive | is PARALLEL ( do-variable-list ) |
iteration-align-spec | is align-target ( iteration-align-subscript-list ) |
iteration-align-subscript | is int-expr |
or do-variable-use | |
or * | |
do-variable-use | is [ primary-expr * ] do-variable [ add-op primary-expr ] |
The PARALLEL directive is placed before the loop header and distributes loop iterations in accordance with the distribution of an array or template. The semantics of the directive are similar to those of the ALIGN directive, with the index space of the distributed array replaced by the loop index space. The order of the loop indexes in do-variable-list corresponds to the order of the corresponding DO statements in the tightly nested loop.
The syntax and semantics of directive clauses are described in the following sections:
new-clause | section 5.1.3 |
reduction-clause | section 5.1.4 |
shadow-renew-clause | section 6.2.2 |
shadow-compute-clause | section 6.2.3 |
across-clause | section 6.2.4 |
remote-access-clause | section 6.3.1 |
Example 5.1. Distribution of loop iterations with regular computations.
REAL A(N,M), B(N,M+1), C(N,M), D(N,M)
CDVM$ ALIGN (I,J) WITH B(I,J+1) :: A, C, D
CDVM$ DISTRIBUTE B (BLOCK,BLOCK)
. . .
CDVM$ PARALLEL (I,J) ON B(I,J+1)
DO 10 I = 1, N
DO 10 J = 1, M-1
A(I,J) = D(I,J) + C(I,J)
B(I,J+1) = D(I,J) - C(I,J)
10 CONTINUE
The loop satisfies all the requirements of a parallel loop. In particular, the left sides of the assignment statements of one loop iteration, A(I,J) and B(I,J+1), are allocated on the same processor through the alignment of arrays A and B.
If the left sides of the assignment statements are located on different processors (a distributed loop iteration), the loop must be split into several loops.
Example 5.2. Splitting the loop
Original loop:
DO 10 I = 1, N
DO 10 J = 1, M-1
A(2*I) = . . .
B(3*I) = . . .
10 CONTINUE
Split loops:
CDVM$ PARALLEL ( I ) ON A( 2*I )
DO 10 I = 1, N
10 A(2*I) = . . .
CDVM$ PARALLEL ( I ) ON B( 3*I )
DO 11 I = 1, N
11 B(3*I) = . . .
The loop is split into two loops, each of which satisfies the requirements of a parallel loop.
5.1.3 Private variables. NEW clause
If the use of a variable is localized within one loop iteration, the variable must be specified in a NEW clause:
new-clause | is NEW ( new-variable-list ) |
new-variable | is array-name |
or scalar-variable-name |
Distributed arrays cannot be used as NEW variables (private variables). The value of a private variable is undefined at the beginning of a loop iteration and is not used after the iteration; therefore each loop iteration can use its own copy of the private variable.
Example 5.3. Specification of private variable.
CDVM$ PARALLEL (I,J) ON A(I,J) , NEW ( X )
DO 10 I = 1, N
DO 10 J = 1, N
X = B(I,J) + C(I,J)
A(I,J) = X
10 CONTINUE
5.1.4 Reduction operations and variables. REDUCTION clause
Programs often contain loops with so-called reduction operations: array elements are accumulated in some variable, or their minimum or maximum value is determined. Iterations of such loops can also be distributed if the REDUCTION clause is used.
reduction-clause | is REDUCTION |
reduction-op | is reduction-op-name ( reduction-variable ) | |
or reduction-loc-name ( reduction-variable , | ||
location-variable | is array-name |
reduction-variable | is array-name |
or scalar-variable-name |
reduction-op-name | is SUM |
or PRODUCT | |
or MAX | |
or MIN | |
or AND | |
or OR | |
or EQV | |
or NEQV |
reduction-loc-name | is MAXLOC |
or MINLOC |
Distributed arrays may not be used as reduction variables. Reduction variables are calculated and used only inside the loop in statements of a certain type: the reduction statements.
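A sum and a maximum over a distributed array might be written as in the following sketch. The clause form REDUCTION ( op-name(variable), ... ) with a parenthesized, comma-separated list is an assumption here, as are the program name REDUCE, the size N = 100 and the initial values. The operation names SUM and MAX come from the list above.

```fortran
      PROGRAM REDUCE
      INTEGER N
      PARAMETER (N = 100)
      REAL A(N), S, X
      INTEGER I
CDVM$ DISTRIBUTE A (BLOCK)
C     Fill the distributed array with the values 1..N.
CDVM$ PARALLEL (I) ON A(I)
      DO 10 I = 1, N
         A(I) = REAL(I)
 10   CONTINUE
C     S and X are replicated scalars, not distributed arrays.
C     X starts at 0.0 because all elements of A are positive.
      S = 0.0
      X = 0.0
C     Assumed clause form: REDUCTION ( op-name(variable), ... ).
CDVM$ PARALLEL (I) ON A(I), REDUCTION (SUM(S), MAX(X))
      DO 20 I = 1, N
         S = S + A(I)
         X = MAX(X, A(I))
 20   CONTINUE
      PRINT *, S, X
      END
```

Compiled as ordinary fixed-form Fortran, the directive lines are comments and the loop runs sequentially, giving a sum of 5050 and a maximum of 100. The two statements in the second loop body use S and X only in the accumulation patterns that this section goes on to define as reduction statements.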
Let us introduce some notation.
rv | - a reduction variable |
L | - one-dimensional integer array |
n | - the number of minimum or maximum coordinates |
er | - an expression that does not contain rv |
Ik | - integer variable |
op | - one of the following Fortran operations: +, -, .OR., .AND., .EQV., .NEQV. |
ol | - one of the following Fortran operations: .GE., .GT., .LE., .LT. |
f | - MAX or MIN function |
A reduction statement in the loop body is a statement of one of the following forms:
1) rv = rv op er
rv = er op rv
2) rv = f( rv, er )
rv = f( er, rv )
3) if( rv ol er ) rv = er
if( er ol rv ) rv = er
4) if( rv ol er ) then