cdvmLDe (1158334), страница 3
Текст из файла (страница 3)
Here:
elem_type | - element type, |
arrid | - name (identifier) of the array, |
dimi | - size of i-th dimension of the array. |
-
Array destruction is performed using free(arrid) statement.
-
Reference to the array element in the program text is denoted as arrid[ind1]...[indn].
The C-DVM compiler translates these constructions to correct descriptions and Run-Time Library calls. But to translate the program as sequential by ordinary C compiler, the sizes of all the array dimensions dimi (except the first one) must be constants. It is proposed to redefine temporarily them as constants (#define dimi const …#undef dimi). The converter removes these redefinitions, more precisely, moves them to the file beginning. Of course, actual ("calculated" in variables) dimensions in case of sequential execution must be equal to selected constant values (see example of Red-black SOR program in Appendix 1).
Example of declaration.
DVM(DISTRIBUTE [BLOCK][BLOCK]) float (*C)[N];
/* two-dimensional distributed array */
4.2.3Simulation of multidimensional arrays via one-dimensional ones
To operate with multidimensional dynamic arrays in C language other method is used. Memory area (one-dimensional) of required size is created and multidimensional array is simulated by macros arrid(ind1,...,indn) similar to Fortran notation. The macros calculates position of element with specified indexes in the memory area, assigned to the array.
C-DVM converter supports this method also. The array is
-
described as pointer (to scalars);
-
created by arrid=malloc(dim1*dim2...*sizeof(elem_type)) statement,
-
deleted by free(arrid) statement,
-
references to array elements in form arrid(ind1,...,indn) are recognized by the converter and replaced by corresponding construction of RTL DVM.
Example of array declaration (one-dimensional for sequential execution and two-dimensional for parallel execution) and references to its elements:
int Cdim2; /* The second dimension of array C */
DVM (DISTRIBUTE [BLOCK][]) float *C;
#define C(x,y) C[(x)*Cdim2+(y)]
. . .
C(i,j) = (C(i,j-1)+C(i,j+1))/2;
4.3Distributing by aligning
Aligning array A with distributed array B brings in accordance to each element of A an element or a section of array B. When distributing array B the array A will be distributed simultaneously. If element of B is mapped on the processor, then the element of A, corresponding to element B via alignment, will be also mapped on the same processor.
Method of mapping via alignment performs the following two functions.
-
The same distribution of the arrays of the same shape on one processor arrangement does not always guarantee allocation of corresponding elements on the same processor. It forces to specify remote access (see section 6) where it is possible not exist. Only alignment of corresponding elements of the arrays guarantees their allocation on the same processor.
-
Several arrays can be aligned with the same array. Redistribution of one array by REDISTRIBUTE directive will cause corresponding redistribution of the array group.
4.3.1ALIGN and REALIGN directives
The following directives describe array aligning:
align-directive | ::= ALIGN [ align-directive-stuff ] | |
realign-directive | ::= REALIGN alignee align-directive-stuff | |
align-directive-stuff | ::= align-source... align-with-clause | |
alignee | ::= array-name | |
align-source | ::= [] | |
| [ align-dummy ] | ||
align-dummy | ::= scalar-int-variable | |
align-with-clause | ::= WITH align-spec | |
align-spec | ::= align-target [ align-subscript ]… | |
align-target | ::= array-name | |
| template-name | ||
align-subscript | ::= [ int-expr ] | |
| [ align-subscript-use ] | ||
| [] | ||
align-subscript-use | ::= [ primary-expr * ] align-dummy [ add-op primary-expr ] | |
primary-expr | ::= int-constant | |
| int-variable | ||
| ( int-expr ) | ||
add-op | ::= + | |
| - |
Constraints:
-
A length of the align-source… must be equal to the rank of aligned array.
-
A length of the align-subscript… must be equal to the rank of base array align-target.
-
align-directive-stuff can be omitted in ALIGN directive only. In this case distributed array can be used only after execution of REALIGN directive.
Let the alignment of two arrays is specified by the directive
DVM(ALIGN [d1]…[dn] WITH B[ard1]…[ardm]) float A[100]…[100];
where:
di - specification of i-th dimension of aligned arrayA,
ardj - specification of j-th dimension of base array B.
If di is specified by integer variable I, one and only one dimension of array B, specified by linear function ardj = a*I + b must exist. Therefore, the number of array A dimensions, specified by identifiers (align-dummy) must be equal to the number of array B dimensions, specified by the linear function.
Let i-th dimension of array A has the size Ai, and j-th dimension of array B, specified by linear function a*I + b, has the size Bj. Since the parameter I is defined on the value set 0…Ai-1, the following conditions must be satisfied:
b 0; a*(Ai –1)+ b < Bj
If di is empty, the i-th dimension of array A will be local on each processor independently from array B distribution (it is analogue of local dimension in DISTRIBUTE directive).
If ardi is empty, the array A will be replicated along the j-th dimension of the array B (it is analogue of partial replication on the processor arrangement).
If ardi = int-expr, the array A is aligned with the section of the array B.
Example 4.7. Aligning arrays
DVM(DISTRIBUTE [BLOCK][BLOCK]) float B[10][10];
DVM(DISTRIBUTE [BLOCK]) float D[20];
/* aligning the vector with the first line of B) */
DVM(ALIGN [i] WITH B[1][i] ) float A[10];
/* replication of the vector aligning it with each line */
DVM(ALIGN [i] WITH B[][i]) float F[10];
/* collaps: each matrix column corresponds to the vector element */
DVM(ALIGN [][i] WITH D[i]) float C[20][20];
/* alignment using stretching */
DVM(ALIGN [i] WITH D[2*i]) float E[10];
/* alignment using rotation and stretching */
DVM(ALIGN [i][j] WITH C[2*j][2*i] ) float H[10,10];
Let the sequence of alignments A f1 B f2 C, be specified; f2 be the alignment of the array B with the array C, and f1 be the alignment of the array A with the array B. The arrays A and B are considered as aligned with array C by definition. The array B is aligned by function f2 directly and array A is aligned by composite function f1(f2) indirectly. Therefore applying REALIGN directive to the array B does not cause redistribution of array A.
Generally a set of ALIGN specifications is a set of trees. At that every tree root must be distributed by DISTRIBUTE or REDISTRIBUTE directives. When REDISTRIBUTE directive is executed, the whole alignment tree is redistributed.
4.3.2TEMPLATE and CREATE_TEMPLATE directives
If values of linear function a*I + b are beyond base array dimension, it is necessary to define a dummy array - referred to as an alignment template – using the following directive.
template-directive | ::= TEMPLATE [ size ] |
Then it is necessary to align both arrays with the template. The template is defined using DISTRIBUTE and REDISTRIBUTE directives. The template elements don't require real memory. They specify the processor the elements of aligned arrays must be mapped on.
Consider the following example.
Example 4.8. Aligning with template.
DVM(DISTRIBUTE [BLOCK]; TEMPLATE[102]) void * TABC;
DVM(ALIGN B[i] WITH TABC[i]) float B[100];
DVM(ALIGN A[i] WITH TABC[i+1]) float A[100];
DVM(ALIGN C[i] WITH TABC[i+2]) float C[100];
. . .
DO(i ,1, 98,1)
A[i] = C[i-1] + B[i+1];
If the sizes of template (and of arrays) are unknown statically the following executable directive should be used:
create-template-directive | ::= CREATE_TEMPLATE template_name size… |
Example 4.9. Aligning dynamic arrays with template.
DVM(DISTRIBUTE [BLOCK]; TEMPLATE ) void *TABC;
DVM(ALIGN B[i] WITH TABC[i]) float *B;
DVM(ALIGN A[i] WITH TABC[i+1]) float *A;
DVM(ALIGN C[i] WITH TABC[i+2]) float *C;
int N;
. . .
N=…;
DVM(CREATE_TEMPLATE TABC[N]);
A=malloc(N*sizeof(float));
B=malloc(N*sizeof(float));
C=malloc(N*sizeof(float));
. . .
DO(i ,1, N-2,1)
A[i] = C[i-1] + B[i+1];
4.4Undistributed data
If DISTRIBUTE or ALIGN directive is not specified for the data, then the data are allocated on each processor (full replication). The same distribution can be specified by DISTRIBUTE directive with ‘[]’ format for each dimension. But in this case access to the data will be less effective.
5Distribution of computations
5.1Parallel loops
5.1.1Parallel loop definition
The execution model of C-DVM program as the programs in other languages with data parallelism is SPMD (single program, multiple data). All the processors are loaded by the same program, but each processor performs only those assignment statements that modify values of the variables located on this processor (own variables) according to the rule of own computations.
Thus the computations are distributed in accordance with data mapping (data parallelism). In the case of a replicated variable, the assignment statement is performed at all the processors. In the case of a distributed array, the assignment statement is performed only at the processor (or the processors) where corresponding array element is located.
Identification of "own" statements and missing "others" can cause essential overhead during the program execution. Therefore a specification of distributed computations is permitted only for loops, satisfying the following requirements.
-
the loop is tightly nested one with rectangular index space (the exeption see in Example 4 in Appendix 1);
-
distributed dimensions of the arrays are indexed only by linear expressions of the form a*I + b , where I - is loop index.
-
left sides of assignment statements of one loop iteration are allocated at the same processor and therefore the loop iteration is executed on the processor entirelly.
-
there are no data dependencies except reduction dependence and regular dependence along distributed dimensions;
-
left side of an assignment statement is a referense to distributed array, reduction variable or private variable (see section 5.1.3);
-
the loop doesn't contain I/O statements and DVM-directives.
A loop, satisfying these requirements is named parallel loop. An iteration variable of sequential loop, surrounding the parallel loop, or nested in the loop, can indexise local (replicated) dimensions of distributed arrays only.
5.1.2Distribution of loop iterations. PARALLEL directive
A parallel loop is specified by the following directive:
parallel-directive | ::= PARALLEL loop-variable... |
iteration-align-spec | ::= align-target iteration-align-subscript-string |
iteration-align-subscript | ::= [ int-expr ] |
| [ do-variable-use ] | |
| [] | |
loop-variable | ::= [ do-variable ] |
do-variable-use | ::= [ primary-expr * ] do-variable [ add-op primary-expr ] |
Note. Headers of nested loop should be written with macros DO(var,first,last,step) and FOR(var,times) (shorthand for DO(var,0,times-1,1)).