* creating abstract machine representation
iamv = crtamv(am,1,M,...)
* inquiring references to elements of abstract machine representation
DO lab idvm00 = 1,M
lab TA(2,idvm00) = getamr(iamv,idvm00-1)
4.2.4 REMOTE_GROUP and REDUCTION_GROUP directives
The appearance of a name in a REMOTE_GROUP directive declares it to be the name of a remote reference group. The directive
*DVM$ REMOTE_GROUP RMG
is replaced by the statement
INTEGER RMG(3)
In addition, if a PREFETCH directive occurs in the procedure, the following statements initializing the integer array RMG are generated and inserted at the beginning of the executable part of the procedure:
RMG(1) = 0
RMG(2) = 0
RMG(3) = 0
The appearance of a name in a REDUCTION_GROUP directive declares it to be the name of a reduction group. The directive
*DVM$ REDUCTION_GROUP REDG
is replaced by the statement
INTEGER REDG
The following statement, which assigns zero to the integer variable REDG, is generated and inserted before the first executable statement of the procedure:
REDG = 0
4.3 Translating executable directives and statements
4.3.1 PARALLEL directive
The regular parallel loop:
*DVM$ PARALLEL (I1, ..., In) ON A(…)...
DO label I1 = ...
. . .
DO label In = ...
loop-body
label CONTINUE
is translated into
[ ACROSS-block-1 ]
[ REDUCTION-block-1 ]
* creating parallel loop
ipl = crtpl(n)
[ SHADOW-RENEW-block-1 ]
[ SHADOW-START-block ]
[ SHADOW-WAIT-block ]
* mapping parallel loop
it = mappl(ipl,A,...)
[ SHADOW-RENEW-block-2 ]
[ REDUCTION-block-2 ]
[ REMOTE-ACCESS-block ]
* inquiry of continuation of parallel loop execution
lab1 if(dopl(ipl) .eq. 0) go to lab2
DO label I1 = ...
. . .
DO label In = ...
loop-body
label CONTINUE
go to lab1
* terminating parallel loop
lab2 it = endpl(ipl)
[ ACROSS-block-2 ]
[ REDUCTION-block-3 ]
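The control flow of the translated loop (the lab1/lab2 skeleton around the original DO nest) can be illustrated with a small Python model. This is only a sketch, not Lib-DVM: the class `ParallelLoop`, its `portions` argument, and the idea that each successful dopl( ) call supplies the bounds of the next portion of local iterations are assumptions made for illustration.

```python
class ParallelLoop:
    """Toy stand-in for the loop object created by crtpl()/mappl()."""

    def __init__(self, portions):
        # portions: (first, last) index ranges assigned to this processor.
        # In reality the run-time system decides them (assumption here).
        self._portions = list(portions)
        self.first = None
        self.last = None

    def dopl(self):
        """Return 1 and set the next bounds, or 0 when nothing is left."""
        if not self._portions:
            return 0
        self.first, self.last = self._portions.pop(0)
        return 1


def run(loop, body):
    # lab1: if (dopl(ipl) .eq. 0) go to lab2
    while loop.dopl() != 0:
        # DO label I1 = ...  (bounds taken from the loop object)
        for i in range(loop.first, loop.last + 1):
            body(i)
        # go to lab1
    # lab2: it = endpl(ipl) would follow here


visited = []
run(ParallelLoop([(1, 3), (7, 8)]), visited.append)
print(visited)  # [1, 2, 3, 7, 8]
```

The point of the model is that the original DO nest is kept intact and is simply re-entered for as long as the run-time system reports more local work.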
If the REDUCTION clause appears in a PARALLEL directive, the REDUCTION‑block‑1, REDUCTION-block-2, and REDUCTION-block-3 are generated.
REDUCTION‑block‑1:
Case of synchronous REDUCTION specification:
* creating reduction group
irg = crtrg(1,1)
Case of asynchronous REDUCTION specification (with group name RDG):
if(RDG.EQ.0) THEN
* creating reduction group
RDG = crtrg(1,1)
END IF
{
* creating reduction
irv = crtrdf(reduction-function, reduction-var,...)
}... for each reduction in reduction-list
REDUCTION‑block‑2:
{
* including reduction in reduction group
it = insred(irg,irv,ipl,0)
}... for each reduction in reduction-list
REDUCTION‑block‑3:
* starting reduction group
it = strtrd(irg)
* waiting for completion of reduction group
it = waitrd(irg)
* deleting reduction group
it = delobj(irg)
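The life cycle of a reduction group described by the three blocks (create, include reductions, start, wait, delete) can be sketched in Python. The class `ReductionGroup` and the representation of each reduction as a list of per-processor partial values are hypothetical; the method names merely echo the Lib-DVM calls above and do not reproduce their signatures.

```python
from functools import reduce


class ReductionGroup:
    """Toy model of a reduction group (crtrg / insred / strtrd / waitrd)."""

    def __init__(self):
        self.members = []    # (function, per-processor partial values)
        self._pending = None

    def insred(self, func, partials):
        # REDUCTION-block-2: include one reduction in the group
        self.members.append((func, list(partials)))

    def strtrd(self):
        # REDUCTION-block-3: start the group; modeled as an eager combine
        self._pending = [reduce(f, p) for f, p in self.members]

    def waitrd(self):
        # REDUCTION-block-3: wait for completion and return the results
        return self._pending


group = ReductionGroup()                        # crtrg
group.insred(lambda a, b: a + b, [6, 15, 24])   # SUM over 3 processors
group.insred(max, [3, 9, 4])                    # MAX over 3 processors
group.strtrd()
results = group.waitrd()
print(results)  # [45, 9]
```

In the asynchronous case the same steps apply, except that starting and waiting are triggered by separate directives rather than immediately after the loop.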
If the SHADOW_RENEW clause appears in a PARALLEL directive, the SHADOW‑RENEW‑block‑1 and SHADOW‑RENEW-block-2 are inserted.
SHADOW‑RENEW‑block‑1:
* creating shadow edge group
ishg = crtshg(...)
{
* including shadow edge in the group
it = inssh(ishg,array-header,...)
}... for each array in renewee-list
* starting shadow edge group renewing
it = strtsh(ishg)
SHADOW‑RENEW‑block‑2:
* waiting for completion of shadow edge group renewing
it = waitsh(ishg)
The SHADOW_START and SHADOW_WAIT specifications change the order in which the parallel loop iterations are executed.
SHADOW‑START‑block:
it = exfrst(shadow-group-name)
The function exfrst( ) sets the following order of execution of the parallel loop iterations:
- the exported elements (original elements) of the local parts of the distributed arrays are computed;
- renewing of the shadow edge group is started;
- the internal elements of the local parts of the distributed arrays are computed.
SHADOW‑WAIT‑block:
it = imlast(shadow-group-name)
The function imlast( ) sets the following order of execution of the parallel loop iterations:
- the internal elements of the local parts of the distributed arrays are computed;
- the run-time system waits for completion of the shadow edge renewing;
- the exported elements (original elements) of the local parts of the distributed arrays are computed.
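The two orderings set by exfrst( ) and imlast( ) can be compared with a short Python sketch. It assumes a one-dimensional local part and treats as "exported" every element lying within the shadow width of either end of the local part; both the function names and this one-dimensional simplification are illustrative assumptions.

```python
def split_local_part(first, last, width):
    """Split local indices first..last into exported and internal ones."""
    indices = list(range(first, last + 1))
    exported = [i for i in indices if i < first + width or i > last - width]
    internal = [i for i in indices if i not in exported]
    return exported, internal


def schedule(first, last, width, mode):
    """Return the ordered steps for one processor (toy model)."""
    exported, internal = split_local_part(first, last, width)
    if mode == "exfrst":
        return [("compute", exported),
                ("start renew", None),
                ("compute", internal)]
    if mode == "imlast":
        return [("compute", internal),
                ("wait renew", None),
                ("compute", exported)]
    raise ValueError(mode)


ex = schedule(1, 8, 1, "exfrst")
im = schedule(1, 8, 1, "imlast")
print(ex)
print(im)
```

With a local part 1..8 and width 1, the exfrst order computes elements 1 and 8 first so that the exchange can overlap with the computation of elements 2..7; the imlast order does the reverse and waits for the exchange in between.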
If the ACROSS clause appears in a PARALLEL directive, the ACROSS‑block‑1 and ACROSS-block-2 are inserted.
ACROSS‑block‑1:
* creating shadow edge group
ishg = crtshg(0)
{
* including shadow edge(for anti-dependences) in the group
it = inssh(ishg,array-header,…)
}... for each array in dependent-array-list
* starting shadow edge group renewing
it = strtsh(ishg)
* waiting for completion of shadow edge group renewing
it = waitsh(ishg)
* deleting shadow edge group
it = delobj(ishg)
. . .
* creating shadow edge group
ishg = crtshg(0)
{
* including shadow edge(for flow-dependences) in the group
it = insshd(ishg,array-header,…)
}... for each array in dependent-array-list
* initializing import of shadow edges
it = recvsh(ishg)
* waiting for completion of shadow edge group renewing
it = waitsh(ishg)
ACROSS‑block‑2:
* initializing export of shadow edges
it = sendsh(ishg)
* waiting for completion of shadow edge group renewing
it = waitsh(ishg)
* deleting shadow edge group
it = delobj(ishg)
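The flow-dependence part of the ACROSS scheme (import shadow edges before the loop, export them after it) behaves like a pipeline across processors. The Python sketch below models this for a loop of the form a(i) = a(i-1) + b(i) over a block-distributed array. The function `across_prefix`, the block layout, and the sequential emulation of the processors are assumptions for illustration, and the anti-dependence exchange of ACROSS-block-1 is not modeled.

```python
def across_prefix(blocks, start=0):
    """blocks: per-processor local parts of b; returns local parts of a."""
    shadow = start                # left shadow element seen by processor 0
    out = []
    for local in blocks:          # processors taken in pipeline order
        left = shadow             # recvsh + waitsh: import the shadow edge
        result = []
        for x in local:           # loop body: a(i) = a(i-1) + b(i)
            left = left + x
            result.append(left)
        shadow = left             # sendsh + waitsh: export the boundary value
        out.append(result)
    return out


parts = across_prefix([[1, 2], [3, 4], [5]])
print(parts)  # [[1, 3], [6, 10], [15]]
```

Each processor can start only after its predecessor has exported its boundary element, which is why the import is placed before the loop and the export after it.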
The REMOTE-ACCESS-block is generated if the REMOTE_ACCESS clause appears in a PARALLEL directive.
REMOTE-ACCESS-block:
Case of synchronous REMOTE_ACCESS specification:
{
* creating buffer array
it = crtrbl(array-header,buffer-header,…)
* starting load of buffer array
it = loadrb(buffer-header,0)
* waiting for completion of loading buffer array
it = waitrb(buffer-header)
* correcting coefficient CNB of buffer array elements addressing,
* where NB is rank of buffer array
buffer-header(NB+2) = buffer-header(NB+1)-
* buffer-header(NB)*buffer-header(NB+3) … -
* buffer-header(3)*buffer-header(2*NB+2)
}... for each remote-access reference
Case of asynchronous REMOTE_ACCESS specification (with group RMG):
IF (RMG(2) .EQ. 0) THEN
{
* creating buffer array
it = crtrbl(array-header,buffer-header,…)
* correcting coefficient CNB of buffer array elements addressing
buffer-header(NB+2) = buffer-header(NB+1)-
* buffer-header(NB)*buffer-header(NB+3) … -
* buffer-header(3)*buffer-header(2*NB+2)
* starting load of buffer array
it = loadrb(buffer-header,0)
* waiting for completion of loading buffer array
it = waitrb(buffer-header)
* including buffer array in group RMG
it = insrb(RMG(1),buffer-header)
}... for each remote-access reference
ELSE
IF (RMG(3) .EQ. 1) THEN
* waiting for completion of loading all the buffer arrays of group
it = waitbg(RMG(1))
RMG(3) = 0
ENDIF
ENDIF
4.3.2 PREFETCH and RESET directives
The directive
*DVM$ PREFETCH RMG
is translated into the following sequence of statements:
IF (RMG(1) .EQ. 0) THEN
RMG(1) = crtbg(0,1)
RMG(2) = 0
ELSE
it = loadbg(RMG(1),1)
RMG(2) = 1
RMG(3) = 1
ENDIF
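The interplay between the PREFETCH translation and the asynchronous REMOTE-ACCESS-block of section 4.3.1 is easier to follow as a small state machine over the three elements of RMG. The Python model below is a sketch: the class `RemoteGroup`, the log strings, and the use of 1 as a stand-in for the handle returned by crtbg( ) are illustrative assumptions.

```python
class RemoteGroup:
    """Toy model of INTEGER RMG(3); rmg[0..2] stand for RMG(1..3)."""

    def __init__(self):
        self.rmg = [0, 0, 0]
        self.log = []

    def prefetch(self):
        if self.rmg[0] == 0:
            self.rmg[0] = 1          # placeholder for the crtbg(0,1) handle
            self.rmg[1] = 0
            self.log.append("crtbg")
        else:
            self.log.append("loadbg")
            self.rmg[1] = 1
            self.rmg[2] = 1

    def remote_access(self):
        if self.rmg[1] == 0:
            # buffers are created, loaded, and included in the group
            self.log.append("crtrbl+loadrb+waitrb+insrb")
        elif self.rmg[2] == 1:
            self.log.append("waitbg")
            self.rmg[2] = 0
        else:
            self.log.append("nothing")   # buffers are already loaded


g = RemoteGroup()
g.prefetch()        # first PREFETCH: only creates the group
g.remote_access()   # first loop: fills the group with buffers
g.prefetch()        # second PREFETCH: starts loading the whole group
g.remote_access()   # next loop: waits for the group load
g.remote_access()   # further loop without PREFETCH: no action
print(g.log)
```

Read this way, RMG(1) holds the group handle, RMG(2) records whether the buffers have already been created, and RMG(3) records whether a group load is still outstanding.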
The directive
*DVM$ RESET RMG
is replaced by the statement
it = delobj(RMG(1))
4.3.3 MAP directive
The MAP directive specifies that a task is to be mapped onto a section of a processor arrangement. The directive
*DVM$ MAP TA(J) ONTO P(l1:u1,…,lr:ur)
is replaced by
* creating processor subsystem
TA(2,J) = crtps(P,l,u,0)
* mapping abstract machine
it = mapam(TA(2,J),TA(1,J))
where l is the vector (lr - 1, …, l1 - 1) and
u is the vector (ur - 1, …, u1 - 1).
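The construction of the vectors l and u (reversed dimension order, bounds shifted to start from zero) can be checked with a few lines of Python. The function name `section_vectors` and the list-of-pairs input format are hypothetical.

```python
def section_vectors(bounds):
    """bounds: [(l1, u1), ..., (lr, ur)] as written in the MAP directive.

    Returns (l, u) with the dimensions reversed and each bound reduced
    by one, as in the vectors passed to crtps().
    """
    l = [lo - 1 for lo, _ in reversed(bounds)]
    u = [hi - 1 for _, hi in reversed(bounds)]
    return l, u


# Section P(2:3, 1:4)
l, u = section_vectors([(2, 3), (1, 4)])
print(l, u)  # [0, 1] [3, 2]
```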
4.3.4 TASK_REGION construct
The TASK_REGION construct
*DVM$ TASK_REGION TA
*DVM$ ON TA(1)
block-1
*DVM$ END ON
. . .
*DVM$ ON TA(M)
block-M
*DVM$ END ON
*DVM$ END TASK_REGION
is translated into
IF (runam(TA(2,1)).EQ.0) GO TO lab1
block-1
it = stopam()
lab1 CONTINUE
. . .
labM-1 IF (runam(TA(2,M)).EQ.0) GO TO labM
block-M
it = stopam()
labM CONTINUE
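The effect of the runam( )/stopam( ) pairs can be modeled in Python as follows. The sketch assumes that runam( ) returns a nonzero value exactly when the current processor belongs to the processor subsystem onto which the task was mapped. The function `run_task_region`, the `owners` table, and the processor numbers are illustrative.

```python
def run_task_region(blocks, owners, me):
    """Execute, on processor `me`, only the blocks of its own tasks.

    blocks: list of callables (block-1 ... block-M)
    owners: dict mapping task number k to the set of processors of TA(k)
    """
    executed = []
    for k, block in enumerate(blocks, start=1):
        # IF (runam(TA(2,k)).EQ.0) GO TO lab_k
        if me not in owners[k]:
            continue
        executed.append(block())
        # it = stopam()
    return executed


blocks = [lambda: "block-1", lambda: "block-2"]
owners = {1: {0, 1}, 2: {2, 3}}
on_p0 = run_task_region(blocks, owners, me=0)
on_p2 = run_task_region(blocks, owners, me=2)
print(on_p0, on_p2)  # ['block-1'] ['block-2']
```

Every processor walks through the whole region, but each block is skipped by the processors that do not own the corresponding task, so the blocks run concurrently on disjoint processor subsystems.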
4.3.5 Parallel-task-loop construct
The parallel-task-loop construct
*DVM$ TASK_REGION TA
*DVM$ PARALLEL (I) ON TA(I)
DO label I = ...
loop-body
label CONTINUE
*DVM$ END TASK_REGION
is translated into
DO label I = ...
IF (runam(TA(2,I)).EQ.0) GO TO label
loop-body
it = stopam()
label CONTINUE
4.3.6 The other FDVM directives
The REDUCTION_START directive is replaced by the statement:
* starting reduction group
it = strtrd(reduction-group-var)
The REDUCTION_WAIT directive is replaced by the statements:
* waiting for completion of reduction group
it = waitrd(reduction-group-var)
* deleting reduction group
it = delobj(reduction-group-var)
The SHADOW_GROUP directive is translated into the following code:
* creating shadow edge group
shadow-group-var = crtshg(...)
{
* including shadow edge in the group
it = inssh(shadow-group-var,array-header,...)
}... for each array in renewee-list
The SHADOW_START directive is replaced by the statement:
* starting shadow edge group renewing
it = strtsh(shadow-group-var)
The SHADOW_WAIT directive is replaced by the statement:
* waiting for completion of shadow edge group renewing
it = waitsh(shadow-group-var)
The NEW_VALUE directive affects the translation of the next directive (REDISTRIBUTE or REALIGN) and does not require generating new statements. The REDISTRIBUTE and REALIGN directives are implemented by the redis( ) and realn( ) functions, respectively. The NewSign flag is set to 1 for the variables listed in the NEW_VALUE directive.
4.3.7 Debug directives
The DEBUG and ENDDEBUG directives are not executable and do not require generating new statements. They delimit the program fragment for which the user wants to obtain information about program execution. These directives cause the compilation mode to be reset. The compilation mode depends on the debug level, which may be specified for each fragment in the compiler run command.
The TRACE ON (TRACE OFF) directive switches tracing of program execution on (off) and is implemented by the tron( ) (troff( )) function of Lib-DVM.
When the debugging mode of compilation is specified by the user (option -d), the FDVM compiler processes the beginning and end of parallel and sequential loops and generates the Debugger function calls dbegpl( ), dbegsl( ), and dendl( ). The compiler must also provide tracing of data accesses, loop iteration starts, and task starts; the corresponding Debugger function calls are inserted into the program.
The INTERVAL and END INTERVAL directives describe intervals of program execution for which the user wants to obtain performance characteristics. The compiler inserts Performance Analyzer calls at the beginning and the end of the interval:
call binter(...)
. . .
call einter(...)
When the performance analysis mode of compilation is specified by the user (option -e), the FDVM compiler processes the beginning and end of parallel and sequential loops and generates the Performance Analyzer function calls bploop( ), bsloop( ), and eloop( ).
4.3.8 Input/Output Statements
In the DVM model, input, output, and other operations on external files are executed by a single processor (the I/O processor), which is determined by the run-time system. I/O of a replicated variable uses the copy of the variable allocated on the I/O processor. I/O of a distributed array uses a buffer array allocated on the I/O processor. Input data are sent to all other processors owning the variables of the input list. When a distributed array is output, data are transferred into the buffer from the other processors owning elements of the array.
The FDVM compiler replaces each I/O statement by a logical IF statement:
IF(tstio().NE.0 ) I/O-statement
except for I/O statements on an internal file, which remain unchanged. The function tstio( ) returns 1 if the current processor is the I/O processor.
Moreover, for a READ statement and for I/O statements with the IOSTAT parameter, the compiler generates srmem( ) function calls to send memory areas of the I/O processor to the other processors.
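The guarded I/O scheme for a replicated variable can be mimicked in Python: only the I/O processor performs the read, and the value is then copied to the others. The function `guarded_read`, the processor numbering, and the modeling of srmem( ) as a plain copy are assumptions made for illustration.

```python
def guarded_read(n_procs, io_proc, read):
    """Model of 'IF (tstio().NE.0) READ ...' followed by srmem()."""
    values = [None] * n_procs
    for p in range(n_procs):
        if p == io_proc:              # tstio() is nonzero only here
            values[p] = read()
    for p in range(n_procs):          # srmem(): send from the I/O processor
        if p != io_proc:
            values[p] = values[io_proc]
    return values


calls = []


def read_record():
    calls.append(1)                   # count how often the file is touched
    return 42


result = guarded_read(4, io_proc=0, read=read_record)
print(result, len(calls))  # [42, 42, 42, 42] 1
```

The external file is accessed once regardless of the number of processors, yet every copy of the replicated variable ends up with the same value.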
For I/O of a distributed array, memory for the I/O buffer is allocated in the user program.
Let A(N1,N2,...,Nk) be a distributed array of rank k and BUF(L) a vector of the same type as A. Then the compiler replaces an I/O statement for A with a sequence of statements according to the following scheme:















