Fortran-DVM Compiler
Preliminary design
March 30, 1999
Keldysh Institute of Applied Mathematics
Russian Academy of Sciences
Contents
1 Functions of Compiler
2 The Content of Compiler
3 Approach and Principle for Compiler Implementation
4 Translating FDVM program
4.1 Distributed arrays
4.2 Translating specification directives
4.3 Translating executable directives and statements
4.3.1 PARALLEL directive
4.3.2 The other FDVM directives
4.3.3 Input/Output Statements
1 Functions of Compiler
Fortran-DVM (FDVM) is the Fortran 77 language extended with special annotations that specify parallel execution of a program. These annotations are called DVM directives.
The compiler translates parallel FDVM code into sequential Fortran 77 code that contains calls to the Lib-DVM library. The Lib-DVM run-time system is written in C and uses MPI for inter-processor communication.
At the user's request, the FDVM compiler generates extended code for debugging and performance analysis. A special compilation mode produces “sequential” code that ignores all DVM directives.
2 The Content of Compiler
Compilation is divided into three phases.
Programs in the source language are first parsed, one file at a time, to produce a machine-independent binary internal format (called a .dep file). This front-end phase builds a parse tree of the program together with symbol and type tables.
The second phase analyzes and restructures the internal representation of the FDVM program. Each DVM directive is replaced by a sequence of Lib-DVM function calls. The following actions are taken at this phase:
- generating function call expressions and assignment statements that store the function values;
- creating declaration statements for temporary variables used for argument passing, storing function values, buffering I/O, and addressing distributed arrays;
- linearizing distributed array element references.
Inserting a new statement into a program may require restructuring the control graph (moving or substituting a label, replacing a logical IF statement with an IF...THEN...ENDIF construct, and so on).
The last phase of compilation is unparsing, that is, generating new Fortran 77 source code from the restructured internal form.
3 Approach and Principle for Compiler Implementation
The Sage++ system is used as the tool for building the FDVM compiler.
The Fortran parser of Sage++, which is based on GNU Bison (the GNU version of YACC), is extended to accept the language extensions (DVM directives).
The back end is written in C++ using the Sage++ class library. It traverses a program file in lexical order and replaces each DVM directive with a sequence of Lib-DVM function calls. Unparsing is implemented by a member function of the File class of the Sage++ class library.
4 Translating FDVM program
4.1 Distributed arrays
An array with the DISTRIBUTE or ALIGN attribute is called a distributed array. The memory for the elements of a distributed array is allocated by the run-time system. On each processor the run-time system evaluates the size of the local section of the distributed array and allocates memory for it according to its distribution (DISTRIBUTE directive), together with memory for the shadow edges declared in the SHADOW directive.
A distributed array is addressed relative to a base address declared by means of the statements
integer i0000m(0:0)
common /mem000/ i0000m
real r0000m(0:0)
equivalence (i0000m,r0000m)
The coefficients and the offset used for addressing are evaluated by the run-time library function align( ) and stored in a descriptor of the distributed array called the header. The user's program has to allocate this header in memory. The FDVM compiler removes the declaration of the distributed array from the user program and inserts a declaration of the array header as an integer vector of the same name with 2*N+2 elements, where N is the rank of the distributed array. For example, if A is a distributed array, the declaration statement
real A(L1:U1,L2:U2,...,LN:UN)
is replaced by the statement
integer A(2*N+2)
The FDVM compiler linearizes each distributed array element reference, replacing the reference
A(I1,I2, ..., IN)
by
$$\mathtt{r0000m}\Bigl(A(N+2) + I_1 + \sum_{j=2}^{N} A(N-j+2) \cdot I_j\Bigr)$$
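The arithmetic behind this replacement can be checked with a small sketch. The Python below is an illustration only, not part of the compiler or of Lib-DVM: `make_addressing` and `linear_index` are hypothetical names, and the column-major layout with a coefficient of 1 for the first dimension is an assumption read off the shape of the formula above. The offset and coefficients are passed as plain values rather than read from particular header positions.

```python
from itertools import product


def make_addressing(lower, upper, base):
    """Return (offset, coeffs) for a column-major array whose first element sits at `base`.

    coeffs[k] is the stride of dimension k+1; coeffs[0] is always 1.
    (Hypothetical helper: in FDVM these values come from the run-time system.)
    """
    extents = [u - l + 1 for l, u in zip(lower, upper)]
    coeffs = [1]
    for extent in extents[:-1]:
        coeffs.append(coeffs[-1] * extent)
    # Fold the lower bounds into the offset so that raw indices can be used directly.
    offset = base - sum(c * l for c, l in zip(coeffs, lower))
    return offset, coeffs


def linear_index(offset, coeffs, idx):
    """offset + I1 + sum over j = 2..N of C_j * I_j."""
    return offset + idx[0] + sum(c * i for c, i in zip(coeffs[1:], idx[1:]))


lower, upper = (1, 0, -2), (4, 2, 1)
offset, coeffs = make_addressing(lower, upper, base=100)

# Enumerate all index tuples in Fortran order (first index varies fastest).
ranges = [range(l, u + 1) for l, u in zip(lower, upper)]
addresses = [linear_index(offset, coeffs, idx[::-1])
             for idx in product(*reversed(ranges))]
```

With these assumptions the element references map onto one contiguous run of base-array positions, which is the property the generated reference relies on.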
The structure of the header is given in Fig. 1.
| Index | Contents | Meaning |
| 1 | pointer to system structure | |
| 2 | offset | |
| 3 | C1 | coefficients for addressing array elements |
| 4 | C2 | |
| . . . | . . . | |
| N+2 | CN | |
| N+3 | L1 | lower bounds of array dimensions |
| N+4 | L2 | |
| . . . | . . . | |
| 2*N+2 | LN | |
Fig.1. Array header structure in FDVM.
The first N+2 elements of the header are initialized by the align( ) function of the run-time library and updated by the realn( ) and redis( ) functions. The statements that store the lower bounds of the dimensions are inserted into the user program by the FDVM compiler.
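As a quick check of the layout in Fig. 1, the sketch below builds a header as a Python list and reads it back by 1-based position. `make_header` and `element` are hypothetical helpers, and the sample values are invented; only the ordering of the fields follows the figure.

```python
def make_header(handle, offset, coeffs, lower_bounds):
    """Build a header in the order of Fig. 1: handle, offset, C1..CN, L1..LN."""
    n = len(lower_bounds)
    if len(coeffs) != n:
        raise ValueError("need one coefficient and one lower bound per dimension")
    return [handle, offset, *coeffs, *lower_bounds]


def element(header, position):
    """Read a header element by its 1-based (Fortran-style) position."""
    return header[position - 1]


n = 3
# Sample values only; a real header is filled in by the run-time system
# (first N+2 elements) and by compiler-generated assignments (lower bounds).
header = make_header(handle=0, offset=123, coeffs=[1, 4, 12], lower_bounds=[1, 0, -2])
```

The lower bounds occupy positions N+3 through 2*N+2, which is why the vector needs 2*N+2 elements.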
4.2 Translating specification directives
The specification directives DISTRIBUTE, ALIGN and TEMPLATE define a mapping tree of distributed arrays. The FDVM compiler builds the mapping trees by analyzing the specification directives and then generates the statements that create the distributed arrays.
For example, the following statements
REAL A(100), B(100), C(100,100)
*HPF$ TEMPLATE T(100,100)
*HPF$ DISTRIBUTE T (BLOCK, BLOCK)
*HPF$ ALIGN A(I) WITH T(I,*)
*HPF$ ALIGN B(I) WITH T(*,I)
*HPF$ DISTRIBUTE C (BLOCK, BLOCK)
define the following alignment tree
A B
\ /
T
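A minimal sketch of how such trees could be assembled from the directives is given below. It is not the compiler's actual data structure: the tuple encoding of directives and the names `build_mapping_trees` and `creation_order` are assumptions made for illustration. The only property taken from the text is that objects with the DISTRIBUTE attribute are roots and that ALIGN attaches an array to its target.

```python
def build_mapping_trees(directives):
    """Return (roots, children) for a list of directive tuples.

    Hypothetical encoding: ("TEMPLATE", name), ("DISTRIBUTE", name),
    ("ALIGN", name, target).
    """
    children = {}
    parent = {}
    distributed = set()
    for directive in directives:
        kind = directive[0]
        if kind == "DISTRIBUTE":
            distributed.add(directive[1])
            children.setdefault(directive[1], [])
        elif kind == "ALIGN":
            _, name, target = directive
            parent[name] = target
            children.setdefault(target, []).append(name)
            children.setdefault(name, [])
    roots = sorted(name for name in distributed if name not in parent)
    return roots, children


def creation_order(roots, children):
    """Pre-order walk: every object is listed after the object it is aligned with."""
    order = []

    def visit(node):
        order.append(node)
        for child in children[node]:
            visit(child)

    for root in roots:
        visit(root)
    return order


directives = [
    ("TEMPLATE", "T"),
    ("DISTRIBUTE", "T"),
    ("ALIGN", "A", "T"),
    ("ALIGN", "B", "T"),
    ("DISTRIBUTE", "C"),
]
roots, children = build_mapping_trees(directives)
order = creation_order(roots, children)
```

For the example above this yields two trees: one rooted at T with children A and B, and a single-node tree for C.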
The following sequence of statements is generated to create each distributed object:
- for template T
* initializing vector of the sizes of template dimensions
size(1) = size-of-N-th-dimension
. . .
size(N) = size-of-1-st-dimension
* creating abstract machine representation
iamv = crtamv(am,N,size,...)
* mapping abstract machine representation
* (on processor system)
it = distr(iamv,ps,...)
where am is a reference to the current abstract machine and
ps is a reference to the current processor system,
- for array A with the ALIGN attribute
* storing lower bounds of array dimensions in header of
* distributed array
A(N+3) = L1
. . .
A(2*N+2) = LN
* creating distributed array
it = crtda(A,i0000m,N,...)
. . .
* aligning (mapping) distributed array
it = align(A,iamvt,N,...)
- for array C with the DISTRIBUTE attribute
* initializing vector of the sizes of array dimensions
size(1) = size-of-N-th-dimension
. . .
size(N) = size-of-1-st-dimension
* creating abstract machine representation
iamv = crtamv(am,N,size,...)
* mapping abstract machine representation
* (on processor system)
it = distr(iamv,ps,...)
* storing lower bounds of array dimensions in header of
* distributed array
C(N+3) = L1
. . .
C(2*N+2) = LN
* creating distributed array
it = crtda(C,i0000m,N,...)
. . .
* aligning (mapping) distributed array
it = align(C,iamv,N,...)
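The sequence for an array with the DISTRIBUTE attribute can be sketched as a small text generator. This is an illustration under stated assumptions, not the back end itself (which is written in C++ on top of Sage++): `lower_bound_stores` and `gen_distributed` are hypothetical names, the array D and its bounds are invented, the elided arguments are kept as "..." exactly as in the listing above, and the lower bounds are placed at header positions N+3 through 2*N+2 as in Fig. 1.

```python
def lower_bound_stores(name, lower_bounds):
    """Assignments that store L1..LN at header positions N+3 .. 2*N+2."""
    n = len(lower_bounds)
    return [f"{name}({n + 2 + k}) = {lb}"
            for k, lb in enumerate(lower_bounds, start=1)]


def gen_distributed(name, sizes, lower_bounds):
    """Statement texts for creating one array with the DISTRIBUTE attribute."""
    n = len(sizes)
    stmts = []
    # size(1) receives the size of the N-th dimension, size(N) that of the first.
    for k, size in enumerate(reversed(sizes), start=1):
        stmts.append(f"size({k}) = {size}")
    stmts.append(f"iamv = crtamv(am,{n},size,...)")
    stmts.append("it = distr(iamv,ps,...)")
    stmts.extend(lower_bound_stores(name, lower_bounds))
    stmts.append(f"it = crtda({name},i0000m,{n},...)")
    stmts.append(f"it = align({name},iamv,{n},...)")
    return stmts


# Hypothetical declaration: REAL D(0:99, 1:50) with a DISTRIBUTE attribute.
stmts = gen_distributed("D", sizes=[100, 50], lower_bounds=[0, 1])
```

Printing `"\n".join(stmts)` gives the statement texts in the order of the listing above.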
The statements that create distributed objects are inserted into the program before the first executable statement.
For a dynamic array addressed through a POINTER variable, the same sequence of statements is generated, but it is inserted into the program in place of the statement
pointer= ALLOCATE(...).
4.3 Translating executable directives and statements
4.3.1 PARALLEL directive
The regular parallel loop:
*DVM$ PARALLEL (I1, ..., In) ON A(…)...
DO label I1 = ...
. . .
DO label In = ...
loop-body
label CONTINUE
is translated into
[ REDUCTION-block-1 ]
* creating parallel loop
ipl = crtpl(n)
[ SHADOW-RENEW-block-1 ]
[ SHADOW-START-block ]
[ SHADOW-WAIT-block ]
[ REMOTE-ACCESS-block ]
* mapping parallel loop
it = mappl(ipl,A,...)
[ SHADOW-RENEW-block-2 ]
* inquiry of continuation of parallel loop execution
lab1 if(dopl(ipl) .eq. 0) go to lab2
DO label I1 = ...
. . .
DO label In = ...
loop-body
label CONTINUE
go to lab1
* terminating parallel loop
lab2 it = endpl(ipl)
[ REDUCTION-block-2 ]
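The control flow of the generated loop (ask the run-time system, execute, ask again) can be imitated in a few lines. The sketch below is a toy model, not the Lib-DVM interface: the `ParallelLoop` class is hypothetical, and it assumes that each successful inquiry hands the processor the bounds of its next portion of iterations, which the text above does not spell out.

```python
class ParallelLoop:
    """Toy stand-in for the run-time loop object (not the Lib-DVM API)."""

    def __init__(self, portions):
        # Each portion is an inclusive (first, last) range of iterations
        # assigned to this processor. Assumption: bounds are set per inquiry.
        self._portions = list(portions)
        self.bounds = None

    def dopl(self):
        """Return 0 when nothing is left, otherwise select the next portion."""
        if not self._portions:
            return 0
        self.bounds = self._portions.pop(0)
        return 1


def run(loop, body):
    # lab1: if (dopl(ipl) .eq. 0) go to lab2
    while loop.dopl() != 0:
        first, last = loop.bounds
        for i in range(first, last + 1):
            body(i)
        # go to lab1
    # lab2: the loop object would be terminated here (endpl)


visited = []
run(ParallelLoop([(1, 3), (7, 8)]), visited.append)
```

The point of the `go to lab1` structure is that the original DO nest is kept intact and simply re-entered for every portion the run-time system provides.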
If the REDUCTION clause appears in a PARALLEL directive, REDUCTION-block-1 and REDUCTION-block-2 are generated.
REDUCTION‑block‑1:
* creating reduction group
irg = crtrg(...)
{
* creating reduction
irv = crtrgf(reduction-function, reduction-var,...)
* including reduction in reduction group
it = insred(irg,irv)
}... for each reduction in reduction-list
* storing values of reduction variables
it = saverv(irg)
REDUCTION‑block‑2:
* starting reduction group
it = strtrd(irg)
* waiting for completion of reduction group
it = waitrd(irg)
* deleting reduction group
it = delobj(irg)
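What the two blocks achieve together can be modelled as follows. This is a sketch of the general idea of a reduction over processors, with several assumptions: the function names, the set of operations, and the way the saved initial value is combined with the per-processor partial results are illustrative and are not taken from Lib-DVM.

```python
from functools import reduce
import operator

# Hypothetical table of reduction functions.
REDUCTIONS = {"SUM": operator.add, "MAX": max, "MIN": min}


def reduce_group(kind, initial, chunks):
    """Combine per-processor partial results with the value saved before the loop.

    `chunks` holds, for each processor, the values contributed by its local
    iterations; `initial` plays the role of the stored reduction variable.
    """
    op = REDUCTIONS[kind]
    # Each processor reduces its own contributions (the loop body's job).
    partials = [reduce(op, chunk) for chunk in chunks if chunk]
    # The group then combines the partial results across processors.
    return reduce(op, partials, initial)


total = reduce_group("SUM", 10, [[1, 2], [3], [4, 5]])
largest = reduce_group("MAX", 0, [[1, 7], [3], []])
```

Creating the group before the loop and starting it after the loop lets one exchange serve all reduction variables of the directive.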
If the SHADOW_RENEW clause appears in a PARALLEL directive, SHADOW-RENEW-block-1 and SHADOW-RENEW-block-2 are inserted.
SHADOW‑RENEW‑block‑1:
* creating shadow edge group
ishg = crtshg(...)
{
* including shadow edge in the group
it = inssh(ishg,array,...)
}... for each array in renewee-list
* starting shadow edge group renewing
it = strtsh(ishg)
SHADOW‑RENEW‑block‑2:
* waiting for completion of shadow edge group renewing
it = waitsh(ishg)
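The effect of renewing shadow edges can be pictured on a one-dimensional block-distributed array. The sketch below is a single-process model with invented names (`split_with_shadows`, `renew_shadows`); it assumes equal blocks and the same shadow width on both sides, and it performs the copy in one step, whereas the run-time system separates the start of the exchange from the wait for its completion.

```python
def split_with_shadows(data, nproc, width=1):
    """Split `data` into equal blocks, each with empty left and right shadow edges."""
    block = len(data) // nproc  # assumption: len(data) is divisible by nproc
    return [
        {
            "left": [None] * width,
            "own": list(data[p * block:(p + 1) * block]),
            "right": [None] * width,
        }
        for p in range(nproc)
    ]


def renew_shadows(sections, width=1):
    """Fill every shadow edge with the neighbour's current boundary elements."""
    for p, section in enumerate(sections):
        if p > 0:
            section["left"] = sections[p - 1]["own"][-width:]
        if p + 1 < len(sections):
            section["right"] = sections[p + 1]["own"][:width]


sections = split_with_shadows(list(range(8)), nproc=2)
renew_shadows(sections)
```

After the renewal each processor can read its neighbours' boundary values locally, which is what a loop body such as a stencil needs.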
The SHADOW_START and SHADOW_WAIT clauses change the order in which the iterations of the parallel loop are executed.
SHADOW‑START‑block:
it = exfrst(shadow-group-name)
The function exfrst( ) sets the following order of execution for the parallel loop:
- the exported elements (original elements) of the local parts of the distributed arrays are computed;
- the renewal of the shadow edge group is started;
- the internal elements of the local parts of the distributed arrays are computed.
SHADOW‑WAIT‑block:
it = imlast(shadow-group-name)
The function imlast( ) sets the following order of execution for the parallel loop:
- the internal elements of the local parts of the distributed arrays are computed;
- the run-time system waits for completion of the shadow edge renewal;
- the exported elements (original elements) of the local parts of the distributed arrays are computed.
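The two orderings can be written down as event sequences for a one-dimensional local part. This is only a model: the function names are hypothetical, and treating the exported elements as those within the shadow width of either end of the local part is an assumption.

```python
def split_local(first, last, width):
    """Split the local index range into exported (boundary) and internal elements."""
    indices = range(first, last + 1)
    exported = [i for i in indices if i < first + width or i > last - width]
    internal = [i for i in indices if first + width <= i <= last - width]
    return exported, internal


def order_exfrst(first, last, width):
    """Exported elements first, then start the renewal, then internal elements."""
    exported, internal = split_local(first, last, width)
    return ([("compute", i) for i in exported]
            + [("start_renew",)]
            + [("compute", i) for i in internal])


def order_imlast(first, last, width):
    """Internal elements first, then wait for the renewal, then exported elements."""
    exported, internal = split_local(first, last, width)
    return ([("compute", i) for i in internal]
            + [("wait_renew",)]
            + [("compute", i) for i in exported])


early = order_exfrst(1, 6, width=1)
late = order_imlast(1, 6, width=1)
```

In both cases the computation of the internal elements overlaps with the exchange of the boundary values, since the internal elements do not depend on the shadow edges.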
The REMOTE-ACCESS-block is generated if the REMOTE_ACCESS clause appears in the PARALLEL directive.
If the remote variable is a single array element:
* copying of element of distributed array
it = rwelmf(array-header,buffer-var,...)
If the remote variable is an n-dimensional array section:
* creating buffer array
*
* creating abstract machine representation
iamv = crtamv(am,n,...)
* mapping abstract machine representation
it = distr(iamv,ps,...)
* storing lower bounds of array dimensions in header of
* distributed array
buffer(n+3) = L1
. . .
buffer(2*n+2) = Ln