Keldysh Institute of Applied Mathematics
Russian Academy of Sciences
Fortran-DVM
Version 2.0
Language description
April, 2001
Contents
1 Introduction
1.1 Parallel programming models
1.2 DVM-approach to parallel program development
2 Language overview
2.1 Programming model and model of parallelism
2.2 Syntax of FDVM directives
3 Virtual processor arrangements. PROCESSORS directive
4 Data mapping
4.1 DISTRIBUTE and REDISTRIBUTE directives
4.1.1 BLOCK format
4.1.2 GEN_BLOCK format
4.1.3 WGT_BLOCK format
4.1.4 Format of *
4.1.5 Multidimensional distributions
4.2 Distribution of dynamic arrays
4.2.1 Dynamic arrays in Fortran 77 program
4.2.2 Dynamic arrays in FDVM model. POINTER directive
4.2.3 DISTRIBUTE and REDISTRIBUTE directives for dynamic arrays
4.3 Distributing by aligning
4.3.1 ALIGN and REALIGN directives
4.3.2 TEMPLATE directive
4.3.3 Aligning dynamic arrays
4.4 DYNAMIC and NEW_VALUE directives
4.5 Default distribution
5 Distribution of computations
5.1 Parallel loops
5.1.1 Parallel loop definition
5.1.2 Distribution of loop iterations. PARALLEL directive
5.1.3 Private variables. NEW clause
5.1.4 Reduction operations and variables. REDUCTION clause
5.2 Computations outside parallel loop
6 Remote data specification
6.1 Remote references definition
6.2 SHADOW type references
6.2.1 Specification of array with shadow edges
6.2.2 Synchronous specification of independent references of SHADOW type for single loop
6.2.3 Computing values in shadow edges. SHADOW_COMPUTE clause
6.2.4 ACROSS specification of dependent references of SHADOW type for single loop
6.2.5 Asynchronous specification of independent references of SHADOW type
6.3 REMOTE type references
6.3.1 REMOTE_ACCESS directive
6.3.2 Synchronous specification of REMOTE type references
6.3.3 Asynchronous specification of REMOTE type references
6.3.4 Asynchronous copying by REMOTE type references
6.3.4.1 Loop and copy-statements
6.3.4.2 Asynchronous copying directives
6.3.4.2.1 ASYNCID directive
6.3.4.2.2 F90 directive
6.3.4.2.3 ASYNCHRONOUS and END ASYNCHRONOUS directives
6.3.4.2.4 ASYNCWAIT directive
6.4 REDUCTION type references
6.4.1 Synchronous specification of REDUCTION type references
6.4.2 Asynchronous specification of REDUCTION type references
7 Task parallelism
7.1 Declaration of task array
7.2 Mapping tasks on processors. MAP directive
7.3 Array distribution on tasks
7.4 Distribution of computations. TASK_REGION directive
7.5 Data localization in tasks
7.6 Fragment of static multi-block problem
7.7 Fragment of dynamic multi-block problem
8 COMMON and EQUIVALENCE
9 Procedures
10 Input/Output
11 Compatibility with HPF
12 The difference between FDVM1.0 and FDVM2.0 versions
References
Annex 1. Syntax rules
Annex 2. Code examples
Example 1. Gauss elimination algorithm
Example 2. Jacobi algorithm
Example 3. Jacobi algorithm (asynchronous version)
Example 4. Successive over-relaxation
Example 5. Red-black successive over-relaxation
Example 6. Static tasks (parallel sections)
Example 7. Dynamic tasks (task loop)
1 Introduction

1.1 Parallel programming models
Three parallel programming models now prevail in large scalable systems (see Fig. 1.1): the message-passing model (MPM), the shared-memory model (SMM) and the data parallel model (DPM).

Message-passing model. In the message-passing model each process has its own local address space. Shared data processing and synchronization are performed by message passing. The generalization and standardization of various message-passing libraries resulted in the development of the MPI standard [1].

Shared-memory model. In the shared-memory model processes share a common address space. Since there are no restrictions on the use of shared data, a programmer must explicitly identify shared data and regulate access to them using synchronization tools. In high-level languages, logically independent threads are defined at the level of functional tasks or at the level of loop iterations. The generalization and standardization of shared-memory models resulted in the development of the OpenMP standard [2].

Data parallel model. In the data parallel model the notion of a process is absent and, as a result, so are explicit message passing and explicit synchronization. In this model data are distributed over the nodes (processors) of a computing system. A sequential program is translated by a compiler either into a message-passing model program or into a shared-memory model program (Fig. 1.1). The computations are distributed according to the owner-computes rule: each processor performs only the computations on its own data, that is, the data allocated on that processor.

In comparison with the two previous models, DPM has obvious advantages. A programmer is freed from the tedious effort of splitting a global array into local arrays, explicitly managing the sending and receiving of messages, and explicitly synchronizing parallel processes. The application area of the model, however, remains a subject of research. The research results show that the performance of many scientific computing algorithms in the DPM model is comparable with the performance of their implementations in the MPM and SMM models.

The development of HPF1 [3] was the first attempt at DPM standardization. The standardization of the MPM and SMM models was based on a large body of implementation experience and practical application. The HPF1 standard, by contrast, was developed on the basis of theoretical research and two or three experimental implementations. Moreover, the standard relied on automatic parallelization of computations and automatic synchronization of shared data access. The first HPF1 implementations showed the inefficiency of the standard for modern computational methods (in particular, for irregular computations). In the next version of the standard, HPF2 [2], a step was taken toward "manual" control of parallel execution performance. In particular, the distribution of computations and the specification of common reduction variables were defined.
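As a minimal sketch of the owner-computes rule in FDVM terms (the DISTRIBUTE and PARALLEL directives used here are described in sections 4 and 5; the array name and sizes are illustrative):

```fortran
      PROGRAM OWNER
      REAL A(100)
C     Each processor owns a contiguous block of A.
CDVM$ DISTRIBUTE A (BLOCK)
C     Iteration I is mapped onto the owner of A(I), so every
C     processor updates only the elements allocated on it.
CDVM$ PARALLEL (I) ON A(I)
      DO 10 I = 1, 100
         A(I) = I * 2.0
10    CONTINUE
      END
```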
[Figure: a sequential program plus data mapping directives (data parallel model) is translated either into a message-passing model program, in which processes with their own data communicate through a message-passing library, or into a shared-memory model program, in which processes share data through a synchronization library.]

Fig. 1.1. Three models of parallel programming
1.2 DVM-approach to parallel program development

The DVM system provides a unified toolkit for developing parallel programs for scientific and technical computations in C and Fortran 77.

DVM parallel model. The DVM parallel model is based on the data parallel model. The name DVM reflects two expansions of the model's name - Distributed Virtual Memory and Distributed Virtual Machine. These two names show that the DVM model is suited both to shared-memory systems and to distributed-memory systems. The high-level DVM model not only reduces the cost of parallel program development but also provides a unified, formalized basis for the run-time support system, debugging, performance analysis and prediction.

Languages and compilers. As distinct from HPF, full automation of computation parallelization and of shared data access synchronization is not a goal of the DVM system. Using high-level specifications, a programmer has full control over parallel program performance. On the other hand, compatibility with subsets of the HPF1 and HPF2 standards was maintained during the design and development of Fortran DVM.

The unified parallel model is built into the C and Fortran 77 languages on the basis of constructions that are "transparent" to standard compilers, which allows a single version of the program to be kept for both sequential and parallel execution. The C-DVM and Fortran DVM compilers translate a DVM program into a C or Fortran 77 program, respectively, including calls to the parallel execution run-time support system. Thus the only requirement on a parallel system is the availability of C and Fortran 77 compilers.
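For example, FDVM directives are written as Fortran comment lines, so a standard Fortran 77 compiler simply ignores them, while the Fortran DVM compiler interprets them (the array here is illustrative):

```fortran
      REAL B(1000)
C     A comment line to a standard Fortran 77 compiler;
C     a distribution directive to the Fortran DVM compiler.
CDVM$ DISTRIBUTE B (BLOCK)
```

This is what makes a single source text usable for both sequential and parallel execution.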
Execution and debugging technique. The unified parallel model makes possible a unified run-time support system for both languages and, as a result, a unified system of debugging, performance analysis and prediction. The following modes of DVM program execution and debugging are supported:

- Sequential execution and debugging using standard C and Fortran 77 compiler tools.
- Pseudo-parallel execution on a workstation (Windows and UNIX environments).
- Parallel execution on a parallel computer.

The following debugging modes are provided in pseudo-parallel and parallel execution:

- automatic verification of the correctness of parallel directives;
- tracing and comparing the results of parallel and sequential execution;
- accumulation and visualization of a data trace;
- accumulation of performance information and prediction of parallel execution performance.
2 Language overview

2.1 Programming model and model of parallelism

The Fortran DVM language is an extension of the Fortran 77 language [2]. The extension is implemented via special comments, called directives. FDVM directives may be conditionally divided into three subsets:
- Data distribution (sections 2, 3, 4, 8, 9)
- Computation distribution (sections 5, 7)
- Remote data specification (section 6)
The FDVM parallel model is based on a specific form of data parallelism called SPMD (Single Program, Multiple Data). In this model the same program is executed on all the processors concerned, but each processor executes its own subset of statements in accordance with the data distribution.

First, in the FDVM model a user defines a multidimensional arrangement of virtual processors, sections of which data and computations will be mapped onto. A section can vary from the whole processor arrangement down to a single processor.
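Such an arrangement is declared with the PROCESSORS directive (section 3); a minimal sketch, in which the name P and the 3x4 shape are illustrative:

```fortran
C     A two-dimensional 3x4 arrangement of virtual processors.
C     Data and computations may later be mapped onto P or onto
C     its sections, down to a single processor.
CDVM$ PROCESSORS P(3, 4)
```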