By default, startup is controlled by the parameters of the execution trace accumulation from the base file usrdebug (see section 11.2), corrected by the following parameters from the file deb_trc.par:
| EnableDynControl=0; | - disable dynamic control; |
| EnableTrace=1; | - enable trace accumulation; |
| TraceOptions.TraceMode=1; | - trace accumulation mode |
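The way a mode-specific file corrects the base parameters can be pictured with a minimal Python sketch. It assumes that every parameter is written as `Name=value;`, as in the table above. The base values and the helper names `parse_params` and `effective_params` are hypothetical, and the real parameter files may contain syntax that this sketch does not handle.

```python
def parse_params(text):
    """Parse 'Name=value;' entries into a dict (assumed syntax; values stay strings)."""
    params = {}
    for entry in text.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        name, _, value = entry.partition("=")
        params[name.strip()] = value.strip()
    return params


def effective_params(base_text, override_text):
    """Settings from the mode-specific file replace those from the base file."""
    merged = parse_params(base_text)
    merged.update(parse_params(override_text))
    return merged


# Hypothetical base values standing in for the contents of usrdebug.
base = "EnableDynControl=1; EnableTrace=0; TraceOptions.TraceLevel=2;"
# Settings listed for deb_trc.par in the table above.
deb_trc = "EnableDynControl=0; EnableTrace=1; TraceOptions.TraceMode=1;"

settings = effective_params(base, deb_trc)
print(settings)
```

Parameters that the mode-specific file does not mention keep their base values, which is why the trace level in this sketch is still taken from the base set.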
If trace accumulation errors are found, a diagnostic reporting that such errors exist is written to the stderr stream. This stream can be directed either to the screen or into a file (see sections 7 and 11.3).
Diagnostics giving the error type, the source line and the iteration numbers of all enclosing loops can also be output either to the screen or into a file (see section 11.2). The structure of the accumulated trace is presented in section 14.
The structure and the list of trace accumulation error messages are given in section 12.2.
10.5 Comparing the reference trace with the results of parallel program execution on a single processor.
Comparing the reference trace with the trace of parallel execution of the program on a single processor checks that the reduction operations are described correctly. The check uses a special mode of parallel execution on a single processor, in which the reduction variables are calculated according to the reduction operation descriptions given by the programmer. Each iteration is executed as if it ran on a separate processor. At the beginning of each iteration, the reduction variable is assigned its initial value, which is saved on entry to the loop. At the end of the iteration, Lib-DVM is invoked to calculate the final value of the reduction using the reduction function specified by the user. If the user specifies the reduction function incorrectly, the traces obtained in the different modes of reduction computation will differ.
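The idea of this check can be illustrated with a minimal Python sketch. It is not the Lib-DVM implementation: the function names are hypothetical, the loop body is modelled as a two-argument function, and the value saved on entry to the loop is assumed to be a neutral element (0 for non-negative data).

```python
from functools import reduce
import operator


def run_sequential(values, body, initial):
    """Reference run: the loop body updates the variable in place."""
    acc = initial
    for v in values:
        acc = body(acc, v)
    return acc


def run_emulated(values, body, initial, declared_op):
    """Each iteration starts from the saved initial value, as if on its own
    processor; the partial results are then combined with the operation
    that the programmer declared for the reduction."""
    partials = [body(initial, v) for v in values]
    return reduce(declared_op, partials, initial)


data = [3, 7, 2]
reference = run_sequential(data, max, 0)               # loop body computes a maximum
declared_ok = run_emulated(data, max, 0, max)          # reduction declared as MAX
declared_bad = run_emulated(data, max, 0, operator.add)  # wrongly declared as SUM

print(reference, declared_ok, declared_bad)
```

With the correct declaration the emulated result equals the reference value. With the wrong declaration the two values differ, and this is the kind of difference that trace comparison reports.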
The following command is used to compare the traces:
dvm red <DVM-program_name>
By default, startup is controlled by the trace comparison parameters from the base file usrdebug (see section 11.2), corrected by the following parameters from the file deb_red.par:
| EnableDynControl=0; | - disable dynamic control; |
| EnableTrace=1; | - enable trace accumulation; |
| TraceOptions.TraceMode=3; | - trace comparison mode; |
| ManualReductCalc=1; | - computation of reduction variables according to the user specifications. |
If trace comparison errors are found, a diagnostic reporting that such errors exist is written to the stderr stream. This stream can be directed either to the screen or into a file (see sections 7 and 11.3).
Diagnostics giving the error type, the source line and the iteration numbers of all enclosing loops can also be output either to the screen or into a file (see section 11.2).
The structure and the list of trace comparison error messages are given in section 12.2.
10.6 Comparing the parallel execution trace with the reference one.
The parallel program is started in a mode that emulates a multiprocessor system on a workstation and compares the execution trace with the reference one. The following command is used:
dvm dif N1 [N2 [N3]] [<cluster_options>] <DVM-program_name>
where N1, N2, N3 are the sizes of the processor matrix (1 1 1 by default).
By default, startup is controlled by the parameters of trace comparison from the base file usrdebug (see section 11.2), corrected by the following parameters from the file deb_dif.par:
| EnableDynControl=0; | - disable dynamic control; |
| EnableTrace=1; | - enable trace accumulation; |
| TraceOptions.TraceMode=3; | - trace comparison mode; |
| ManualReductCalc=0; | - computation of reduction variables according to standard algorithm. |
The reduction variables are calculated in the standard way. On each processor, the reduction variable is computed by the statements of the iterations executed on that processor. Lib-DVM then calculates the final result of the reduction operation from the partial results obtained on all the processors. If the program is executed on a single processor, the reduction is calculated by the program statements alone.
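The standard scheme can be sketched in Python as follows. This is only an illustration: the block distribution of iterations, the function names and the explicit `identity` argument are assumptions, and the final combining step stands in for what Lib-DVM does in the real system.

```python
from functools import reduce
import operator


def split_blocks(values, n_procs):
    """Distribute iterations over processors in contiguous blocks (assumed)."""
    size, extra = divmod(len(values), n_procs)
    blocks, start = [], 0
    for p in range(n_procs):
        end = start + size + (1 if p < extra else 0)
        blocks.append(values[start:end])
        start = end
    return blocks


def standard_reduction(values, op, identity, n_procs):
    partials = []
    for block in split_blocks(values, n_procs):
        acc = identity
        for v in block:          # program statements executed on this processor
            acc = op(acc, v)
        partials.append(acc)
    # Combining the partial results (done by Lib-DVM in the real system).
    return reduce(op, partials, identity)


data = [1, 2, 3, 4, 5, 6]
results = [standard_reduction(data, operator.add, 0, p) for p in (1, 2, 4)]
print(results)
```

For a correctly parallelized loop the result does not depend on the number of processors, so the parallel trace is expected to match the reference one.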
If trace comparison errors are found, a diagnostic reporting that such errors exist is written to the stderr stream. This stream can be directed either to the screen or into a file (see sections 7 and 11.3).
Diagnostics giving the error type, the source line and the iteration numbers of all enclosing loops can also be directed either to the screen or into a file (see section 11.2).
The structure and the list of trace comparison error messages are given in section 12.2.
If there are no differences between the traces, the program can be executed with real data (see section 10.8).
If differences are detected but the error in the program cannot be located using the reference trace and the trace comparison diagnostics, the user can accumulate a trace on each processor while executing the parallel version of the program on the required processor matrix (see section 10.7).
If, during parallel program execution (or its emulation on one workstation), error situations occur on some processor (or differences between the reference and current traces are detected), the program can hang. If program execution is then terminated with CTRL-C, standard output streams directed into files can be lost. For this reason the stderr stream should not be directed into a file in this case.
The point of a hang or of abnormal program termination can be found by enabling the program system trace before program startup (see section 11.4). The last records in the system trace identify the point in the program where the crash occurred.
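The choice of the next step after trace comparison can be summarized in a short Python sketch. It only builds command lines and does not run them. The helper names and the program name `jacobi` are hypothetical; the command forms follow the `dvm dif`, `dvm ptrc` (section 10.7) and `dvm run` (section 10.8) syntax given in this guide.

```python
def dvm_command(mode, program, matrix=(), cluster_options=()):
    """Build an argument list: dvm <mode> [N1 [N2 [N3]]] [<cluster_options>] <program>."""
    if len(matrix) > 3:
        raise ValueError("the processor matrix has at most three dimensions")
    return ["dvm", mode, *map(str, matrix), *cluster_options, program]


def next_step(program, matrix, differences_found, error_located):
    """Pick the command that follows a 'dvm dif' run."""
    if not differences_found:
        return dvm_command("run", program, matrix)    # proceed to real data
    if not error_located:
        return dvm_command("ptrc", program, matrix)   # accumulate per-processor traces
    return None                                       # fix the program, then compare again


print(dvm_command("dif", "jacobi", (2, 2)))
print(next_step("jacobi", (2, 2), differences_found=True, error_located=False))
```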
10.7 Accumulating parallel program trace.
The following command is used:
dvm ptrc N1 [N2 [N3]] [<cluster_options>] <DVM-program_name>
where N1, N2, N3 are the sizes of the processor matrix (1 1 1 by default).
By default, startup is controlled by the parameters of the user trace accumulation from the base file usrdebug (see section 11.2), corrected by the following parameters from the file deb_trc.par:
| EnableDynControl=0; | - disable dynamic control; |
| EnableTrace=1; | - enable trace accumulation; |
| TraceOptions.TraceMode=1; | - trace accumulation mode; |
If trace accumulation errors are found, only a diagnostic reporting that such errors exist is written to the stderr stream. This stream can be directed either to the screen or into a file (see sections 7 and 11.3).
Diagnostics giving the error type, the source line and the iteration numbers of all enclosing loops can also be directed either to the screen or into a file (see section 11.2). The trace is accumulated on each processor in a separate file, for example, with the names 0.trd, 1.trd, 2.trd and so on.
The structure of the accumulated trace files is presented in section 14.
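As a rough guide to how many trace files to expect, the following Python sketch lists the file names for a given processor matrix. It assumes that the processors are numbered consecutively from 0 and that each one writes a file named after its number, as the example names above suggest; the actual naming is controlled by the trace parameters and may differ.

```python
def trace_file_names(n1, n2=1, n3=1):
    """One trace file per processor of an n1 x n2 x n3 matrix (assumed numbering)."""
    count = n1 * n2 * n3
    return [f"{i}.trd" for i in range(count)]


print(trace_file_names(2, 2))
```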
10.8 Parallel execution with real data.
If no differences were detected at the previous steps, the program can be considered to work correctly on the test parameters. The user can now proceed to parallel execution of the program on a workstation cluster with real parameters.
The following commands are used:
for compilation:
dvm c [C-DVM-converter_options] <DVM-program_name>
dvm f [F-DVM-converter_options] <DVM-program_name>
for execution:
dvm run [N1 [N2 [N3]]] [<cluster_options>] <DVM-program_name>
where N1, N2, N3 are the sizes of the processor matrix (1 1 1 by default).
By default, startup is controlled by the parameters from the sets specified in the environment variables dvmpar and usrpar.
If the results of execution with real parameters do not satisfy the user, the sequential and parallel versions of the program can be obtained again in order to trace the program with real data. The following must be taken into account:
- the trace can be very large. Therefore, before tracing, the trace size should be estimated both for the whole program and for its separate parts (see sections 10.9 and 10.10).
- the csdeb, fsdeb and cpdeb, fpdeb commands (see section 10.2) use the converter option -d4 by default, which can substantially increase program execution time. To obtain debug versions of the program for runs with real data (for example, for execution trace accumulation and comparison), other options (-d1, -d2 or -d3) should be used for the whole program or for its parts (see the C-DVM and Fortran-DVM compiler user's guides).
10.9 Estimating trace size.
The following command is used:
dvm size <DVM-program_name>
By default, startup is controlled by the parameters of the user trace accumulation from the base file usrdebug (see section 11.2), corrected by the following parameters from the file deb_size.par:
| EnableDynControl=0; | - disable dynamic control; |
| EnableTrace=1; | - enable trace accumulation; |
| TraceOptions.TraceMode=0; | - mode of generation of loop description file; |
The command creates a so-called loop description file that contains, in particular, the predicted trace sizes, taking into account the specified DVM-converter options and the trace accumulation levels (see section 11.2).
In practice, the trace size is controlled only by two parameters from the base parameter set, TraceOptions.TraceLevel and TraceOptions.WriteEmptyIter, and by the loop description file (described below).
10.10 Controlling size of trace file.
The user may modify the created loop description file to decrease the trace size, either by setting a mode of selective trace accumulation or by completely or partially cancelling trace accumulation for some (or all) loops. The command dvm size should then be run again to estimate the trace size. If the result is not acceptable, the process should be repeated.
The loop description file contains:
- the calculated values of the trace file size and of the number of lines in the file;
- information for each program loop (sequential or parallel), taking into account its nesting level.
The trace entity is a program loop. Information for each program loop contains:
- loop header;
- calculated values of the trace size, the number of trace lines and the number of traced iterations for the loop;
- loop end.
The loop header contains:
- loop type: sequential (SL) or parallel (PL);
- loop number;
- number of the enclosing loop (if the loop is nested);
- loop rank;
- reference to the source program code (file name and line number of the loop beginning);
- sign "=";
- parameters controlling the loop (can be omitted);
- a comment starting with the sign "#" and continuing to the end of the loop header line (can be omitted).
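The header fields listed above can be modelled as a small Python record. This sketch only names the fields; it does not reproduce the textual layout of the loop description file, which is not shown here. The class name, the field names and the sample file name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoopHeader:
    loop_type: str                  # "SL" (sequential) or "PL" (parallel)
    number: int                     # loop number
    parent_number: Optional[int]    # enclosing loop number, None if not nested
    rank: int                       # loop rank
    source_file: str                # source file name
    source_line: int                # line where the loop begins
    params: Optional[str] = None    # loop control parameters (may be omitted)
    comment: Optional[str] = None   # text after "#" (may be omitted)

    def __post_init__(self):
        if self.loop_type not in ("SL", "PL"):
            raise ValueError("loop type must be SL or PL")


header = LoopHeader("PL", 2, 1, 2, "jacobi.fdv", 14)
print(header)
```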
The parameters controlling a loop (which can be modified by the user) are:
- Trace accumulation level. If it is not specified, the level for the whole program, defined by the parameter TraceOptions.TraceLevel (see section 11.2), is used.
- Traced iterations. If they are not specified, all loop iterations are traced. If an iteration is excluded from tracing, reads from and writes to variables inside the iteration are not traced, but a record marking the beginning of the iteration is still put into the trace if the parameter TraceOptions.WriteEmptyIter is equal to 1 (by default it is equal to 0).
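The effect of excluding iterations and of TraceOptions.WriteEmptyIter on the trace size can be shown with a simplified Python model. It assumes one record for the beginning of an iteration and one record per variable access, and it ignores all other record kinds; the function name and the numbers are hypothetical and are not taken from the real size estimator.

```python
def count_trace_records(n_iterations, accesses_per_iteration, traced, write_empty_iter=0):
    """Count records for one loop under a simplified model of the trace."""
    records = 0
    for i in range(n_iterations):
        if i in traced:
            records += 1 + accesses_per_iteration   # iteration header plus accesses
        elif write_empty_iter == 1:
            records += 1                            # header only for excluded iterations
    return records


everything = count_trace_records(10, 4, traced=set(range(10)))
selective = count_trace_records(10, 4, traced={0, 1})
selective_with_empty = count_trace_records(10, 4, traced={0, 1}, write_empty_iter=1)
print(everything, selective, selective_with_empty)
```

In this model, tracing only two of ten iterations reduces the count from 50 to 10 records, and enabling WriteEmptyIter adds one record for each of the eight excluded iterations.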
Changing the loop control parameters affects the calculated trace size of the individual loops and therefore the size of the full trace, the number of trace lines and the number of traced loop iterations.
One of the following trace accumulation levels can be specified both for the whole program and for each loop:
- trace is disabled (NONE level);
- tracing loops and iterations only (MINIMAL level);
- tracing variable modifications (MODIFY level);
- full tracing (FULL level).
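The four levels and the fallback from a per-loop level to the program-wide one can be sketched in Python. The numeric values, the event names and the assumption that each level includes everything recorded at the lower levels are illustrative only and are not taken from the DVM parameter files.

```python
from enum import IntEnum
from typing import Optional


class TraceLevel(IntEnum):
    NONE = 0      # tracing disabled
    MINIMAL = 1   # loops and iterations only
    MODIFY = 2    # plus variable modifications
    FULL = 3      # plus variable reads


# Lowest level at which each (hypothetical) event kind is recorded.
MIN_LEVEL = {
    "loop": TraceLevel.MINIMAL,
    "iteration": TraceLevel.MINIMAL,
    "write": TraceLevel.MODIFY,
    "read": TraceLevel.FULL,
}


def resolve_level(loop_level: Optional[TraceLevel], program_level: TraceLevel) -> TraceLevel:
    """A loop without its own level uses the level set for the whole program."""
    return program_level if loop_level is None else loop_level


def should_trace(event: str, level: TraceLevel) -> bool:
    return level >= MIN_LEVEL[event]


level = resolve_level(None, TraceLevel.MODIFY)
print(level.name, should_trace("write", level), should_trace("read", level))
```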
The traced iterations are specified in the following way:















