PPPA_pdes (Handouts)

2019-09-18, StudIzba

File description

The file "PPPA_pdes" is located in the "Handouts" folder of the archive of the same name. The archive belongs to the course "Models of parallel computing and DVM technology for developing parallel programs" (7th semester) in the file archive of Lomonosov Moscow State University. Although the archive is directly associated with Lomonosov Moscow State University, it can also be found in other sections.



Parallel Program Performance Analyzer (PPPA)

(Preliminary design.)

1. Functions of PPPA

The performance analyzer is intended for analyzing and tuning the execution efficiency of DVM programs. With the performance analyzer, the user can obtain the execution time characteristics of a program at a greater or lesser level of detail.

The execution efficiency of parallel programs on distributed-memory multiprocessor computers is determined by the following major factors:

  • program parallelism - the share of parallel calculations in the total volume of calculations;

  • balance of processor load during parallel calculations;

  • time of interprocessor communications;

  • degree of overlapping of interprocessor communications and calculations.

The DVM system knows, for any processor at any moment, whether a sequential or a parallel part of the program is being executed. This is an essential advantage over approaches based on the explicit use of communication libraries (MPI, PVM). In addition, all synchronization operations of the program are known. It is therefore possible to quantify the influence of the four factors above on program execution efficiency. By contrast, if the parallel program is written as a system of cooperating processes that use communication libraries explicitly, it is impossible to distinguish sequential calculations from parallel ones or to determine the degree of parallelism of the program.

The ability to distinguish sequential and parallel parts of the program during its execution on the multiprocessor computer allows the performance analyzer to give the user the following basic parameters of the parallel program execution:

  • execution time;

  • parallelism efficiency coefficient;

  • lost time.

Execution time is the maximum of the program execution times over all processors.

To calculate the main characteristic of parallel execution (the parallelism efficiency coefficient), two times are needed. The first is the effective time, that is, the time required to execute the program on a serial computer. The second is the total time of parallel execution, calculated as the product of the execution time and the number of processors. The total time is calculated in this way (rather than as the sum of the execution times on all processors) because all processors are allocated to the program when it starts and are released only after it finishes. The parallelism efficiency coefficient is the ratio of the effective time to the total time of parallel execution.

The lost time is the total time of parallel execution minus the effective time. If the programmer is not satisfied with the parallelism efficiency coefficient, he should analyze which components the lost time consists of.
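The three basic parameters defined above can be illustrated with a minimal Python sketch. The function name, argument names and sample numbers are hypothetical and are not part of PPPA; the arithmetic follows the definitions in the text.

```python
def basic_parameters(effective_time, per_processor_times):
    """Compute the basic parameters from hypothetical measured times.

    effective_time      -- time the program would need on a serial computer
    per_processor_times -- program execution time on each processor
    """
    n = len(per_processor_times)
    # Execution time: the maximum of the execution times on the processors.
    execution_time = max(per_processor_times)
    # Total time: all processors are held for the whole run, so multiply.
    total_time = execution_time * n
    # Parallelism efficiency coefficient: effective time / total time.
    efficiency = effective_time / total_time
    # Lost time: total time minus effective time.
    lost_time = total_time - effective_time
    return execution_time, total_time, efficiency, lost_time


# Illustrative numbers only: four processors, 30 time units of useful work.
execution_time, total_time, efficiency, lost_time = basic_parameters(
    effective_time=30.0, per_processor_times=[10.0, 9.0, 8.0, 10.0]
)
```

With these sample values the total time is 4 * 10 = 40, so the efficiency coefficient is 0.75 and the lost time is 10.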

The lost time thus consists of the following independent components:

  • losses due to execution of sequential parts on all processors;

  • losses due to processor load imbalance;

  • losses due to interprocessor communications.

These components are calculated using the total time of parallel execution. The lost time therefore also includes the time that processors spend idling; that is, variations between the execution times on different processors increase the lost time. These variations can arise because common actions (reduction operations, execution of sequential or parallel parts) take different times on different processors. Moreover, such processor load imbalance at different stages of parallel program execution can increase processor synchronization time. Imbalance characteristics should therefore be calculated and reported to the user. In addition, for collective operations requiring processor synchronization (reduction operations and loading of buffers), a dissynchronization time is determined: the time lost because the collective operation does not start simultaneously on different processors.

For a more detailed analysis of program efficiency, the user should be able to obtain characteristics of each processor's participation in the execution of the parallel program. In addition, the user can apply special language features to split the program into intervals and obtain performance characteristics for each interval.

2. The Content of PPPA

The performance analyzer consists of two subsystems: an accumulation subsystem and an information processing subsystem.

The accumulation subsystem collects the execution characteristics of the parallel program on each processor. It is called from Lib-DVM during parallel program execution. In addition, the C-DVM and Fortran DVM languages have features for describing the intervals of program execution for which the user would like to obtain performance characteristics. The compiler inserts calls to the accumulation subsystem at the beginning and the end of each interval. The information from every processor is written to a file when the program terminates.

The information processing subsystem runs on a workstation. It processes the information gathered on the parallel computer and outputs the performance characteristics requested by the user.

3. Approach and Principle for PPPA Implementation

3.1. Splitting program into intervals

The execution of the program can be represented as a sequence of intervals. By default, the whole program is one interval. The user can also define intervals by means of the C-DVM and Fortran DVM languages. There is also a compilation mode in which every parallel loop is an interval.

The user can also split any interval into smaller intervals or unite neighboring intervals (in order of execution) into a new one, that is, present the program as a hierarchy of intervals of several levels (the whole program is the interval of the highest level).

The mechanism of splitting the program into intervals allows a more detailed analysis of the program's behavior during execution. When viewing results in the performance analyzer, the user can set the depth of detail so that intervals of the requested levels are left out of consideration.
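As an illustration of the interval hierarchy, the following Python sketch models intervals as a tree and filters them by depth of detail. The class, the function and the level numbering (0 for the whole program, larger numbers for nested intervals) are assumptions made for this example only; the document does not specify how PPPA represents intervals internally.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Interval:
    """Hypothetical node of the interval hierarchy."""
    name: str
    level: int  # assumed: 0 = whole program, deeper intervals have larger levels
    children: List["Interval"] = field(default_factory=list)


def visible_intervals(root: Interval, max_level: int) -> Iterator[Interval]:
    """Yield intervals no deeper than max_level, in depth-first order."""
    if root.level > max_level:
        return
    yield root
    for child in root.children:
        yield from visible_intervals(child, max_level)


# The whole program with two parallel loops, one of which is split further.
program = Interval("program", 0, [
    Interval("loop 1", 1, [Interval("loop 1, first half", 2)]),
    Interval("loop 2", 1),
])
names = [interval.name for interval in visible_intervals(program, max_level=1)]
```

Here `names` contains the whole program and the two loops, while the level 2 interval is left out of consideration.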

3.2. Accumulation of the information

The following information is stored in memory on each processor and written to a file:

  1. Information on the runtime system and the hardware and software environment (dimensions of the multiprocessor system, number of processors, type of communication system, the number of the input-output processor, and the number of the processor executing reduction operations).

  2. Information on each interval (type of the interval, number of the interval, number of its level, number of entries into the interval, the name of the source file and the number of the line in it corresponding to the beginning of the interval) and the following times for this interval:

  • Execution time TTotal_execution_time.

  • Time of exchanges with files TFile_exchange.

  • Time of message passing during execution of the file exchange functions TMessages_in_file_exchange.

  • Waiting time for completion of a reduction TReduction_waiting.

  • Waiting time for updating of shadow edges TShadow_waiting.

  • Time of buffer loading for access to remote elements of distributed arrays TLoad_buffer.

  • Time of data movement during redistribution of arrays TArray_distribution.

  • Processor time of replicated calculations TReplicated_CPU_time.

    This is the part of the execution time during which the processor executes a sequential branch, or the same parallel branch (the same iterations of a parallel loop) together with other processors, minus exchange and waiting times.

  • Processor time of distributed calculations TDistributed_CPU_time.

    This is the part of the execution time during which the processor executes a parallel branch independently (iterations of a parallel loop), minus exchange and waiting times.

  3. Information on the times of operations requiring synchronization (start and waiting of a reduction, start and waiting of updating of shadow edges, loading of buffers).
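The per-interval record described in item 2 could be held in a structure like the one sketched below in Python. The field names mirror the times listed above, but the class names, field types, time unit and sample values (including the file name `example.c`) are assumptions for illustration, not the actual PPPA file format.

```python
from dataclasses import dataclass, asdict


@dataclass
class IntervalTimes:
    """Times accumulated for one interval on one processor (unit assumed: seconds)."""
    total_execution_time: float = 0.0
    file_exchange: float = 0.0
    messages_in_file_exchange: float = 0.0
    reduction_waiting: float = 0.0
    shadow_waiting: float = 0.0
    load_buffer: float = 0.0
    array_distribution: float = 0.0
    replicated_cpu_time: float = 0.0
    distributed_cpu_time: float = 0.0


@dataclass
class IntervalRecord:
    """Hypothetical record for one interval, as it might be written to the file."""
    interval_type: str
    number: int
    level: int
    entries: int
    source_file: str
    source_line: int
    times: IntervalTimes


# Illustrative values only.
record = IntervalRecord(
    interval_type="parallel loop", number=1, level=1, entries=10,
    source_file="example.c", source_line=42,
    times=IntervalTimes(total_execution_time=3.5, shadow_waiting=0.5,
                        distributed_cpu_time=3.0),
)
as_dict = asdict(record)  # nested dictionary, convenient for writing to a file
```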

3.3. Processing the information

The information saved to a file during parallel execution of the program is then processed on a workstation as follows.

3.3.1. Calculation of the comparative characteristics of execution on different processors

For each program execution characteristic accumulated on the different processors, the two processors with the maximal and minimal values are determined. For each characteristic, the numbers of these two processors and three values of the characteristic (minimal, maximal and average) are given.
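A minimal Python sketch of this comparison for one characteristic is shown below. The function name and the 0-based processor numbering are assumptions of the example.

```python
def compare_across_processors(values):
    """Summarize one characteristic; values[i] was measured on processor i.

    Processor numbers are 0-based list indices in this sketch.
    """
    indices = range(len(values))
    i_min = min(indices, key=values.__getitem__)
    i_max = max(indices, key=values.__getitem__)
    return {
        "min": values[i_min],
        "min_processor": i_min,
        "max": values[i_max],
        "max_processor": i_max,
        "average": sum(values) / len(values),
    }


# Illustrative values of one characteristic on three processors.
summary = compare_across_processors([4.0, 2.0, 6.0])
```

For these sample values the minimum 2.0 is on processor 1, the maximum 6.0 is on processor 2, and the average is 4.0.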

3.3.2. Calculation of the dissynchronization characteristics of collective operations

For collective operations requiring synchronization of processors (reduction, updating of shadow edges), the corresponding dissynchronization time is determined for each processor: the time lost because the collective operation does not start simultaneously on different processors.

  • Dissynchronization time of reduction execution Ti_Red_synchronization.

    Tmax = max (Ti_ret_REDUCTION_START)

    Ti_Red_synchronization = max (0, Tmax - Ti_call_REDUCTION_WAIT)

    where i is the number of the processor and the maximum is taken over all processors.

    The time of the beginning of reduction execution, Tmax, is calculated as the time at which the last of the processors called the reduction operation. For each processor Pi that executed the reduction termination operation before the time of the beginning of reduction execution, the dissynchronization time is equal to the difference Tmax - Ti_call_REDUCTION_WAIT.

  • Dissynchronization time of updating shadow edges Ti_Shadow_synchronization.

    Tmax = max (Tj_ret_SHADOW_START, Ti_call_SHADOW_WAIT),

    where j ranges over the numbers of the neighbor processors exporting data to processor i.

    Ti_Shadow_synchronization = Tmax - Ti_call_SHADOW_WAIT

    For each processor that executed the operation SHADOW_WAIT before all its neighbors had executed the operation SHADOW_START, the dissynchronization time is equal to the difference between the time at which the last of the neighbor processors executed SHADOW_START and the time of SHADOW_WAIT.
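Both dissynchronization formulas can be written out as a short Python sketch. The function names, the 0-based processor numbering, the `exporters` mapping and the sample timestamps are assumptions of the example; the arithmetic follows the formulas for Ti_Red_synchronization and Ti_Shadow_synchronization.

```python
def reduction_dissynchronization(ret_reduction_start, call_reduction_wait):
    """Ti_Red_synchronization for every processor i.

    ret_reduction_start[i] -- time of return from REDUCTION_START on processor i
    call_reduction_wait[i] -- time of the call of REDUCTION_WAIT on processor i
    """
    t_max = max(ret_reduction_start)
    return [max(0.0, t_max - t_wait) for t_wait in call_reduction_wait]


def shadow_dissynchronization(ret_shadow_start, call_shadow_wait, exporters):
    """Ti_Shadow_synchronization for every processor i.

    exporters[i] lists the neighbor processors j that export data to i.
    """
    result = []
    for i, t_wait in enumerate(call_shadow_wait):
        neighbor_starts = [ret_shadow_start[j] for j in exporters[i]]
        t_max = max(neighbor_starts + [t_wait])
        result.append(t_max - t_wait)
    return result


# Illustrative timestamps for three processors arranged in a line.
red = reduction_dissynchronization([1.0, 3.0, 2.0], [1.5, 3.5, 2.5])
shadow = shadow_dissynchronization(
    ret_shadow_start=[1.0, 4.0, 2.0],
    call_shadow_wait=[2.0, 5.0, 3.0],
    exporters={0: [1], 1: [0, 2], 2: [1]},
)
```

In the reduction example the last processor starts the reduction at time 3.0, so the processors that reached the wait at 1.5 and 2.5 lose 1.5 and 0.5 respectively, and the processor that reached it at 3.5 loses nothing.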

3.3.3. Calculation of the general characteristics

These characteristics describe the whole parallel program and its intervals and are calculated from the characteristics obtained on each processor. Depending on the degree of detail requested by the user (level of intervals), the following information on program execution performance can be given.

  1. Execution time TTotal_execution_time.

TTotal_execution_time = max (Ti_Total_execution_time),

where i is the number of the processor.

This is the maximum of the execution times on the individual processors.

  2. Time of exchanges with files TFile_exchange_time on all processors.

TFile_exchange_time = Σ (Ti_File_exchange - Ti_Messages_in_file_exchange), i = 1..N,

where N is the number of processors.

This is the time of execution of exchanges with files minus the time lost on message passing during execution of the file exchange functions.

  3. Processor time TCPU_time.

TCPU_time = Tnio_Replicated_CPU_time + Σ Ti_Distributed_CPU_time, i = 1..N,

where nio is the number of the input/output processor and N is the number of processors.

This is the processor time of replicated calculations on the input-output processor plus the processor times of distributed calculations on all processors.

  4. Effective time of execution on one processor TSequential_execution_time.

TSequential_execution_time = TCPU_time + TFile_exchange_time

  5. Parallelization efficiency coefficient KParallelization_efficiency.

KParallelization_efficiency = TSequential_execution_time / (TTotal_execution_time * N),

where N is the number of processors.
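The general characteristics can be computed from per-processor values as in the following Python sketch. It assumes that the file exchange time and the distributed CPU time are summed over all processors, as the descriptions "on all processors" and "of all processors" suggest, and that the effective time uses the file exchange time from item 2. The function name, dictionary keys, 0-based processor numbering and sample values are hypothetical.

```python
def general_characteristics(per_processor, io_processor):
    """Compute the general characteristics of one interval.

    per_processor -- list of dicts, one per processor, with accumulated times
    io_processor  -- index of the input/output processor (0-based here)
    """
    n = len(per_processor)
    # 1. Execution time: maximum over processors.
    total_execution_time = max(p["total_execution_time"] for p in per_processor)
    # 2. File exchange time: assumed to be summed over all processors.
    file_exchange_time = sum(
        p["file_exchange"] - p["messages_in_file_exchange"] for p in per_processor
    )
    # 3. CPU time: replicated part counted once (on the I/O processor),
    #    distributed part assumed to be summed over all processors.
    cpu_time = per_processor[io_processor]["replicated_cpu_time"] + sum(
        p["distributed_cpu_time"] for p in per_processor
    )
    # 4. Effective (sequential) execution time.
    sequential_execution_time = cpu_time + file_exchange_time
    # 5. Parallelization efficiency coefficient.
    efficiency = sequential_execution_time / (total_execution_time * n)
    return {
        "total_execution_time": total_execution_time,
        "file_exchange_time": file_exchange_time,
        "cpu_time": cpu_time,
        "sequential_execution_time": sequential_execution_time,
        "parallelization_efficiency": efficiency,
    }


# Illustrative values for two processors; processor 0 performs input/output.
result = general_characteristics(
    [
        {"total_execution_time": 10.0, "file_exchange": 2.0,
         "messages_in_file_exchange": 0.5, "replicated_cpu_time": 1.0,
         "distributed_cpu_time": 6.0},
        {"total_execution_time": 9.0, "file_exchange": 0.0,
         "messages_in_file_exchange": 0.0, "replicated_cpu_time": 1.0,
         "distributed_cpu_time": 6.5},
    ],
    io_processor=0,
)
```

For these sample values the file exchange time is 1.5, the CPU time is 13.5, the effective time is 15.0, and the efficiency coefficient is 15.0 / (10.0 * 2) = 0.75.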
