fdvmLDe (1158336), страница 7

Файл №1158336 fdvmLDe (Раздаточные материалы) 7 страницаfdvmLDe (1158336) страница 72019-09-182019-09-18СтудИзба

Просмтор этого файла доступен только зарегистрированным пользователям. Но у нас супер быстрая регистрация: достаточно только электронной почты!

Регистрация/авторизация

Текст из файла (страница 7)

Shadow edges for multidimensional distributed arrays can be specified for each dimension. A special case is when it is required to renew "a corner" of shadow edges. In such a case additional CORNER parameter is needed.

Example 6.2. Specification of SHADOW-references with corner elements

REAL A(100,100), B(100,100)

CDVM$ ALIGN B(I,J) WITH A(I,J)

CDVM$ DISTRIBUTE A (BLOCK,BLOCK)

. . .

CDVM$ PARALLEL (I,J) ON A (I,J), SHADOW_RENEW (B(CORNER))

DO 10 I = 2, 99

DO 10 J = 2, 99

A(I,J) = (B(I,J+1) + B(I+1,J) + B(I+1,J+1) ) / 3

10 CONTINUE

The width of shadow edges of the array B is equal to 1 for both dimensions by default. As "corner" reference B(I+1,J+1) exists, the CORNER parameter is specified.


										shadow edges


										sent values


										internal area


										corner elements

Fig. 6.2. Scheme of array local section with shadow edges.

6.2.3Computing values in shadow edges. SHADOW_COMPUTE clause

In sections 6.2.1 and 6.2.2 the ways to renew values in shadow edges by data exchange between processors were described. At that new values of an array are calculated in one loop, but renewing is performed either before execution or during execution of other loop.

Under some conditions the values in shadow edges can be renewed without data exchange between processors. New values of shadow edges can be calculated in the same loop, where new values of the arrays are calculated. This way of shadow edge renewing is described by SHADOW_COMPUTE specification, which is clause of PARALLEL directive.

Let consider a parallel loop

CDVM$ PARALLEL (I1,I2,…,In) ON A(f1,f2,…,fk)

where A is identifier of the array, the loop iterations are distributed according to.

Let

{LH} - set of references to distributed arrays in left parts of the assignment statements of

the loop body.

{RH} - set of references to distributed arrays in right parts (expressions) of the assignment

statements of the loop body.

The following conditions must be satisfied to apply SHADOW_COMPUTE specification:

the width of shadow edges of distributed dimensions of arrays from {LH} and {RH} must be less than the width of shadow edges of corresponding dimensions of array A;
shadow edges of arrays from {RH} must be filled with the values, corresponding to values of the arrays.

During the loop execution the values of shadow edges of arrays from {LH} are renewed. The width of renewed part of every shadow edge is equal to the width of corresponding shadow edge of array A.

Example 6.3. SHADOW_COMPUTE specification.

CDVM$ DISTRIBUTE (BLOCK) :: A, B, C, D

CDVM$ SHADOW A(1:2)

CDVM$ SHADOW B(2:2)

CDVM$ PARALLEL ( I ) ON C( I ), SHADOW_COMPUTE,

CDVM$* SHADOW_RENEW( A, B )

DO 10 I = 1, N

C(I) = A(I) + B(I)

D(I) = A(I) - B(I)

10 CONTINUE

As by default the width of shadow edges for C and D array is 1, the condition 1) is satisfied. Performing of SHADOW_RENEW specification satisfies the condition 2).

6.2.4ACROSS specification of dependent references of SHADOW type for single loop

Consider the following loop

DO 10 I = 2, N-1

DO 10 J = 2, N-1

A(I,J) = (A(I,J-1) + A(I,J+1) + A(I-1,J) + A(I+1,J)) / 4

10 CONTINUE

Data dependence exists between loop index i1 and i2 ( i1<i2 ), if both these iterations refer to the same array element by write-read or read-write scheme.

If iteration i1 writes a value and iteration i2 reads this value, then flow dependence, or simply dependence exists between the iterations.

If iteration i1 reads the "old" value and iteration i2 writes the "new" value, then anti-dependence i1 i2 exists between the iterations.

In both cases iteration i2 can be executed only after iteration i1.

The value i2 - i1 is called a range or length of dependence. If for any iteration i dependent iteration i + d (d is constant) exists, then the dependence is called regular one, or dependence with constant length.

The loop with regular computations, with regular dependencies on distributed arrays, can be distributed with PARALLEL directive, using ACROSS clause.

across-clause	is ACROSS ( dependent-array-list )

dependent-array	is dist-array-name ( dependence-list ) [(section-spec-list)]

dependence	is flow-dep-length : anti-dep-length

flow-dep-length	is int-constant

anti-dep-length	is int-constant

section-spec	is SECTION (section-subscript-list)

All the distributed arrays with regular data dependence are specified in ACROSS clause. The length of flow dependence (flow-dep-length) and the length of anti-dependence (anti-dep-length) are specified for each dimension of the array. There is no data dependence, if length is equal to zero.

Constraint:

In each array reference, data dependence may appear only in one distributed dimension. For example, A(I-1,J), A(I,J+1) references are allowed, but A(I‑1,J+1), A(I+1,J-1) references are forbidden.

If only the values of the sections of array are renewed in the loop but not whole array then these sections should be specified by SECTION clause.

Example 6.4. Specification of the loop with regular data dependence.

CDVM$ PARALLEL (I,J) ON A(I,J), ACROSS (A(1:1,1:1))

DO 10 I = 2, N-1

DO 10 J = 2, N-1

A(I,J) = (A(I,J-1) + A(I,J+1) + A(I-1,J) + A(I+1,J))/4

10 CONTINUE

Flow and anti- dependencies of length 1 exist for all dimensions of the array A.

ACROSS specification is implemented via shadow edges. The length of anti-dependence defines width of high edge renewing, and length of flow-dependence defines width of low edge renewing. High edges are renewed prior the loop execution (as for SHADOW_RENEW directive). Low edges are renewed during loop execution as remote data calculation proceeds. Actually, ACROSS-references are subset of SHADOW-references, that have data dependence.

Efficiency of parallel execution of ACROSS loop

An array dimension with data dependence will be called recurrent dimension.

Efficiency degree of ACROSS loop parallel execution depends on the number of distributed recurrent dimensions.

One-dimensional array. For one-dimensional distributed array with recurrent dimension only sequential execution is possible (see fig. 6.3).

Multi-dimensional array. For multi-dimensional array the following combinations of recurrent distributed dimensions can be selected according to degree of efficiency decreasing.

At least one non-recurrent dimension exists. The array and the loop are distributed along this dimension only. The loop is executed as usual parallel loop without ACROSS specification.

Example 6.5. Non-recurrent dimension parallelization.

CDVM$ DISTRIBUTE A( BLOCK, * )

CDVM$ PARALLEL ( I ) ON A( I, * )

DO 30 I = 1,N1

DO 30 J = 2,N2-1

30 A(I,J) = A(I,J-1) + A(I,J+1)

Note, that this way may be not most efficient, if N1 is much less then N2 and the number of processors (insufficient parallelism).

Only one recurrent dimension is distributed. Other dimensions are localized on every processor. Run-time system performs pipeline parallelization (see fig. 6.4). The size of pipeline step is defined on every computer automatically, depending on the loop execution time and the time of data passing when renewing shadow edges.

Example 6.6. Pipeline parallelization.

CDVM$ DISTRIBUTE A( BLOCK, * )

CDVM$ PARALLEL ( I, J ) ON A( I, J ), ACROSS( A( 1:1, 1:1 ))

DO 40 I = 2,N1-1

DO 40 J = 2,N2-1

40 A(I,J) = A(I,J-1) + A(I,J+1) + A(I-1,J) + A(I+1,J)

Constraint of pipeline parallelization. For implementation of pipeline parallelization the following additional conditions must be satisfied:

PARALLEL directive specifies two loop headers as minimum. One loop specifies distributed recurrent dimension, other loop specifies a local dimension of the array.
If several arrays are specifying in ACROSS loop, these arrays should be aligned along recurrent distributed dimension and along the local dimension, indexed by the parallel loop.

m>1 recurrent dimensions exist. The virtual processor arrangement contains m dimensions too. Run-time system automatically organizes parallel execution on hyper-planes of processor arrangement. Every hyper-plane has m-1 dimensions.

Example 6.7. Hyper-plane parallelization.

CDVM$ DISTRIBUTE A( BLOCK, BLOCK )

CDVM$ PARALLEL ( I, J ) ON A( I, J ) , ACROSS ( A( 1:1, 1:1 ))

DO 50 I = 2,N1-1

DO 50 J = 2,N2-1

50 A(I,J) = A(I,J-1) + A(I,J+1) + A(I-1,J) + A(I+1,J)

Two-dimensional arrangement of virtual processors is represented on fig. 6.5. The computations on processors, belonging to one hyper-plane (diagonal line), will be executed in parallel.


p0	t1




p1	t2




p2	t3

Fig. 6.3. Sequential execution


p0	t1	t2	t3




p1	t2	t3




p2	t3

Характеристики

Тип файла

Документ

Размер

605,5 Kb

Материал

Раздаточные материалы

Тип материала

Другое

Предмет

Модели параллельных вычислений и DVM технология разработки параллельных программ

Высшее учебное заведение

МГУ им. Ломоносова

Список файлов учебной работы

razdatochnye-materialy.rar

Раздаточные материалы

SAGE

stage-fin

2000_11_NT

doc

Installation guide (rus).doc

Installation guide.doc

User guide (rus).doc

User guide.doc

2000_11_UNIX

doc

Installation guide (rus).doc

Installation guide.doc

User guide (rus).doc

Полное содержание архива

Поделитесь ссылкой:

Ставлю 10/10
Все нравится, очень удобный сайт, помогает в учебе. Кроме этого, можно заработать самому, выставляя готовые учебные материалы на продажу здесь. Рейтинги и отзывы на преподавателей очень помогают сориентироваться в начале нового семестра. Спасибо за такую функцию. Ставлю максимальную оценку.

Лучшая платформа для успешной сдачи сессии
Познакомился со СтудИзбой благодаря своему другу, очень нравится интерфейс, количество доступных файлов, цена, в общем, все прекрасно. Даже сам продаю какие-то свои работы.

Студизба ван лав ❤
Очень офигенный сайт для студентов. Много полезных учебных материалов. Пользуюсь студизбой с октября 2021 года. Серьёзных нареканий нет. Хотелось бы, что бы ввели подписочную модель и сделали материалы дешевле 300 рублей в рамках подписки бесплатными.

Отличный сайт
Лично меня всё устраивает - и покупка, и продажа; и цены, и возможность предпросмотра куска файла, и обилие бесплатных файлов (в подборках по авторам, читай, ВУЗам и факультетам). Есть определённые баги, но всё решаемо, да и администраторы реагируют в течение суток.

Маленький отзыв о большом помощнике!
Студизба спасает в те моменты, когда сроки горят, а работ накопилось достаточно. Довольно удобный сайт с простой навигацией и огромным количеством материалов.

Студ. Изба как крупнейший сборник работ для студентов
Тут дофига бывает всего полезного. Печально, что бывают предметы по которым даже одного бесплатного решения нет, но это скорее вопрос к студентам. В остальном всё здорово.

Спасательный островок
Если уже не успеваешь разобраться или застрял на каком-то задание поможет тебе быстро и недорого решить твою проблему.

Всё и так отлично
Всё очень удобно. Особенно круто, что есть система бонусов и можно выводить остатки денег. Очень много качественных бесплатных файлов.

Отзыв о системе "Студизба"
Отличная платформа для распространения работ, востребованных студентами. Хорошо налаженная и качественная работа сайта, огромная база заданий и аудитория.

Отличный помощник
Отличный сайт с кучей полезных файлов, позволяющий найти много методичек / учебников / отзывов о вузах и преподователях.

Отлично помогает студентам в любой момент для решения трудных и незамедлительных задач
Хотелось бы больше конкретной информации о преподавателях. А так в принципе хороший сайт, всегда им пользуюсь и ни разу не было желания прекратить. Хороший сайт для помощи студентам, удобный и приятный интерфейс. Из недостатков можно выделить только отсутствия небольшого количества файлов.

Спасибо за шикарный сайт
Великолепный сайт на котором студент за не большие деньги может найти помощь с дз, проектами курсовыми, лабораторными, а также узнать отзывы на преподавателей и бесплатно скачать пособия.