The size of distributed array element in bytes is returned.
13.4.5 Copying an element of the local part of a distributed array to an element of the local part of another distributed array
| long clocel_( | long | FromArrayHeader[], |
| FromArrayHeader | | header of read distributed array. |
| FromIndexArray | | array whose i-th element is the index value of the read element of the distributed array along the (i+1)-th dimension. |
| ToArrayHeader | | header of the other distributed array, whose element is assigned the read value. |
| ToIndexArray | | array whose j-th element is the index value of the updated element of the distributed array along the (j+1)-th dimension. |
The function can be executed successfully only by a processor whose memory holds both the read and the written elements. The types of the read and written elements must be the same.
The number of copied bytes is returned.
13.4.6 Requesting the address of an element of the local part of a distributed array
| char *GetLocElmAddr( | long | ArrayHeader[], |
| ArrayHeader | | header of the distributed array. |
| IndexArray | | array whose i-th element is the index value of the element of the distributed array along the (i+1)-th dimension. |
The function can be executed successfully only by the processor in whose memory the specified element is allocated.
The pointer to the first byte of the element is returned.
13.5 Macros to access elements of the local part of a distributed array of rank from 1 to 7
The following macros can be used in C programs to access elements of the local part of a distributed array of rank 1 to 7:
| <DAElmType> DAElm<Rank> ( | long | ArrayHeader[], |
| ArrayHeader | | header of the distributed array. |
| Rank | | the rank of the distributed array. |
| DAElmType | | the type of the element of the distributed array. |
| Indexi | | index value of the requested element on the i-th dimension of the distributed array. |
Each of these macros is an L-value in the C language.
Accessing the local part of a distributed array by means of these macros is more efficient than accessing it through the functions described in section 13.4.
It is assumed that the array with the header ArrayHeader was created with the base pointer equal to NULL.
13.6 Sequential requesting of index values of distributed array elements
| long setind_ ( | long | ArrayHeader[], |
| ArrayHeader | | distributed array header. |
| InitIndexArray | | array whose i-th element is the initial index value for the (i+1)-th dimension of the distributed array. |
| LastIndexArray | | array whose i-th element is the last index value for the (i+1)-th dimension of the distributed array. |
| StepArray | | array whose i-th element is the index step for the (i+1)-th dimension, used when indexes are requested sequentially. |
The function setind_ sets the initial values, last values and steps of the indexes of distributed array elements for subsequent requesting and updating of the indexes by the function getind_, described below.
To cover a whole dimension of the distributed array without requesting the size of the object for that dimension (see section 17.2), set the initial value of the index to -1. In this case the initial index value is taken to be 0, the step to be 1, and the last value to be the size of the array in that dimension minus 1.
The function returns zero.
| long getind_ ( | long | ArrayHeader[], |
| ArrayHeader | | distributed array header. |
| NextIndexArray | | array whose i-th element is assigned the next index value for the (i+1)-th dimension. |
The function getind_ is intended for sequentially requesting the next index values of distributed array elements. On the first call it returns the indexes set by the function setind_. After the index values have been written to the array NextIndexArray, they are updated according to the steps specified by the function setind_. The index of a dimension with a larger number changes faster than the index of a dimension with a smaller number (as in the C language).
A non-zero value is returned if the next indexes have been delivered. Zero is returned if the subset of distributed array elements specified by the function setind_ is exhausted.
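The traversal order described above can be illustrated with a small stand-alone model. This is only a sketch of the documented behaviour, not the Run-Time System code: the type SketchIndexIter and the functions sketch_setind and sketch_getind are hypothetical names, the array sizes are passed explicitly instead of being read from a distributed array header, and positive steps are assumed.

```c
#include <assert.h>

#define SKETCH_MAX_RANK 7

/* Hypothetical iterator state. The real Run-Time System keeps the
   corresponding information together with the distributed array. */
typedef struct {
    int  rank;
    long init[SKETCH_MAX_RANK];
    long last[SKETCH_MAX_RANK];
    long step[SKETCH_MAX_RANK];
    long next[SKETCH_MAX_RANK];
    int  exhausted;
} SketchIndexIter;

/* Models setind_: stores initial values, last values and steps.
   An initial value of -1 selects the whole dimension:
   initial 0, step 1, last size-1. Always returns zero. */
long sketch_setind(SketchIndexIter *it, int rank, const long size[],
                   const long init[], const long last[], const long step[])
{
    assert(rank >= 1 && rank <= SKETCH_MAX_RANK);
    it->rank = rank;
    it->exhausted = 0;
    for (int i = 0; i < rank; i++) {
        if (init[i] == -1) {
            assert(size[i] > 0);
            it->init[i] = 0;
            it->last[i] = size[i] - 1;
            it->step[i] = 1;
        } else {
            /* Assumption of this sketch: positive steps, init <= last. */
            assert(step[i] > 0 && init[i] <= last[i]);
            it->init[i] = init[i];
            it->last[i] = last[i];
            it->step[i] = step[i];
        }
        it->next[i] = it->init[i];
    }
    return 0;
}

/* Models getind_: writes the current indexes to next_index[] and then
   advances them, the last dimension changing fastest.
   Returns non-zero while indexes are delivered, zero when exhausted. */
long sketch_getind(SketchIndexIter *it, long next_index[])
{
    if (it->exhausted)
        return 0;
    for (int i = 0; i < it->rank; i++)
        next_index[i] = it->next[i];

    int d = it->rank - 1;
    while (d >= 0) {
        it->next[d] += it->step[d];
        if (it->next[d] <= it->last[d])
            break;                      /* no carry needed */
        it->next[d] = it->init[d];      /* wrap and carry to dimension d-1 */
        d--;
    }
    if (d < 0)
        it->exhausted = 1;              /* carried out of the first dimension */
    return 1;
}
```

For a 2 x 3 array with the first dimension requested in full (initial value -1) and the second dimension traversed from 0 to 2 with step 2, the model delivers the index pairs (0,0), (0,2), (1,0), (1,2) and then returns zero.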
14 Regular access to remote data
If the elements of a distributed array are not located on all of the processors that execute the current branch of the parallel program and require them, the array is called remote, and its elements are called remote elements. Access to remote elements is implemented by creating, on each processor where at least one required element is absent, a special buffer (the local buffer of remote elements) and placing the required remote data in it.
14.1 Creating a remote element buffer for a distributed array
If the parallel program branch that requires remote array elements is a parallel loop iteration, the remote data required by all iterations of the loop is treated as a single whole.
In this case the required elements of the remote distributed array are defined by linear sampling rules (regular access), specified for each of its dimensions as
Ai*Vk(i) + Bi, where:
| i | | the number of the remote distributed array dimension; |
| Vk(i) | | index variable of the k(i)-th dimension of the parallel loop, varying between its initial and last values; |
| Ai, Bi | | integer constants. |
The local parts of the remote elements on each processor executing parallel loop iterations are treated as the local parts of a distributed array, called the global buffer of remote elements (or simply the remote element buffer). The remote element buffer is mapped in the same way as the parallel loop, with its i-th dimension matched to the k(i)-th dimension of the parallel loop (when a linear sampling rule with non-zero Ai is specified for the i-th dimension of the remote array).
Under the sampling rules described above, the required remote elements form a stretched block in the remote array, but they are placed in the buffer in compact (not stretched) form.
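The relation between the stretched set of remote indexes and their compact placement can be sketched for a single dimension as follows. The function names are hypothetical, a unit loop step with an initial value not greater than the last value is assumed, and the placement of the element for loop index v in slot v minus the initial value is an assumption made only for illustration.

```c
#include <assert.h>

/* Index selected along one remote array dimension for loop index value v
   under the linear sampling rule a*v + b. */
long sketch_sampled_index(long a, long v, long b)
{
    return a * v + b;
}

/* Lists the remote indexes required by loop index values v_init..v_last.
   Slot (v - v_init) of out[] receives the index for v, so a strided set
   of remote indexes is recorded contiguously.
   Returns the number of slots filled. */
long sketch_fill_compact(long a, long b, long v_init, long v_last,
                         long out[], long capacity)
{
    assert(a != 0);                          /* non-constant sampling rule */
    assert(v_init <= v_last);                /* assumption of this sketch  */
    assert(v_last - v_init + 1 <= capacity); /* out[] must be large enough */

    long n = 0;
    for (long v = v_init; v <= v_last; v++)
        out[n++] = sketch_sampled_index(a, v, b);
    return n;
}
```

With the rule 2*V + 1 and V running from 0 to 3, the required remote indexes are 1, 3, 5 and 7, and they occupy four consecutive slots.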
Working with a remote element buffer is similar to working with a distributed array.
For optimization, remote element buffers can be combined into groups, so that all buffers of a group are loaded by a single operation (see sections 14.6-14.10).
| long crtrbl_ ( | long void long long long | RemArrayHeader[], *BasePtr, *LoopRefPtr, AxisArray[], |
| RemArrayHeader | | header of the remote distributed array. |
| BufferHeader | | header of a remote element buffer to be created. |
| BasePtr | | base pointer for access to the remote element buffer. |
| *StaticSignPtr | | flag of static buffer creation. |
| *LoopRefPtr | | reference to the parallel loop that requires the remote array elements placed in the buffer. |
| AxisArray | | array whose i-th element is the number k(i+1) of the parallel loop dimension matched with the (i+1)-th dimension of the remote array. |
| CoeffArray | | array whose i-th element is the coefficient of the index variable in the linear sampling rule for the (i+1)-th dimension of the remote array (the coefficient A for dimension i+1). |
| ConstArray | | array whose i-th element is the constant term of the linear sampling rule for the (i+1)-th dimension of the remote array (the constant B for dimension i+1). |
The function crtrbl_ creates a buffer to hold the remote elements of the distributed array with header RemArrayHeader that are required to execute the parallel loop referenced by *LoopRefPtr. The remote array must be mapped onto a processor system each element of which belongs to the current processor system. When the function is invoked, the loop must be current and mapped.
The created buffer is a distributed array whose rank is less than the rank of the remote array by the number of constant linear sampling rules. Its header BufferHeader is an array of 2*r+2 elements of type long, where r is the buffer rank (extended header of a distributed array, see section 13.4). Memory for the buffer header (static or dynamic) is allocated by the user program, and the buffer is initialized by the Run-Time System when the function crtrbl_ is executed.
Each i-th element of the array AxisArray may contain a parallel loop dimension number, 0, or -1. In the first case CoeffArray[i] must be non-zero (non-constant sampling rule). In the second case CoeffArray[i] must be zero (constant sampling rule). The third case defines the (i+1)-th dimension of the remote array as free, i.e. not matched with any parallel loop dimension ("present everywhere").
Thus the rank of the remote element buffer is equal to the number of non-zero elements of the array AxisArray. The size of the j-th dimension of the buffer is defined by the j-th non-zero element of AxisArray (let i be the index of this element). If AxisArray[i] > 0 (linear sampling rule), the size of the j-th dimension of the buffer is equal to the size of the (AxisArray[i])-th dimension of the specified parallel loop (here the loop dimension size is equal to the absolute difference between the initial and last values of the dimension index variable). If AxisArray[i] = -1 (free dimension of the distributed array), the size of the j-th dimension of the buffer is equal to the size of the (i+1)-th dimension of the remote array.
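The rules for the buffer rank and dimension sizes can be sketched as a small helper. This is an illustrative model only: the function sketch_buffer_shape and its parameters loop_init, loop_last and rem_size are hypothetical and are not part of the crtrbl_ interface, and the loop dimension size is computed exactly as worded in the text above (absolute difference between the initial and last values).

```c
#include <assert.h>
#include <stdlib.h>

/* Computes the shape of a remote element buffer from AxisArray-style input.
   axis[i]  > 0 : number (1-based) of the loop dimension matched with
                  remote array dimension i+1; coeff[i] must be non-zero.
   axis[i] == 0 : constant sampling rule; coeff[i] must be zero and the
                  dimension does not appear in the buffer.
   axis[i] == -1: free dimension, copied with its full size.
   Writes dimension sizes to buf_size[] and returns the buffer rank. */
long sketch_buffer_shape(int rem_rank, const long axis[], const long coeff[],
                         const long loop_init[], const long loop_last[],
                         const long rem_size[], long buf_size[])
{
    long rank = 0;
    for (int i = 0; i < rem_rank; i++) {
        if (axis[i] > 0) {
            assert(coeff[i] != 0);
            long k = axis[i] - 1;   /* loop dimension numbers are 1-based */
            /* Size as stated in the text: |last - initial|. */
            buf_size[rank++] = labs(loop_last[k] - loop_init[k]);
        } else if (axis[i] == -1) {
            buf_size[rank++] = rem_size[i];
        } else {
            assert(axis[i] == 0 && coeff[i] == 0);
        }
    }
    return rank;
}
```

For a three-dimensional remote array of sizes 100, 50 and 8 with AxisArray = {2, 0, -1} and a loop whose second dimension runs from 5 to 1, the model gives a buffer of rank 2 with sizes 4 and 8. A header for such a buffer would then need 2*2+2 = 6 elements of type long.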