Cooper_Engineering_a_Compiler(Second Edition) (Rice), страница 90
Файл "Cooper_Engineering_a_Compiler(Second Edition)" внутри архива находится в следующих папках: Rice, Купер и Торчсон - перевод. PDF-файл из архива "Rice", который расположен в категории "разное". Всё это находится в предмете "конструирование компиляторов" из седьмого семестра, которые можно найти в файловом архиве МГУ им. Ломоносова. Не смотря на прямую связь этого архива с МГУ им. Ломоносова, его также можно найти и в других разделах. .
Просмотр PDF-файла онлайн
Текст 90 страницы из PDF
Consider the array A[1. . .2,1. . .4]. Conceptually, it looks likeA1,1 1,2 1,3 1,42,1 2,2 2,3 2,4In linear algebra, the row of a two-dimensional matrix is its first dimension, and the column is its second dimension. In row-major order, theelements of a are mapped onto consecutive memory locations so that adjacent elements of a single row occupy consecutive memory locations. Thisproduces the following layout:1,1 1,2 1,3 1,4 2,1 2,2 2,3 2,4The following loop nest shows the effect of row-major order on memoryaccess patterns:for i ← 1 to 2for j ← 1 to 4A[i,j] ← A[i,j] + 1In row-major order, the assignment statement steps through memory insequential order, beginning with A[1,1], A[1,2], A[1,3], and on throughA[2,4].
This sequential access works well with most memory hierarchies.Moving the i loop inside the j loop produces an access sequence that jumpsbetween rows, accessing A[1,1], A[2,1], A[1,2],..., A[2,4]. For asmall array like a, this is not a problem. For arrays that are larger than thecache, the lack of sequential access could produce poor performance in thememory hierarchy. As a general rule, row-major order produces sequentialaccess when the rightmost subscript, j in this example, varies fastest.362 CHAPTER 7 Code ShapeFORTRAN uses column-major order.The obvious alternative to row-major order is column-major order. Itkeeps the columns of a in contiguous locations, producing the followinglayout:1,1 2,1 1,2 2,2 1,3 2,3 1,4 2,4Column-major order produces sequential access when the leftmost subscriptvaries fastest.
In our doubly nested loop, having the i loop in the outer position produces nonsequential access, while moving the i loop to the innerposition would produce sequential access.A third alternative, not quite as obvious, has been used in several languages.This scheme uses indirection vectors to reduce all multidimensional arraysto a set of vectors.
For our array a, this would produceA1,1 1,2 1,3 1,42,1 2,2 2,3 2,4Each row has its own contiguous storage. Within a row, elements areaddressed as in a vector. To allow systematic addressing of the row vectors,the compiler allocates a vector of pointers and initializes it appropriately. Asimilar scheme can create column-major indirection vectors.Indirection vectors appear simple, but they introduce their own complexity.First, indirection vectors require more storage than either of the contiguousstorage schemes, as shown graphically in Figure 7.10. Second, this schemerequires that the application initialize, at runtime, all of the indirection pointers.
An advantage of the indirection vector approach is that it allows easyimplementation of ragged arrays, that is, arrays where the length of the lastdimension varies.Each of these schemes has been used in a popular programming language.For languages that store arrays in contiguous storage, row-major order hasbeen the typical choice; the one notable exception is fortran, which usescolumn-major order. Both bcpl and Java support indirection vectors.7.5.3 Referencing an Array ElementPrograms that use arrays typically contain references to individual array elements. As with vectors, the compiler must translate an array reference intoa base address for the array’s storage and an offset where the element islocated relative to the starting address.7.5 Storing and Accessing Arrays 3631,1,1 1,1,2 1,1,3 1,1,41,2,1 1,2,2 1,2,3 1,2,4B1,3,1 1,3,2 1,3,3 1,3,42,1,1 2,1,2 2,1,3 2,1,42,2,1 2,2,2 2,2,3 2,2,42,3,1 2,3,2 2,3,3 2,3,4n FIGURE 7.10 Indirection Vectors in Row-Major Order for B[1...2,1...3,1...4].This section describes the address calculations for arrays stored as a contiguous block in row-major order and as a set of indirection vectors.
Thecalculations for column-major order follow the same basic scheme as thosefor row-major order, with the dimensions reversed. We leave those equationsfor the reader to derive.Row-Major OrderIn row-major order, the address calculation must find the start of the row andthen generate an offset within the row as if it were a vector. Extending thenotation that we used to describe the bounds of a vector, we add subscripts tolow and high that specify a dimension. Thus, low1 refers to the lower boundof the first dimension, and high2 refers to the upper bound of the seconddimension. In our example A[1...2,1...4], low1 is 1 and high2 is 4.To access element A[i,j], the compiler must emit code that computesthe address of row i and follow that with the offset for element j, whichwe know from Section 7.5.1 will be (j − low2 ) × w.
Each row containsfour elements, computed as high2 − low2 + 1, where high2 is the highestnumbered column and low2 is the lowest-numbered column—the upper andlower bounds for the second dimension of A. To simplify the exposition, letlenk = highk − lowk + 1, the length of the kth dimension. Since rows arelaid out consecutively, row i begins at (i − low1 ) × len2 × w from the startof A. This suggests the address computation@A + (i − low1 ) × len2 × w + (j − low2 ) × w364 CHAPTER 7 Code ShapeSubstituting actual values for i, j, low1 , high2 , low2 , and w, we find thatA[2,3] lies at offset(2 − 1) × (4 − 1 + 1) × 4 + (3 − 1) × 4 = 2from A[1,1] (assuming that @A points at A[1,1], at offset 0). Looking at Ain memory, we find that the address of A[1,1] + 24 is, in fact, the addressof A[2,3].04812162024281,1 1,2 1,3 1,4 2,1 2,2 2,3 2,4@AA[2,3]In the vector case, we were able to simplify the calculation when upper andlower bounds were known at compile time.
Applying the same algebra tocreate a false zero in the two-dimensional case produces@A + (i × len2 × w) − (low1 × len2 × w) + (j × w) − (low2 × w), or@A + (i × len2 × w) + (j × w) − (low1 × len2 × w + low2 × w)The last term, (low1 × len2 × w + low2 × w), is independent of i and j, soit can be factored directly into the base address@A0 = @A − (low1 × len2 × w + low2 × w) = @A − 20Now, the array reference is simply@A0 + i × len2 × w + j × wFinally, we can refactor and move the w outside, saving an extraneousmultiply@A0 + (i × len2 + j) × wFor the address of A[2,3], this evaluates to@A0 + (2 × 4 + 3) × 4 = @A0 + 44Since @A0 is just @A − 20, this is equivalent to @A − 20 + 44 = @A + 24,the same location found with the original version of the array addresspolynomial.7.5 Storing and Accessing Arrays 365If we assume that i and j are in ri and rj , and that len2 is a constant, thisform of the polynomial leads to the following code sequence:loadI@A0multIaddmultIloadAOri , len2r1 , rjr2 , 4r@A0 , r3⇒⇒⇒⇒⇒r@A0// adjusted base for Ar1r2r3ra////////i × len2+ jx element length, 4value of A[i,j]In this form, we have reduced the computation to two multiplications andtwo additions (one in the loadAO).
The second multiply can be rewritten asa shift.If the compiler does not have access to the array bounds, it must either compute the false zero at runtime or use the more complex polynomial thatincludes the subtractions that adjust for lower bounds. The former optioncan be profitable if the elements of the array are accessed multiple times ina procedure; computing the false zero on entry to the procedure lets the codeuse the less expensive address computation.
The more complex computationmakes sense only if the array is accessed infrequently.The ideas behind the address computation for arrays with two dimensionsgeneralize to arrays of higher dimension. The address polynomial for anarray stored in column-major order can be derived in a similar fashion.The optimizations that we applied to reduce the cost of address computations apply equally well to the address polynomials for these other kinds ofarrays.Indirection VectorsUsing indirection vectors simplifies the code generated to access an individual element. Since the outermost dimension is stored as a set of vectors,the final step looks like the vector access described in Section 7.5.1.
ForB[i,j,k], the final step computes an offset from k, the outermost dimension’s lower bound, and the length of an element for B. The preliminarysteps derive the starting address for this vector by following the appropriatepointers through the indirection-vector structure.Thus, to access element B[i,j,k] in the array B shown in Figure 7.10, thecompiler uses @B0 , i, and the length of a pointer, to find the vector for thesubarray B[i,*,*]. Next, it uses that result, along with j and the length ofa pointer to find the vector for the subarray B[i,j,*]. Finally, it uses thatbase address in the vector-address computation with k and element length wto find the address of B[i,j,k].366 CHAPTER 7 Code ShapeIf the current values for i,j, and k exist in registers ri ,rj , and rk , respectively, and @B0 is the zero-adjusted address of the first dimension, thenB[i,j,k] can be referenced as follows:loadI @B0⇒ r@B0multI ri , 4⇒ r1loadAO r@B0 , r1 ⇒ r2// false zero of B// assume pointer is 4 bytes// get @B[i,*,*]multI rj , 4loadAO r2 , r3⇒ r3⇒ r4// pointer is 4 bytes// get @B[i,j,*]multI rk , 4loadAO r4 , r5⇒ r5⇒ rb// assume element length is 4// value of B[i,j,k]This code assumes that the pointers in the indirection structure have alreadybeen adjusted to account for nonzero lower bounds.