Cooper_Engineering_a_Compiler(Second Edition) (Rice), страница 63
Файл "Cooper_Engineering_a_Compiler(Second Edition)" внутри архива находится в следующих папках: Rice, Купер и Торчсон - перевод. PDF-файл из архива "Rice", который расположен в категории "разное". Всё это находится в предмете "конструирование компиляторов" из седьмого семестра, которые можно найти в файловом архиве МГУ им. Ломоносова. Не смотря на прямую связь этого архива с МГУ им. Ломоносова, его также можно найти и в других разделах. .
Просмотр PDF-файла онлайн
Текст 63 страницы из PDF
Using distinctnames exposes those results to analysis and transformation. In practice, mostof the improvement that compilers achieve in optimization arises from capitalizing on context. To make that improvement possible, the ir must exposethe context. Naming can hide context, as when it reuses one name for manydistinct values. It can also expose context, as when it creates a correspondence between names and values. This issue is not specifically a propertyof linear codes; the compiler could use a lower-level ast that exposed theentire address computation.5.4.2 Static Single-Assignment FormSSA forman IR that has a value-based name system,created by renaming and use ofpseudo-operations called φ-functionsSSA encodes both control and value flow.
It isused widely in optimization (see Section 9.3).φ-functionA φ-function takes several names and mergesthem, defining a new name.Static single-assignment form (ssa) is a naming discipline that many moderncompilers use to encode information about both the flow of control and theflow of data values in the program. In ssa form, names correspond uniquelyto specific definition points in the code; each name is defined by one operation, hence the name static single assignment. As a corollary, each use ofa name as an argument in some operation encodes information about wherethe value originated; the textual name refers to a specific definition point.
Toreconcile this single-assignment naming discipline with the effects of control flow, ssa form inserts special operations, called φ-functions, at pointswhere control-flow paths meet.A program is in ssa form when it meets two constraints: (1) each definitionhas a distinct name; and (2) each use refers to a single definition. To transform an ir program to ssa form, the compiler inserts φ-functions at pointswhere different control-flow paths merge and it then renames variables tomake the single-assignment property hold.To clarify the impact of these rules, consider the small loop shown on theleft side of Figure 5.9. The right column shows the same code in ssa form.Variable names include subscripts to create a distinct name for each definition.
φ-functions have been inserted at points where multiple distinct valuescan reach the start of a block. Finally, the while construct has been rewrittenwith two distinct tests, to reflect the fact that the initial test refers to x0 whilethe end-of-loop test refers to x2 .The φ-function’s behavior depends on context. It defines its target ssa namewith the value of its argument that corresponds to the edge along which5.4 Mapping Values to Names 247x0 ← · · ·y0 ← · · ·x ← ···y ← ···while(x < 100)x ← x + 1y ← y + xif (x0 ≥ 100) goto nextloop: x1 ← φ(x0 ,x2 )y1 ← φ(y0 ,y2 )x2 ← x1 + 1y2 ← y1 + x2if (x2 < 100) goto loopnext: x3 ← φ(x0 ,x2 )y3 ← φ(y0 ,y2 )(a) Original Code(b) Code in SSA Formn FIGURE 5.9 A Small Loop in SSA Form.control entered the block.
Thus, when control flows into the loop from theblock above the loop, the φ-functions at the top of the loop body copy thevalues of x0 and y0 into x1 and y1 , respectively. When control flows intothe loop from the test at the loop’s bottom, the φ-functions select their otherarguments, x2 and y2 .On entry to a basic block, all of its φ-functions execute concurrently, beforeany other statement. First, they all read the values of the appropriate arguments, then they all define their target ssa names. Defining their behavior inthis way allows the algorithms that manipulate ssa form to ignore the ordering of φ-functions at the top of a block—an important simplification.
It cancomplicate the process of translating ssa form back into executable code, aswe shall see in Section 9.3.5.ssa form was intended for code optimization. The placement of φ-functionsin ssa form encodes information about both the creation of values and theiruses. The single-assignment property of the name space allows the compiler to sidestep many issues related to the lifetimes of values; for example,because names are never redefined or killed, the value of a name is available along any path that proceeds from that operation.
These two propertiessimplify and improve many optimization techniques.The example exposes some oddities of ssa form that bear explanation. Consider the φ-function that defines x1 . Its first argument, x0 , is defined in theblock that precedes the loop. Its second argument, x2 , is defined later in theblock labelled loop. Thus, when the φ first executes, one of its argumentsis undefined. In many programming-language contexts, this would causeproblems. Since the φ-function reads only one argument, and that argument248 CHAPTER 5 Intermediate RepresentationsTHE IMPACT OF NAMINGIn the late 1980s, we experimented with naming schemes in a FORTRANcompiler.
The first version generated a new temporary register for eachcomputation by bumping a simple counter. It produced large name spaces,for example, 985 names for a 210-line implementation of the singular valuedecomposition (SVD). The name space seemed large for the program size.It caused speed and space problems in the register allocator, where thesize of the name space governs the size of many data structures. (Today,we have better data structures and faster machines with more memory.)The second version used an allocate/free protocol to manage names. Thefront end allocated temporaries on demand and freed them when theimmediate uses were finished.
This scheme used fewer names; for example,SVD used roughly 60 names. It sped up allocation, reducing, for example,the time to find live variables in SVD by 60 percent.Unfortunately, associating multiple expressions with a single temporaryname obscured the flow of data and degraded the quality of optimization.The decline in code quality overshadowed any compile-time benefits.Further experimentation led to a short set of rules that yielded strongoptimization while mitigating growth in the name space.1.
Each textual expression received a unique name, determined byentering the operator and operands into a hash table. Thus, eachoccurrence of an expression, for example, r17 +r21 , targeted thesame register.2. In hopi ri , rj ⇒ rk , k was chosen so that i,j < k.3. Register copy operations (i2i ri ⇒ rj in ILOC) were allowed to havei > j only if rj corresponded to a scalar program variable. The registersfor such variables were only defined by copy operations. Expressionsevaluated into their "natural" register and then were moved into theregister for the variable.4. Each store operation (store ri ⇒ rj in ILOC) is followed by a copyfrom ri into the variable’s named register.
(Rule 1 ensures that loadsfrom that location always target the same register. Rule 4 ensures thatthe virtual register and memory location contain the same value.)This name-space scheme used about 90 names for SVD, but exposed all theoptimizations found with the first name-space scheme. The compiler usedthese rules until we adopted SSA form, with its discipline for names.corresponds to the most recently taken edge in the cfg, it can never read theundefined value.φ-functions do not conform to a three-address model. A φ-function takesan arbitrary number of operands. To fit ssa form into a three-address ir, the5.4 Mapping Values to Names 249BUILDING SSAStatic single-assignment form is the only IR we describe that does not havean obvious construction algorithm.
Section 9.3 presents the algorithm indetail. However, a sketch of the construction process will clarify some ofthe issues. Assume that the input program is already in ILOC form. Toconvert it to an equivalent linear form of SSA, the compiler must first insertφ-functions and then rename the ILOC virtual registers.The simplest way to insert φ-functions adds one for each ILOC virtual register at the start of each basic block that has more than one predecessorin the control-flow graph. This inserts many unneeded φ-functions; mostof the complexity in the full algorithm is aimed at reducing the number ofextraneous φ-functions.To rename the ILOC virtual registers, the compiler can process the blocks,in a depth-first order. For each virtual register, it keeps a counter.
Whenthe compiler encounters a definition of ri , it increments the counter forri , say to k, and rewrites the definition with the name ri . As the compilerktraverses the block, it rewrites each use of ri with rik until it encountersanother definition of ri . (That definition bumps the counter to k + 1.) Atthe end of a block, the compiler looks down each control-flow edge andrewrites the appropriate φ-function parameter for ri in each block thathas multiple predecessors.After renaming, the code conforms to the two rules of SSA form. Eachdefinition creates a unique name.
Each use refers to a single definition. Several better SSA construction algorithms exist; they insert fewer φ-functionsthan this simple approach.compiler writer must include a mechanism for representing operations withlonger operand lists. Consider the block at the end of a case statement asshown in the margin.The φ-function for x17 must have an argument for each case. A φ-operationhas one argument for each entering control-flow path; thus, it does not fitinto the fixed-arity, three-address scheme.In a simple array representation for three-address code, the compiler writermust either use multiple slots for each φ-operation or use a side data structureto hold the φ-operations’ arguments.
In the other two schemes for implementing three-address code shown in Figure 5.5, the compiler can inserttuples of varying size. For example, the tuples for load and load immediatemight have space for just two names, while the tuple for a φ-operation couldbe large enough to accommodate all its operands.switch on y0x1← ... x2 ← ...x15 ← ... x16 ← ...x17 ←φ (...)250 CHAPTER 5 Intermediate Representations5.4.3 Memory ModelsJust as the mechanism for naming temporary values affects the information that can be represented in an ir version of a program, so, too, does thecompiler’s choice of a storage location for each value.