A.W. Appel, J. Palsberg - Modern Compiler Implementation in Java (Second Edition) (798439), страница 43
Текст из файла (страница 43)
Ifthe degree of a neighbor is already less than K − 1, then the neighbor must be move-related,and is not added to the simplifyWorklist. When the degree of a neighbor transitions from Kto K − 1, moves associated with its neighbors may be enabled.procedure DecrementDegree(m)let d = degree[m]degree[m] ← d-1if d = K thenEnableMoves({m} ∪ Adjacent(m))spillWorklist ← spillWorklist / {m}if MoveRelated(m) thenfreezeWorklist ← freezeWorklist ∪ {m}elsesimplifyWorklist ← simplifyWorklist ∪ {m}procedure EnableMoves(nodes)forall n ∈ nodesforall m ∈ NodeMoves(n)if m ∈ activeMoves thenactiveMoves ← activeMoves / {m}worklistMoves worklistMoves ∪ {m}Only moves in the worklistMoves are considered in the coalesce phase. When a move iscoalesced, it may no longer be move-related and can be added to the simplify work list by theprocedure AddWorkList.
OK implements the heuristic used for coalescing a precoloredregister. Conservative implements the conservative coalescing heuristic.procedure AddWorkList(u)if (u ≠ precolored ∧ not(MoveRelated(u)) ∧ degree[u] < K) thenfreezeWorklist ← freezeWorklist / {u}simplifyWorklist ← simplifyWorklist ∪ {u}function OK(t, r)degree[t] < K ∩ t ∈ precolored ∩ (t, r) ∈ adjSet202function Conservative(nodes)let k = 0forall n ∈ nodesif degree[n] ≥ K then k ← k + 1return (k < K)procedure Coalesce()let m=copy(x, y)) ∈ worklistMovesx ← GetAlias(x)y ← GetAlias(y)if y ∈ precolored thenlet (u, v) = (y, x)elselet (u, v) = (x, y)worklistMoves ← worklistMoves / {m}if (u = v) thencoalescedMoves ← coalescedMoves ∪ {m}AddWorkList(u)else if v ∈ precolored ∩ (u, v) ∈ adjSet thenconstrainedMoves ← constrainedMoves ∪ {m}AddWorkList(u)AddWorkList(v)else if u ∈ precolored ∧ (∀t ∈ Adjacent(v, OK(t, u/)∩u ∉ precolored ∧Conservative(Adjacent(u) ∪ Adjacent(v) thencoalescedMoves ← coalescedMoves ∪ {m}Combine(u, v)AddWorkList(u)elseactiveMoves ← activeMoves ∪ {m}procedure Combine(u, v)if v ∈ freezeWorklist thenfreezeWorklist freezeWorklist / {v}elsespillWorklist ← spillWorklist / {v}coalescedNodes ← coalescedNodes ∪ {v}alias[v] ← umoveList[u] ← moveList[u] ∪ moveList[v]EnableMoves(v)forall t ∈ Adjacent(v)AddEdge(t,u)DecrementDegree(t)if degree[u] ≥ K ∧ u ∈ freezeWorkListfreezeWorkList ← freezeWorkList / {u}spillWorkList ← spillWorkList ∪ {u}function GetAlias (n)if n ∈ coalescedNodes thenGetAlias(alias[n])else nprocedure Freeze()let u ∈ freezeWorklistfreezeWorklist ← freezeWorklist / {u}simplifyWorklist ← simplifyWorklist ∪ {u}203FreezeMoves(u)procedure FreezeMoves(u)forall m(=copy(x, y)) ∈ NodeMoves(u)if GetAlias(y)=GetAlias(u) thenv ← GetAlias(x)elsev ← GetAlias(y)activeMoves ← activeMoves / {m}frozenMoves ← frozenMoves ∪ {m}if v ∈ freezeWorklist ∧ NodeMoves(v) = {} thenfreezeWorklist ← freezeWorklist / {v}simplifyWorklist simplifyWorklist ∪ {v}procedure SelectSpill()let m ∈ spillWorklist selected using favorite heuristicNote: avoid choosing nodes that are the tiny live rangesresulting from the fetches of previously spilled registersspillWorklist ← spillWorklist / {m}simplifyWorklist ← simplifyWorklist ∪ {m}FreezeMoves(m)procedure AssignColors()while SelectStack not emptylet n = pop(SelectStack)okColors ← {0,...,K-1}forall w ∈ adjList[n]if GetAlias(w) ∈ (coloredNodes ∪ precolored) thenokColors ← okColors / {color[GetAlias(w)]}if okColors Dfg thenspilledNodes ← spilledNodes ∪ {n}elsecoloredNodes ← coloredNodes ∪ {n}let c ∈ okColorscolor[n] cforall n ∈ coalescedNodescolor[n] ← color[GetAlias(n)]procedure RewriteProgram()Allocate memory locations for each v ∈ spilledNodes,Create a new temporary vi for each definition and each use,In the program (instructions), insert a store after eachdefinition of a vi, a fetch before each use of a vi.Put all the vi into a set newTemps.spilledNodes ← {}initial ← coloredNodes ∪ coalescedNodes ∪ newTempscoloredNodes ← {}coalescedNodes ← {}We show a variant of the algorithm in which all coalesces are discarded if the program mustbe rewritten to incorporate spill fetches and stores.
For a faster algorithm, keep all thecoalesces found before the first call to SelectSpill and rewrite the program to eliminate thecoalesced move instructions and temporaries.204In principle, a heuristic could be used to select the freeze node; the Freeze shown above picksan arbitrary node from the freeze work list. But freezes are not common, and a selectionheuristic is unlikely to make a significant difference.11.5 REGISTER ALLOCATION FOR TREESRegister allocation for expression trees is much simpler than for arbitrary flow graphs. We donot need global dataflow analysis or interference graphs.
Suppose we have a tiled tree such asin Figure 9.2a. This tree has two trivial tiles, the TEMP nodes fp and i, which we assume arealready in registers rfp and ri . We wish to label the roots of the nontrivial tiles (the onescorresponding to instructions, i.e., 2, 4, 5, 6, 8) with registers from the list r1, r2,…, rk.Algorithm 11.9 traverses the tree in postorder, assigning a register to the root of each tile.With n initialized to zero, this algorithm applied to the root (tile 9) produces the allocation{tile2 ↦ r1, tile4 ↦ r2, tile5 ↦ r2, tile6 ↦ r1, tile8 ↦ r2, tile9 ↦ r1}.
The algorithm can becombined with Maximal Munch, since both algorithms are doing the same bottom-uptraversal.ALGORITHM 11.9: Simple register allocation on trees.function SimpleAlloc(t)for each nontrivial tile u that is a child of tSimpleAlloc(u)for each nontrivial tile u that is a child of tn ← n - 1n ← n + 1assign rn to hold the value at the root of tBut this algorithm will not always lead to an optimal allocation. Consider the following tree,where each tile is shown as a single node:The SimpleAlloc function will use three registers for this expression (as shown at left on thenext page), but by reordering the instructions we can do the computation using only tworegisters (as shown at right):r1 ← M[a] r1 ← M[b]r2 ← M[b] r2 ← M[c]r3 ← M[c] r1 ← r1 × r2r2 ← r2 × r3 r2 ← M[a]r1 ← r1 + r2 r1 ← r2 + r1Using dynamic programming, we can find the optimal ordering for the instructions.
The ideais to label each tile with the number of registers it needs during its evaluation. Suppose a tile t205has two nontrivial children uleft and uright that require n and m registers, respectively, for theirevaluation. If we evaluate uleft first, and hold its result in one register while we evaluate uright,then we have needed max(n, 1 + m) registers for the whole expression rooted at t. Conversely,if we evaluate uright first, then we need max(1 + n, m) registers. Clearly, if n > m, we shouldevaluate uleft first, and if n < m, we should evaluate uright first.
If n = m, we will need n + 1registers no matter which subexpression is evaluated first.Algorithm 11.10 labels each tile t with need[t], the number of registers needed to evaluate thesubtree rooted at t. It can be generalized to handle tiles with more than two children. MaximalMunch should identify - but not emit - the tiles, simultaneously with the labeling of Algorithm11.10.
The next pass emits Assem instructions for the tiles; wherever a tile has more than onechild, the subtrees must be emitted in decreasing order of register need.ALGORITHM 11.10: Sethi-Ullman labeling algorithm.function Label(t)for each tile u that is a child of tLabel(u)if t is trivialthen need[t] ← 0else if t has two children, uleft and urightthen if need[uleft] = need[uright]then need[t] ← 1 + need[uleft]else need[t] ← max(1, need[uleft], need[uright])else if t has one child, uthen need[t] ← max(1, need[u]else if t has no childrenthen need[t] ← 1Algorithm 11.10 can profitably be used in a compiler that uses graph-coloring registerallocation.
Emitting the subtrees in decreasing order of need will minimize the number ofsimultaneously live temporaries and reduce the number of spills.In a compiler without graph-coloring register allocation, Algorithm 11.10 is used as a pre-passto Algorithm 11.11, which assigns registers as the trees are emitted and also handles spillingcleanly. This takes care of register allocation for the internal nodes of expression trees;allocating registers for explicit TEMPsofthe Tree language would have to be done in someother way. In general, such a compiler would keep almost all program variables in the stackframe, so there would not be many of these explicit TEMPs to allocate.ALGORITHM 11.11: Sethi-Ullman register allocation for trees.function SethiUllman(t, n)if t has two children, uleft and urightif need[uleft] ≥ K and need[uright] ≥ KSethiUllman(uright, 0)n ← n - 1spill: emit instruction to store reg[uright]SethiUllman(uleft, 0)unspill: reg[uright] ← "r1"; emit instruction to fetch reg[uright]else if need[uleft] ≥ need[uright]SethiUllman(uleft, n)SethiUllman(uright, n + 1)206else need[uleft] < need[uright]SethiUllman(uright, n)SethiUllman(uleft, n)reg[t] ← "rn"emit OPER(instruction[t], reg[t], [ reg[uleft], reg[uright]])else if t has one child, uSethiUllman(u, n)reg[t] ← "rn"emit OPER(instruction[t], reg[t], [reg[u]])else if t is nontrivial but has no childrenreg[t] ← "rn"emit OPER(instruction[t], reg[t], [])else if t is a trivial node TEMP(ri)reg[t] ← "ri"PROGRAM GRAPH COLORINGImplement graph-coloring register allocation as two modules: Color, which does just thegraph coloring itself, and RegAlloc, which manages spilling and calls upon Color as asubroutine.
To keep things simple, do not implement spilling or coalescing; this simplifies thealgorithm considerably.package RegAlloc;public class RegAlloc implements Temp.TempMap {public Assem.InstrList instrs;public String tempMap(Temp temp);public RegAlloc(Frame.Frame f, Assem.InstrList il);}class Color implements TempMap {public TempList spills();public String tempMap(Temp t);public Color(InterferenceGraph ig,TempMap initial,TempList registers);}Given an interference graph, an initial allocation (precoloring) of some temporariesimposed by calling conventions, and a list of colors (registers), color produces anextension of the initial allocation. The resulting allocation assigns all temps used in theflow graph, making use of registers from the registers list.The initial allocation is the frame (which implements a TempMap describing precoloredtemporaries); the registers argument is just the list of all machine registers,Frame.registers (see page 251).
The registers in the initial allocation can also appear inthe registers argument to Color, since it's OK to use them to color other nodes as well.The result of Color is a TempMap (that is, Color implements TempMap) describing theregister allocation, along with a list of spills. The result of RegAlloc - if there were no spills is an identical TempMap, which can be used in final assembly-code emission as an argument toAssem.format.A better Color interface would have a spillCost argument that specifies the spilling cost ofeach temporary. This can be just the number of uses and defs, or better yet, uses and defs207weighted by occurrence in loops and nested loops.