MIT 22 Data ow analysis (798428), страница 2
Текст из файла (страница 2)
Such a set might look like {x1 = y1 , x2 = y2 , . . . , xn = yn }. In addition, the y 0 s arevariables that were assigned their values earlier than the corresponding x0 s. We can use a set of equalitieslike this to determine whether two variables are definitely known to be equal, which is the informationcopy propagation needs.14.3Dataflow equationsAn equation only holds on entry to a node if it holds on exit from all predecessor nodes.
Therefore, we havethe following equation:\in(n) =out(n0 )n0 ≺nAn equation holds on exit from a node n if either it is established by the node (we use gen(n) to representthe equalities introduced by node n), or if the equation held on entry to the node, but the node makes theequation untrue (we use kill (n) to represent such equalities).
The equation for out(n) is therefore:out(n) = gen(n) ∪ (in(n) − kill (n))The functions gen() and kill () are defined by the following table:1 A better but more complicated representation is to keep track of the equivalence classes of variables, which allows more equalitiesto be proved. This can be done by ensuring that none of the variables yi appears on the LHS of an equality. Then each variable yistands for the equivalence class of variables that are known to be copies; path compression is used to support efficient updating andtesting of equality. This approach is equivalent to value numbering.4nx=yx = ewhere e 6= y[e1 ] = e2if eSTARTEXIT4.4gen(n){x = y}∅∅∅∅∅ or rvkill (n)x = z, z = x for all zx = z, z = x∅∅∅∅Worklist algorithmWe can see that available copies is a forward analysis because the value at entry to a node depends on thepredecessor nodes.
We update the worklist algorithm given earlier by using predecessors where it usedsuccessors, and vice versa, and by using out(n) where it used in(n):1. Initialize the worklist to contain all nodes.2. Initialize the value of out(n) to some initial value (for available copies variable analysis, the set of allpossible equalities).3. While the worklist contains some node n:• Remove n from the worklist.• Set the value of out(n) using the dataflow equation:\in(n) :=out(n0 )n0 ≺nout(n) := gen(n) ∪ (in(n) − kill (n))• Push all successors of n onto the worklist.This analysis has the same monotonicity property as live variable analysis, but the space of possiblevalues is larger. Therefore the algorithm terminates but can take asymptotically longer.5 Dataflow analysis frameworkWe can characterize both of these analyses within a common framework.
This has the advantage that itwill allow us to quickly decide whether a given analysis is guaranteed to complete, what its complexity is,and whether it always computes the best solution possible. In addition, a common framework for dataflowanalysis enables us to implement a general algorithm for doing analyses in the compiler, rather than reimplementing each analysis from scratch.A dataflow analysis framework has four components: the direction of analysis (forward or backward),the values being propagated, transfer functions for each of the nodes, and a meet operator.5.1Dataflow valuesA key component of a dataflow analysis framework is the set of values that the analysis is computing on.In live variable analysis, the values were themselves sets of variables. In available copies, the values weresets of equalities.
Let L be the set of all values that can be assigned to a program point. We will use ` todenote a single value contained in L.What do dataflow values mean? We can usefully think of them as representing some proposition thatmust hold at the program point they are attached to. (We can also think of them as representing propositionsthat may hold at the program point, but this is just the same thing as saying the negation of the proposition5l1ll2l3l1l2l3⊓liF(l)⊓ liTransfer functionF(⊓li)Meet operatorA node in a CFGFigure 2: Dataflow analysis framework componentsmust hold.) For example, in live variable analysis, the meaning of the set of variables attached to a programpoint is that all the variables in the set must be live at that point.In the space of values there is one value that conveys the greatest possible information.
We will denotethis value by the symbol >. In live variable analysis, the greatest information is conveyed by the empty set∅. It means no variables are live, and therefore that the entire program is dead code.5.2Transfer functionsThe second part of a dataflow analysis is a set of transfer functions that describe how dataflow values aretransformed by a node. For a forward analysis, the transfer function defines how out(n) depends on in(n).For a backward analysis, it’s the reverse.5.3Meet operatorThe final part of a dataflow analysis which defines how to combine values from multiple incoming edges.As depicted in the figure, supposing we have propositions corresponding to l1 , l2 , and l3 on various edges.At a common program point where all these edges meet, we don’t know which edge we came in on, so atmost we know the disjunction (or) of the three propositions.
The meet operator u defines how to construct thevalue corresponding to this disjunction. For live variable analysis, the meet operator is union; for availablecopies analysis, it is intersection.5.4Worklist algorithmWe can now see that both versions of the worklist algorithm seen so far are just instances of a more generalalgorithm. We are trying to solve for the dataflow value for each node n.
Without loss of generality, wewill consider forward analysis (for backward analysis, just turn all the arrows around). When we start theworklist algorithm, we initialize out(n) to > for each n. The dataflow equation that is applied to updateout(n) at each iteration is out(n) = F (un0 ≺n out(n0 )).6 SummaryWe’ve seen that a dataflow analysis framework can be characterized as a four-tuple (D, L, u, F ): the direction of analysis D, the space of values L, transfer functions Fn , and the meet operator u.
We’re not yetguaranteed that the worklist algorithm works, however.Next time we’ll show that with some reasonable conditions on L, Fn , and u, the worklist algorithm iscorrect and efficient, and often (but not always) computes the best possible answer to the equations.6.