1993. Interprocedural Optimization – Eliminating Unnecessary Recompilation (1157408), страница 8
Текст из файла (страница 8)
Hence, only the call sites between el and ez need to be recorded to insuresafety. Stated in more general terms, for a fact in both X and Y, its presencein Y is irrelevantbecause the occurrence in X happens on the path to theoccurrence in Y. If the operators in the standard data-flow equations weremore descriptive, we could merely state that we compute the a calls set for aACM Transactions on Programming Languages and Systems, Vol. 15, No 3, July 1993.390.M, Burke and L.
Torczonfact at any point by doing a natural union of the a calls sets from all of thepaths contributingto that fact.The table in Figure 3 summarizesthe data-flow informationused in ourexample optimizations.Global common subexpressioneliminationand codehoisting clearly depend on all-paths information.The AVAIL and VERYBUSYinformationthat we compute for these optimizationsis all-paths information.Even optimizationsthat, at first glance, appear to depend on any-pathinformationactually depend on all-paths information.Global constant propagation uses REACH information.REACH is any-path information,but globalconstant propagationdepends on an augmented form of this information.Itcomputes all-paths information—thedefinitionreaches point p with knownconstant value c.
For a constant c to be folded at p, the same constant valuec must reach p along all paths through the procedure leading to p. Finally,register store eliminationis usually based on any-pathLIVE information.However, the informationthat is actually used by this optimizationis theall-pathsLIVE informationdiscussed in Section 5.3. This is true of otheroptimizations,like dead-code elimination,that are based on the converse ofany-paths information.Our examples illustratethat optimizationsbased on global data-flow information are either based on all-paths information(type I) or the converse ofany-paths information(type II).
In either case, the informationactually usedby the optimizationis all-pathsinformation.This follows from the simpleobservation that, along all paths through the program, the optimizationmustpreserve the program’s semantics. Thus, the correctness of the optimizationisbased on the behavior of the program along all paths (and, therefore, on themeet-over-all-pathssolution for some data-flow problem) [26, 29].The informationthat we compute for CALLSBETWEEN is any-path information because optimizationsare based on all-paths information.That is, if anevent along any path to the optimizationis invalidated,the optimizationitself is invalidatedbecause it relied on all-paths information.The any-pathinformationthat we compute for recompilationanalysis leads to a precise testfor recompilationdue to changes in interproceduralinformationbecause itallows us to detect if any path between an event and an optimizationthatdepends upon that event is broken.
Since optimizationsrely on the fact thatnone of these paths are broken, we know that recompilationis necessary ifany of the paths are broken.g5.5 ComplexityAdding the computation for CALLSBETWEEN informationto the global data-flowanalysis phase increases the time and space that are required to compute the—9Unless, of course, the editing change to the program makes no real difference m the valuesbeing passedaround. Consider adding an assignment to someprocedurethat enlarges the MOI)set but doesnot changethe values of any variable on return from the procedure.
If we assign avariable its known constant value, we really do not invalidatethe applicationof a constant fold,but the MoD-based test will dictate recompilation.This is another example of the limit ofprecision—theanalog of the “up to symbolic evaluation”condition that Barth gave for summaryinformation.ACM TransactIonson ProgrammmgLanguages and Systems, Vol.
15, No. 3, July 1993InterproceduralOptimization:EliminatingUnnecessaryRecompilation..391global data-flow informationby a factor of 0(p) where p is the number ofcall sites in the procedure. The additionalspace is used to store, with everydata-flowfact in every basic block in the program,the set of call sitesassociated with that fact. If the set of call sites is stored as a bit vector, eachset requires a bit vector of length p. In effect, we have a k by p bit matrix,where k is the number of data-flow facts.Additionaltime is needed to update the set of call sites associated with thedata-flowfacts.
To update the call sites informationduring the data-flowcomputation,we compute, for each call site in the bit matrix, those facts thatrely on interproceduralinformationprovided by that call site. This computation requires a constant number of bit vector operationson bit vectors oflength k for each of the p call sites in the procedure. Hence, the timerequired to compute global data-flow informationis 0( p17d(G)) for reduciblefor nonreduciblegraphs, where E is the number ofgraphs and 0( PEN)edges, N is the number of basic blocks in the flow graph of the procedure,and d(G) is the loop-connectednessof the graph as defined by Kam andUnman [21].Since nested procedures occur inside a single compilationunit, an optimization that saves both time and space is possible.
A clever implementationcan capitalize on the fact that variables not visible to the calling procedureneed not be representedin the CALLSBETWEEN set. This is safe becausechanging the visibilityof variables inside a procedure requires an editingchange—an act that mandates its recompilation.5.6 GeneralizationExaminingour four sample optimizationsleads to the followinggeneralalgorithmfor constructingprecise annotationsets. The compiler assigns theannotationsets values that would never mandate recompilationand thenadjusts the sets to reflect each transformation,as applied.
The sets get thefollowing initial values:(1)= ALLVARS x ALLVARS,MayBeAlias(p)(2) MayMod(e)= ALLVARS, for each call site e in p,(3) MczyRe~(e) = ALLVARS, for each call site e in p, and(4) MustBeConstant(p) = 0.Whenever an interproceduralfact is used to justify the safety of an optimizaMayMod,tion, the appropriateset is adjusted, subtractingfrom MayBeAlias,or MayRef,or adding to MustBeConstant.The constructionof MustBeConstantwas described in Section 5. By considand MayReffor the four example optiering the computationof MayModMayModmizations,we can develop a general strategy toward computingand MayRefsets with respect to optimizationbased on global data-flowinformation.We distinguishbetween two respects in which an additionto MOD canchange global data-flowinformation.First, it contributesa new definitionthat reaches certain points in the program.
This adds definitionsto REACHACM Transactionson ProgrammingLanguages and Systems, Vol. 15, No 3, July 1993.392.M Burke and L. Torczonsets and can affect all-paths informationthat is related to REACH information.MayModsets for global constant propagationOur discussion of updatingillustratesthe general strategyfor accommodatingthis kind of impact.Second, it can affect the reaching, exposure, and availabilitycharacteristicsofother definitions,uses, and expressions, respectively (i.e., it can kill them).
Inthe same manner that MOD definitionspreserve the REACH characteristicsofother definitions,they preserve any-pathglobal data-flowinformationingeneral. Thus, this latter impact is only importantwith respect to all-pathsMayModsets for common subexinformation.Our discussion of updatingpression eliminationand code hoisting illustratesthe accommodationof thiskind of impact.As with MOD and REF information,ALIAS informationis factored into globaldata-flow information.To determine when a procedure p needs to be recompiled due to changes in ALIAS(p), it is necessary to track which facts thecompiler indirectlyused when performingoptimization.This can be accomplished by annotatingdata-flow facts in a manner analogous to the one usedfor MOD and REF information.When a new pair is added to ALIAS(p), definitionpoints for one member ofthe alias pair become definitionpoints for the other member of the pair.Likewise,uses of one member of the alias pair become uses of the othermember of the pair.
Because of this, adding a new pair to ALIAS(p) caninvalidateoptimizationsby effectivelyadding definitionsand uses to theroutine. Optimizationsbased on the type of data-flow informationdescribedin Section 5.2 can be invalidatedby adding definitions;optimizationsbasedon the type of data-flow informationdescribed in Section 5.3 can be invalidated by adding uses.To precisely determine if an alias pair should not appear in MayBeAlias(p),the compiler computes either DEFINEDBETWEEN or USEDBETWEEN information for each data-flow fact that involves either a global variable or a formalparameter.DEFINEDBETWEEN ( a, b) contains those global variables and formal parametersdefined on paths contributingto the correctness of data-flowfact a in basic block b; DEFINEDBETWEEN is computed when the data-flowinformationdescribed in Section 5.2 is employed.
USEDBETWEEN ( a, b) contains those global variables and formal parametersused on paths contributing to the correctness of data-flow fact a in basic block b; USEDBETWEEN iscomputed when the data-flowinformationdescribed in Section 5.3 is employed. Given the DEFINEDBETWEEN or USEDBETWEEN informationfor aparticulardata-flow fact, the compiler can determine whether or not an aliasp ) as follows:pair can safely appear in MayBeAlias((1) Initially,let MayBeAlias(p) = ALLVARS x ALLVARS, the set of all pairs ofglobal variables and formal parameters in p.(2) Whenever the compiler uses data-flowp) all alias pairsfrom MayBeAlias(fact a in block b, it must removeconsistingof a global variableorformal parameterreferenced in a and a variable in either DEFINEDBE TWEEN( a, b) or USEDBETWEEN ( a, b).ACM Transactionson ProgrammingLanguages and Systems, Vol.
15, No 3. July 1993InterproceduralOptimization:EllmlnatingUnnecessaryRecompilation..393Since the paths that lead to the correctness of a are the same as the pathsthe approach used forinvolvedin the CALLSBEWTEEN( a, b) computation,computing CALLSBETWEEN informationcan be used to compute DEFINEDBE TWEEN and USEDBETWEEN information.The definitionsfor the operatorspresented in Section 5.1 are used for the computation;geiz[ b] and nkill[ b]are computed as described for CALLSBETWEEN information,except that definitions and uses of global variables and formal parameters are tracked insteadof call sites.This section showed an approach for computing more precise recompilationinformationfor changes in CONSTANTS,MOD, REF, and ALIAS sets.