1993. Interprocedural Optimization – Eliminating Unnecessary Recompilation (1157408), страница 2
Текст из файла (страница 2)
Section 3 introduces the threespecific kinds of interproceduraldata-flowinformationaddressed by ourwork. Section 4 describes the general recompilationframeworkand presentsthree instantiationsof the framework.Section 5 proposes a more elaborateinstantiationof the frameworkthat can lead to more precise recompilationtests. Section 6 discusses optimizationsthat directlyuse interproceduralfacts. Section 7 generalizes the work to deal with multipleprocedures in asingle compilationunit and to account for the effects of interproceduraloptimizations.Section 8 addresses the dual of the problem—predictingwhena recompilationmight be desirable to improve the results of optimization.Finally, Section 9 contains a summary and some conclusions.2.
COMPILATIONMODELTo simplify the remainderof this discussion, we will first present a model ofthe compilationsystem. The techniques that we describe in this paper areintended for use in a compiler that attempts both separate compilationandthe collection and use of interproceduraldata-flow information.Such a compiler must be structureddifferentlythan one that supports a traditionalACM Transactionson ProgrammingLanguages and Systems, Vol. 15.
No. 3, July 1993.370.M. Burke and L. TorczonrI“co:%::”””L“e”+DATABASEENTRY TOOLor TOOLSLsourceprogramsJ\/YFig. 1, Our compdationmodelseparatecompilationrequirements:scheme.Thedifferences(1) The compiler must have access to informationa program as it compiles each of them.arisefromtwoprincipalabout all the procedures(2) The compiler must have the ability to “retract”optimizations,fact, in response to changes in the interproceduralinformationused to justify them.inafter thethat wasTogether, these observationssuggest a new relationshipbetween compiler,source code, and programmer,depicted in Figure 1.The entry tool provides the programmerwith a means to create and modifysource text that resides in the system’s database.
The entry tool can beimplementedas a language sensitive editor or some combinationof editor,version control system, and compiler front-end. A similar path must allow theprogrammerto constructa representationof the program-arecipe thatspecifies how to bind together individualsource code modules to form a singleexecutable program. The recompilationtool creates an executable image forthis program. It uses informationfrom the database to determine what mustbe compiled and uses the compiler and linker as needed. The compiler hasdirect access to the database for interproceduralinformation,as well as theresults of previous compilations.We assume that the compiler translates onlyone procedure at a time; in Section 7, we show how to extend the recompilation tests to larger compilationunits.The techniques described in this paper have been designed to operate in asystem structuredlike our model.
The model is not overly restrictive.It canaccommodate an implementationin the context of a programmingenvironment—theR‘ programmingenvironmentfor FORTRAN is an example [10].ACM TransactIonson ProgrammingLanguages and Systems, Vol. 15, NO 3, July 1993InterproceduralOptimization:.371Similarly,an implementationwithin a more conventionallystructuredpiler can fit the model—thePTRAN analysis system for FORTRANexample [2].
Both of these systems implementrecompilationanalysistechniques described in this paper.comis anusing3. INTERPROCEDURALEliminatingUnnecessaryRecompilation.INFORMATIONFamiliaritywith interproceduraldata-flow informationis a prerequisitetounderstandingthe recompilationtests, so we begin with some background.Interproceduralinformationprovides the compiler with knowledge about therun-timeconditions under which a procedure will actually be invoked andabout the impact of executingother procedureson the run-timevaluesof variables in the procedure being compiled. We are concerned with threedistinct interproceduralphenomena: aliasing, side effects, and constant propagation.Aliasingoccurs when two or more names, at some point in a program, referto the same storage location.
Because an assignment actually modifies boththe name and all of its aliases, the compilerneeds reasonablypreciseinformationabout aliases.1 We limit our considerationto the interproceduralparametermechanism. For examaliases generated by the call-by-referenceple, when a variableu is passed by reference at a call site to a formalparameterx, u and x become aliases of each other if v is also visible insidealiases in procedure p ifthe called subroutine.Two variables are potentialthey refer to the same storage location in some execution instances of p (i.e.,the variables are aliased along some, but not necessarily all, execution pathsleading to p).
The compiler can compute, for each procedure p, a set ALIAS( p )containing those pairs of names that are potentiallyaliased in p [5A, 14]. Inthe absence of such information,the compiler must assume that all formalparametersand global variablesare potentiallyaliased. In practice, thiseliminates opportunitiesfor optimizationsinvolving those variables.Side-effectsummaryinformationdescribes the effects of executing a procedure call on the values of variables. At a call site, executing the body of thecalled procedure can both reference and change the values of individualvariables. Since the compiler relies on derived knowledge about the values ofvariablesto determinethe safety and profitabilityof optimizations,theimpact of a procedure call on the values of variables in the calling proceduremust be considered. The compiler uses this informationto sharpen its analysis withina single procedure.
In the absence of precise information,thecompiler must assume that the call both modifies and uses every variableavailable to it. Using such worst case assumptionsdecreases the accuracy ofthe data-flowinformationcomputed for the calling procedure,potentiallyinhibitingoptimizationwithin that procedure.1Strictly speaking, the FORTRAN standard permits the compiler to ignore aliasing. Thestandard contains a restriction that neither of the two aliases may be modified in a standardconformingprogram [5]. Nevertheless,many compilers attempt to trace aliasesbecauseinformation about potential aliasescan be useful as a diagnostic aid to the programmer and becausetheresulting systemsachievea higher level of predictability than the standard requires.ACM Transactionson ProgrammingLanguages and Systems, Vol.
15, No 3, July 1993.372.M, Burke and L. TorczonIn our model system, the compiler will annotateeach call site e in aprogram with two sets, MOD(e) and REF( e).2 The former contains all variablesthat might be modified as the result of executing e, while the latter containsall those variables whose values might be referenced as the result of executing e. For example, in traditionalavailable expression analysis, a procedurecall must be assumed to kill every expressioninvolvingeither a globalvariable or an actual parameter.
But if the compiler encounters an expressionv that is available immediatelybefore call site e and it determines that noneof the constituentvariables of v are in MOD(e), then it can assume safely thatu is still available after e.In large programs, informationis often passed between procedures in theform of constant-valuedactual parameters or global variables. This is particularly common in numericalprograms that incorporatemodules from standard libraries such as LINPACK[ 19], and in programs where the dimensionsof major data structures are stored in variables to simplify later modification.Interproceduralconstant propagationattempts to identify formal parametersand global variables that will have the same known constant value on eachinvocation of a procedure within a given program. Finding a precise solutionto the general constant propagationproblem is undecidable[22] and theusual approximateconstant propagationproblem is intractablein an interprocedural setting [23].
However, a useful subset of the complete and preciseset of interproceduralconstants can still be profitable for the optimizer. Thealgorithmsfor this problem proposed to date compute approximationsto thesets of constant-valuedparametersand global variables [7, 9, 32, 34]. In ourmodel system, the compiler computes, for each procedure p in the program, aofset CONSTANTS(P) of constants known to hold on entry to p. ElementsCONSTANTS(p) are pairs of the form (x, u), where x is the name of a formalparameter or global variable and v is its known constant value.As an example, consider the program fragment shown in Figure 2. Assuming that all of the relevant statements are shown, the aliasing and constantssets for its procedures would be:ProcedureALIASb00c{( X,P3)}aGONSTANTS0{(P2,17)]{(P4,17)}.The potential alias for procedure c arises when call site a passes the globalvariable x as an actual parameter.The constants come about from passing2 For consistencywith the rest of the literature on interprocedural data-flow analysis, we willcall this set REF, even though we have used the name USEin the past.
USEappearsin severalsourcesas the set of variables whose values can be read before modification. REF ignores theissue of whether or not a modification intervenes between the call site and the first use m acalled procedure,Thus, the REF set is inherently flow-insensitive, while the USEset is inherentlyflow-sensitme.ACM TransactIonson ProgrammmgLanguages and Systems, Vol. 15, No. 3, July 1993InterproceduralOptimization:program ainteger x,y, z,vl, v2common/global/x,y,z...Eliminatingsubroutine b (pl,integer pl, p2...callc(pl, p2)7:callC(X,Recompdatlon... .endendP:call““” b(vl,...373p3=p4*2.