1994. Compiler Transformations for High-Perforamce Computing, страница 12
Описание файла
PDF-файл из архива "1994. Compiler Transformations for High-Perforamce Computing", который расположен в категории "". Всё это находится в предмете "конструирование компиляторов" из 7 семестр, которые можно найти в файловом архиве МГУ им. Ломоносова. Не смотря на прямую связь этого архива с МГУ им. Ломоносова, его также можно найти и в других разделах. .
Просмотр PDF-файла онлайн
Текст 12 страницы из PDF
b+7call foo(a)(e) after dead variable eliminationFigure 40.Redundancy6.7.2 Useless-Codeelimination.EliminationUselesscode is often createdby otheroptimizations,likeunreachable-codeelimination.When the compilerdiscoversthat the value being computedby a state-ACMComputingSurveys,Vol. 26, No.
4, December1994384*DavidF. Baconet al.ment is not necessary,it can remove thecode. This can be done for any nonglobalvariablethat is not liue immediatelyafter the definingstatement.Live-variableanalysisis a well-knowndata-flowproblem [Aho et al. 1986]. In Figure 40(c), thevaluescomputedby theassignmentstatementsare no longer used; they havebeen eliminatedin (d).firstoperand[Ardenet al, 1962].example,the control expressioninIf ((a = 1) and (b = 2)) thenc=~end ifis known to be false if a does not equal 1,regardlessof the value of b.
Short circuiting would convert this expressionto:if (not (a = 1)) goto 10if (not (b = 2)) goto 10C=510 continue6.7.3 Dead-Var/ab/e EliminationAfter a series of transformations,particularlyloop optimizations,there are oftenvariableswhose value is never used. Theunnecessaryvariablesare calleddeadvariables;eliminatingthem is a commonoptimization[Aho et al.
1986].In Figure40(d), the variablesC, n, anddebug are no longerused and can beremoved;(e) shows the code afterthevariablesare pruned.Note that if any of the operandsin thebooleanexpressionhaveside-effects,short circuitingcan change the resultsofthe evaluation.The alterationmay ormay not be legal, dependingon the language semantics.Some languagedefinitions, includingC and many dialectsofLisp, address this problemby requiringshort circuitingof boolean expressions.6.8 Procedure6.7.4 Common-SubexpressionEliminationIn many cases, a set of computationswillcontainidenticalsubexpressions.The redundancycan arise in both user code andin address computationsgeneratedby thecompiler.The compilercan computethevalue of the subexpressiononce, store it,and reuse the stored result[Aho et al.1977; 1986; Cocke 1970].
Common-subexpressioneliminationis an importanttransformationand is almost universallyperformed.Whileit is generallya goodidea to performcommon-subexpressioneliminationwhereverpossible,the compiler must considerthe currentregisterpressureand the cost of recomputing.Ifstoringthe temporaryvalue(s)forces adthetransfer.ditionalspills to memory,mationcanactuallydeoptimizetheprogram.6.7.5 Short CircuitingShort circuitingis an optimizationthatcan be performedon boolean expressions.It is based on the observationthat thevalue of many binaryboolean operationscan be determinedfrom the value of theACMComputmgSurveys,VolFor26, No 4, December1994CallTransformationsThe optimizationsdescribedin the nextseveralsectionsattemptto reducetheoverheadof procedurecalls in one of fourways:●●●eeliminatingthe call entirely;eliminatingexecutionof the called procedure’s body;eliminatingsomeof theentry/exitoverhead;andavoidingsome steps in makinga procedurecall whenthe behavi~r;f thecalledprocedureis knownor can bealtered.6.8.1 A Calling Conventionfor S-DLXTo demonstratethe procedurecall optimization,we fh-st define a callingconventionfor S-DLX.Table3 shows howthe registersare used.In general,each calledprocedureisresponsiblefor ensuringthat the valuesinregistersRI 6 – R25 arepreservedacross the call.
The stack begins at thetop of memoryand growsdownward.Thereis no explicitframepointer;instead, the stack pointeris decrementedby the size s of the procedure’sframe atentry and left unchangedduringthe call.CompilerTable 3.NumberUsageROAlwayszero; writesReturnvalue when returningR2. .R7The firstare ignoredfrom a proceduresix words of the argumentsR8.
.R158 caller save registers,R16. .R2510 callee save registers.R26 . . R29ReservedStack pointerReturnaddress duringby calleea procedurecallThe first four floatingpointarguments14 caller save floatingpointregistersF18. .F3114 callee save floatingpointregistersThe valueR30 + s serves as a uirtualframe pointerthat points to the base ofthe stack frame,avoidingthe use of asecond dedicatedregister.For languagesthat cannot predictthe amountof stackspace used duringexecutionof a procedure, an additionalgeneral-purposeregister can be used as a frame pointer,orthe size of the frame can be stored at thetop of the stack (R30 + O).On enteringa procedure,the returnaddress is in R31.
The first six words ofthe procedureargumentsappear in registers R2 – R7, and the rest of the argumentdata is on the stack.Figure41shows the layout of the stack frame for aprocedureinvocation.A similarconventionis followedforfloating-pointregisters,except that onlyfour are reservedfor arguments.Executionof a procedureconsists of sixsteps:for theof registersthat will be(2) The valuesmodifiedduringprocedureexecution(and that must be preservedacrossthe call) are saved on the stack. If theproceduremakes any procedurecallsitself,the saved registersshouldinclude the returnaddress, R31.body is executed.(3) The procedureis stored inwere savedacross a call.systemFO. .F3value (if any)(4) The returnRI, and the registersthatin step 2 are restored.registerscallThese registers must be preservedF4. .F17on the stack(1) Space is allocatedprocedureinvocation.callto the procedureused as temporaryfor use by the operatingR303%5●S-DLX Registers and Their UsageRiR31Transformationsto the procedure(5) The frame(6) ~~~dsCallingcess:callis removedis transferreda procedureisfromthe stack.to thereturna four-steppro-of any of the registers(1) The valuesR1 – RI 5 that containlive values aresaved.If the valuesof any globalvariablesthat mightbe used by thecallee are in a registerand have beenmodified,the copy of those variablesin memoryis updated.are stored in the des(2) The argumentsignatedregistersand, if necessary,on the stack.is made to the target(3) A linked jumpprocedure;the CPU leavesthe address of the next instructionin R31.the saved registersare(4) Upon return,restored,and the registersholdingglobal variablesare reloaded.To demonstratethe structureof a procedure and the calling convention,Figure42 shows a simple functionand its compiledcode.
The function(foo) and thefunctionthat it calls (max) each take twointegerarguments,so they do not need topass argumentson the stack. The stackframe for foo is three words, whichareused to save the returnaddress(R31 )and registerR 16 duringthe call to max,and to hold the local variabled. R31ACM ComputingSurveys,Vol26, No. 4, December1994386*DavidF. Baconet al.high addressescaller’sstackfrarrtearg nar’z 1locals and temporariessavedregisters1frame sizearguments for calledproceduresSp --.+low addresses4stack grows downFigure 41.Stackmust be preservedbecause it is overwritten by the jump-and-link(JAL) instruction; RI 6 must be preservedbecause it isused to hold c across the call.The procedurefirstallocatesthe 12bytes for the stack frame and saves R31and R 16; then the parametersare loadedinto registers.The valueof d is calculated in the temporaryregisterR9.
Thenthe addresses of the argumentsare storedin the argumentregisters,and a jump tomax is made. On returnfrom max, thereturnvalue has been computedinto thereturnvalue register(Rl ). Afterremoving the stack frameand restoringthesaved registers,the procedure jumps backto its caller throughR31.6.8.2Leaf ProcedureOptimizationA leaf procedureis one that does not callany otherprocedures;the name comesfrom the fact that these proceduresareleaves in the static call graph. The simplest optimizationfor leaf proceduresisthat they do not need to save and restoreACMComputmgSurveys,Vol. 26, No.
4, December1994framelayoutthe returnaddress (R31 ). Additionally,ifthe proceduredoes not have any localvariablesallocatedto memory,the compilerdoes not need to createa stackframe.Figure43(a) shows the functionmax(called by the previousexamplefunctionfoo), its originalcompiledcode (b), andthe code after leaf procedureoptimization (c).
Aftereliminatingthe save/restore of R31, there is no need to allocatea stack frame. Eliminatingthe code thatdeallocatesthe framealso allowsthefunctionto returndirectlyif x < y.6.8.3 Cross-Call Register AllocationSeparatecompilationreduces the amountof informationavailableto the compilerabout called procedures.However,whenboth callee and caller are available,thecompilercan take advantageof the register usage of the callee to optimizethecall.If the callee does not need (or can berestrictednot to use) all the temporaryCompilerintegerfunctionfoo(c,b)integarfunctionx,integerc,bintegerintegerd,eif(x> y)e = max(b,max(x,y)ythenelsed)max= e+creturnendendreturn(a)sourcecode for function= Yifendfoo(a)foo:387●max=xd = c+bfooTransformationsSUBIR30,Sw8(R30),#12Sw4(R30),LWR16,LWR8,ADD;adjustR31R16(R2)Sw(R30),R2,R8R16;R9=d=c+b;saveR9R3R3,JALmaxR30,#OADDRI,LWR16,LWR31,8(R30)ADDIR30,#12JRR31;arg2=addr(d);callRl,d;argl=addr(b)ADDIR164(R30)(b) compiledFigure42.max:; R8=bR16,MOV;saveFunctionfoocodefor functionrnaxSPretaddr;R16=C(R3)R9,;eavesourcemax;Rl=e;Rl=e+cR30,Sw(R30),LWR8,(R2)LHR9,(R3)SGTRIO,R8,BEQZR1O,ElseMOVRl,JRetElse:MOVRl,Ret:R31,LWR16ADDI R30,;restoreretaddrJR;restoreSP;restore;adjustSUBI#4retaddr; R8=X; R9=yR9;RIO=(X;X> y)<= yR8; rnax=xR9; max=y(R30);restoreR31#4;restoreSP; returnR31(b)SP;saveR31originalcompiledcodeof max; returncodefor fooanditsmax:compiledcode.LWR8,(R2); R8=XLWR9, (R3)R1O, R8,; R9=ySGTR9 ;R1O=(X > y);X <= yBEQZ R1O, Elseregisters(R8–R15),the caller can leavevalues in the unusedregistersthroughout executionof the callee.