1994. Compiler Transformations for High-Performance Computing, page 3
…that redundant loads are 0. Qc is an analogous quantity that measures reuse of a cache line in memory. It is the ratio of the number of words that are read out of the cache from that particular line to the number of times the cache fetches that line from memory.

3.2 Model Architectures

In the Appendix we present a series of model architectures that we will use throughout this survey to demonstrate the effect of various compiler transformations.
The architectures include a superscalar CPU (S-DLX), a vector CPU (V-DLX), a shared-memory multiprocessor (sMX), and a distributed-memory multiprocessor (dMX). We assume that the reader is familiar with basic principles of modern computer architecture, including RISC design, pipelining, caching, and instruction-level parallelism. Our generic architectures are based on DLX, an idealized RISC architecture introduced by Hennessy and Patterson [1990].

4. COMPILER ORGANIZATION

Figure 2 shows the design of a hypothetical compiler for a superscalar or vector machine. It includes most of the general-purpose transformations covered in this survey. Because compilers for parallel machines are still a very active area of research, we have excluded these machines from the design. The purpose of this compiler design is to give the reader (1) an idea of how the various types of transformations fit together and (2) some of the ordering issues that arise.
This organization is by no means definitive: different architectures dictate different designs, and opinions differ on how best to order the transformations. Optimization takes place in three distinct phases, corresponding to three different representations of the program: high-level intermediate language, low-level intermediate language, and object code.

ACM Computing Surveys, Vol. 26, No. 4, December 1994

First, optimizations are applied to a high-level intermediate language (HIL) that is semantically very close to the original source program. Because all the semantic information of the source program is available, higher-level transformations are easier to apply. For instance, array references are clearly distinguishable, instead of being a sequence of low-level address calculations. Next, the program is translated to low-level intermediate form (LIL), essentially an abstract machine language, and optimized at the LIL level.
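The HIL/LIL distinction can be sketched with a toy lowering pass. The tuple IR and the `lower` function below are our own invention for illustration (the survey does not define a concrete IR); the sketch assumes a Fortran-style column-major array with a known number of rows.

```python
# Toy IRs: at the HIL level an array reference is a single node; at the
# LIL level it becomes explicit address arithmetic.
def lower(ref, n_rows):
    """Lower ('array_ref', name, i, j) to column-major address arithmetic:
    base(name) + (j - 1) * n_rows + (i - 1)."""
    tag, name, i, j = ref
    assert tag == "array_ref"
    return ("load",
            ("add", ("base", name),
                    ("add", ("mul", ("sub", j, 1), n_rows),
                            ("sub", i, 1))))

hil = ("array_ref", "a", 5, 3)          # a[5, 3], clearly an array reference
lil = lower(hil, n_rows=10)
# After lowering, the subscripts are no longer visible as such; only the
# address computation remains:
assert lil[1][2][1] == ("mul", ("sub", 3, 1), 10)
```

Keeping the HIL node intact is what lets high-level transformations recognize array references at all; once lowered, the same information must be recovered from the address arithmetic.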
Address computations provide a major source of potential optimization at this level. For instance, two references to a[5, 3] and a[7, 3] both require the computation of the address of column 3 of array a. Finally, the program is translated to object code, and machine-specific optimizations are performed. Some optimizations, like code co-location, rely on profile information obtained by running the program. These optimizations are often implemented as binary-to-binary translations.

The high-level optimization phase begins by doing all of the large-scale restructuring that will significantly change the organization of the program. This includes various procedure-restructuring optimizations, as well as scalarization, which converts data-parallel constructs into loops.
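The shared column-address computation in the a[5, 3] / a[7, 3] example can be made concrete. This is a minimal sketch assuming column-major (Fortran-style) layout; `offset` is an invented helper, not part of any compiler.

```python
def offset(i, j, n_rows):
    # Column-major (Fortran-style) offset of element (i, j): every
    # reference in column j shares the subterm (j - 1) * n_rows.
    return (j - 1) * n_rows + (i - 1)

n = 10
# Naive LIL code recomputes the column base for each reference:
o1 = offset(5, 3, n)        # a[5, 3]
o2 = offset(7, 3, n)        # a[7, 3]

# After common-subexpression elimination, (3 - 1) * n is computed once
# and both addresses become cheap additions off the shared base:
col3 = (3 - 1) * n
assert (o1, o2) == (col3 + 4, col3 + 6)
```

The saving grows with the number of references into the same column, which is why address arithmetic is such a rich source of LIL-level redundancy.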
Doing this restructuring at the beginning allows the rest of the compiler to work on individual procedures in relative isolation. Next, high-level data-flow optimization, partial evaluation, and redundancy elimination are performed. These optimizations simplify the program as much as possible, and remove extraneous code that could reduce the effectiveness of subsequent analysis.

The main focus in compilers for high-performance architectures is loop optimization. A sequence of transformations is performed to convert the loops into a form that is more amenable to optimization: where possible, perfect loop nests are created; subscript expressions are rewritten in terms of the induction variables; and so on. Then the loop iterations are reordered to maximize parallelism and locality. A postprocessing phase organizes the resulting loops to minimize loop overhead.

[Figure 2. Organization of a hypothetical optimizing compiler. The phases shown are: high-level dataflow optimization, partial evaluation, and redundancy elimination; loop preparation, dependence vectors, and loop reordering (interchange, skewing, reversal, tiling, coalescing, strip mining, unrolling, cycle shrinking); low-level IL generation with low-level dataflow and redundancy optimization; code generation with micro-optimization; and instruction-cache optimization (code co-location, displacement minimization, cache conflict avoidance), producing the final optimized executable.]

After the loop optimization, the program is converted to low-level intermediate language (LIL). Many of the data-flow and redundancy elimination optimizations that were applied to the HIL are reapplied to the LIL in order to eliminate inefficiencies in the component expressions generated from the high-level constructs. Additionally, induction variable optimizations are performed, which are often important for high performance on uniprocessor machines. A final LIL optimization pass applies procedure call optimizations.

Code generation converts LIL into assembly language. This phase of the compiler is responsible for instruction selection, instruction scheduling, and register allocation. The compiler applies low-level optimizations during this phase to improve further the performance of the code. Finally, object code is generated by the assembler, and the object files are linked into an executable. After profiling, profile-based cache optimization can be applied to the executable program itself.

5. DEPENDENCE ANALYSIS

Among the various forms of analysis used by optimizing compilers, the one we rely on most heavily in this survey is dependence analysis [Banerjee 1988b; Wolfe 1989b]. This section introduces dependence analysis, its terminology, and the underlying theory.

A dependence is a relationship between two computations that places constraints on their execution order. Dependence analysis identifies these constraints, which are then used to determine whether a particular transformation can be applied without changing the semantics of the computation.

5.1 Types of Dependence

There are two kinds of dependence: control dependence and data dependence. There is a control dependence between statement 1 and statement 2, written S1 → S2, when statement S1 determines whether S2 will be executed. For example:

1   if (a = 3) then
2       b = 10
    end if

Two statements have a data dependence if they cannot be executed simultaneously due to conflicting uses of the same variable. There are three types of data dependence: flow dependence (also called true dependence), antidependence, and output dependence. S4 has a flow dependence on S3 (denoted by S3 → S4) when S3 must be executed first because it writes a value that is read by S4.
For example:

3   a = c * 10
4   d = 2 * a + c

S6 has an antidependence on S5 (denoted by S5 →ᵃ S6) when S6 writes a variable that is read by S5:

5   e = f * 4 + g
6   g = 2 * h

An antidependence does not constrain execution as tightly as a flow dependence. As before, the code will execute correctly if S6 is delayed until after S5 completes. An alternative solution is to use two memory locations g5 and g6 to hold the values read in S5 and written in S6, respectively. If the write by S6 completes first, the old value will still be available in g5.

An output dependence holds when both statements write the same variable:

7   a = b * c
8   a = d + e

We denote this condition by writing S7 →ᵒ S8. Again, as with an antidependence, storage replication can allow the statements to execute concurrently. In this short example there is no intervening use of a and no control transfer between the two assignments, so the computation in S7 is redundant and can actually be eliminated.

The fourth possible relationship, called an input dependence, holds when two accesses to the same memory location are both reads. Although an input dependence imposes no ordering constraints, the compiler can make note of it for the purposes of optimizing data placement on multiprocessors. We denote an unspecified type of dependence by S1 ⇒ S2.
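The four dependence classes follow mechanically from the read and write sets of two statements. The sketch below is illustrative only (the function and its names are ours): it ignores control flow and array subscripts, which real dependence analysis must handle.

```python
def classify(writes1, reads1, writes2, reads2):
    """Classify the dependences from S1 to S2, given each statement's
    sets of written and read variables."""
    deps = set()
    if writes1 & reads2:    # S1 writes a value that S2 reads
        deps.add("flow")
    if reads1 & writes2:    # S2 overwrites a value that S1 reads
        deps.add("anti")
    if writes1 & writes2:   # both write the same variable
        deps.add("output")
    if reads1 & reads2:     # both read the same variable
        deps.add("input")
    return deps

# S3: a = c * 10       S4: d = 2 * a + c
assert classify({"a"}, {"c"}, {"d"}, {"a", "c"}) == {"flow", "input"}
# S5: e = f * 4 + g    S6: g = 2 * h
assert classify({"e"}, {"f", "g"}, {"g"}, {"h"}) == {"anti"}
# S7: a = b * c        S8: a = d + e
assert classify({"a"}, {"b", "c"}, {"a"}, {"d", "e"}) == {"output"}
```

Note that a single pair of statements can carry several dependence types at once, as the first example shows: S4 both reads the a written by S3 (flow) and shares the read of c (input).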