MIT 20 Introduction to Optimization (slides) (798426)
Текст из файла
Optimization• Next topic: how to generate better codethrough optimization.• This course covers the most valuable andstraightforward optimizations – muchmore to learn!CS 4120Introduction to CompilersAndrew MyersCornell University– Other sources:Lecture 20: Introduction to Optimization12 Oct 11• Muchnick has 10 chapters of optimizationtechniques• Cooper and Torczon also cover optimizationCS 4120 Introduction to CompilersHow fast can you go?100001000Goal of optimization• Help programmersdirect source code interpreters– clean, modular, high-level source code– but compile to assembly-code performancetokenized program interpreters (BASIC, Tcl)AST interpreters (CS 3110 RCL, Perl 4)10010• Optimizations are code transformationsbytecode interpreters (Java, Perl 5, OCaml)– can’t change meaning of program to behavior notallowed by source.call-threaded interpreterspointer-threaded interpreters (FORTH)simple code generation (PA4, JIT)1• Different kinds of optimization:register allocation naive assembly codelocal optimizationexpert assembly code global optimization– space optimization: reduce memory use– time optimization: reduce execution time– power optimization: reduce power usage0.1CS 4120 Introduction to Compilers23CS 4120 Introduction to Compilers4Where to optimize?Why do we need optimization?• Programmers may write suboptimal code to make itclearer.• High-level language may make avoiding redundantcomputation inconvenient or impossiblea[i][j] = a[i][j] + 1• Architectural independence.• Modern architectures make it hard to optimize by hand.• Usual goal: improve time performance• Problem: many optimizations trade off space versustime.• Example: loop unrolling replaces a loop body with Ncopies.– Increasing code space slows program down a little, speeds upone loop– Frequently executed code with long loops: space/time tradeoffis generally a win– Infrequently executed code: optimize code space at expense oftime, saving instruction cache space– Complex optimizations may never pay off!• Focus of optimization: program hot spotsCS 4120 Introduction to Compilers5Safety1.
Pick the right algorithms and datastructures: design for locality and fewoperations2. Turn on optimization and profile to figureout program hot spots.3. Evaluate whether design works; if so…4. Tweak source code until optimizer does“the right thing” to machine codewhile (b) {z = y/x; // x, y not assigned in loop…}• Transformation: invariant code out of loop:Preserves meaning?Faster?Three aspects of an optimization:• the code transformation• safety of transformation• performance improvementCS 4120 Introduction to Compilers6Writing fast programs in practice• Possible opportunity for loop-invariant code motion:z = y/x;while (b) {…}CS 4120 Introduction to Compilers– understanding optimizers helps!7CS 4120 Introduction to Compilers8Structure of an optimizationWhen to apply optimization• Optimization is a code transformation• Applied at some stage of compiler (HIR,MIR, LIR)• In general requires some analysis:IR– safety analysis to determine wheretransformation does not change meaning(e.g.
live variable analysis)– cost analysis to determine where it ought tospeed up code (e.g., which variable to spill)CS 4120 Introduction to Compilerst2[bp–4][bp-8]t3t4mov eax, ebxadd eax, [ebp-4]mov ebx, [ebp–8]AbstractAssemblyLIRAssemblyCS 4120 Introduction to Compilers10• Idea: if operands are known at compile time,evaluate at compile time when possible.int x = (2 + 3)*4*y;int x = 5*4*y;int x = 20*y;• Can perform at every stage of compilationcmp eax, ebx– Constant expressions are created by translation andby optimizationNeed to reuse registers aggressively (e.g., ebx)a[2]• Coalesce registers (t3, t4) to eliminate mov’s• May be impossible without spilling some temporaries to stackCS 4120 Introduction to CompilersCanonicalIRInliningSpecializationConstant foldingConstant propagationValue numberingDead code eliminationLoop-invariant code motionCommon sub-expression eliminationStrength reductionConstant folding & propagationBranch prediction/optimizationRegister allocationLoop unrollingCache optimizationPeephole optimizationsConstant folding• Goal: convert abstract assembly (infinite no.
of registers) intoreal assembly (6 registers)t1,t1,t3,t4,t1,MIR9Register allocationmovaddmovmovcmpASTHIR11MEM(MEM(a) + 2*4)MEM(MEM(a) + 8)CS 4120 Introduction to Compilers12Algebraic simplificationConstant folding conditionalsif (true) SSif (false) S;• More general form of constant folding: take advantage of simplificationrulesif (true) S else S’Sif (false) S else S’S’while (false) Sif (2 > 3) Sa*1 aa+0 ab | false b(a + 1) + 2;if (false) S0b & truea + (1 + 2)a*4a shl 2a*7(a shl 3) − aa / 32767a+3a shr 15 + a shr 30bidentitiesreassociationstrength reduction• Must be careful with floating point and with overflow - algebraicidentities may give wrong or less precise answers.– E.g., (a+b)+c ≠ a+(b+c) in floating point if a,b small.;CS 4120 Introduction to Compilersa*013CS 4120 Introduction to Compilers14InliningUnreachable code elimination• Replace a function call with the body of the function:• Basic blocks not contained by any traceleading from starting basic block areunreachable and can be eliminated• Performed at canonical IR or assemblycode levels• Reductions in code size improve cache,TLB performance.f(a: int):int = { b:int=1; n:int = 0;while (n<a) (b = 2*b); return b; }g(x: int):int = { return 1+ f(x); }g(x:int):int = { fx:int; { a:int = x;{ b:int=1; n:int = 0;while (n<a) ( b = 2*b); fx=b };return 1 + fx; }• Best done on HIR• Can inline methods, but more difficult – there can be only one f.• May need to rename variables to avoid name capture—consider iff refers to a global variable xCS 4120 Introduction to Compilers15CS 4120 Introduction to Compilers16SpecializationConstant propagation• If value of variable is known to be aconstant, replace use of variable withconstant• Value of variable must be propagatedforward from point of assignmentint x = 5;int y = x*2;int z = a[y]; // = MEM(MEM(a) + y*4)• Interleave with constant folding!• Idea: create specialized versions of functions (ormethods) that are called from different places w/different argsclass A implements I { m( ) {…} }class B implements I { m( ) {…} }f(x: I) { x.m( ); } // don’t know which ma = new A(); f(a) // know A.mb = new B(); f(b) // know B.m• Can inline methods when implementation is known• Impl.
known if only one implementing class• Can specialize inherited methods (e.g., HotSpot JIT)CS 4120 Introduction to Compilers17Dead code elimination• Given assignment x = y, replace subsequentuses of x with y• May make x a dead variable, result in dead codex = y*y;// dead!…// x unused …x = z*z;x = z*z;• Dead variable: if never read after defn.• Need to determine where copies of y propagatetowhile (m<n) (m++)CS 4120 Introduction to Compilers18Copy propagation• If side effect of a statement can never be observed,can eliminate the statementint i;while (m<n) ( m++; i = i+1)• Other optimizations createdead statements, variablesCS 4120 Introduction to Compilersx=yif (x > 1)x = x * f(x - 1)19x=yif (y > 1) {x = y * f(y - 1)CS 4120 Introduction to Compilers20Redundancy EliminationLoops• Common Subexpression Elimination(CSE) combines redundant computations• Program hot spots are usually loops (exceptions: OSkernels, compilers)• Most execution time in most programs is spent inloops: 90/10 is typical.• Loop optimizations are important, effective, andnumerousa(i) = a(i) + 1! [[a]+i*4] = [[a]+i*4] + 1! t1 = [a] + i*4; [t1] = [t1]+1• Need to determine that expression alwayshas same value in both placesb[j]=a[i]+1; c[k]=a[i] ! t1=a[i]; b[j]=t1+1; c[k]=t1 ?CS 4120 Introduction to Compilers21Loop-invariant code motion22Loop-invariant code motion• Another form of redundancy elimination• If result of a statement or expression does notchange during loop, and it has no externally-visibleside effect (!), can hoist its computation before loop• Often useful for array element addressingcomputations – invariant code not visible at sourcelevel• Requires analysis to identify loop-invariantexpressionsCS 4120 Introduction to CompilersCS 4120 Introduction to Compilers23for (i = 0; i < a.length; i++) {// a not assigned in loop}hoisted loop-invariant expressiont1 = a.length ;for (i = 0; i < t1; i++) {…}CS 4120 Introduction to Compilers24Strength reductionLoop unrolling• Replace expensive operations (*,/) by cheap ones(+, −) via dependent induction variable• Branches are expensive; unroll loop to avoidthem:for (i = 0; i<n; i++) { S }for (int i = 0; i < n; i++) {a[i*3] = 1;}int j = 0;for (int i = 0; i < n; i++) {a[ j ] = 1; j = j+3;}CS 4120 Introduction to Compilersfor (i = 0; i < n−3; i+=4) {S; S; S; S;}for ( ; i < n; i++) S;• Gets rid of ¾ of conditional branches!• Space-time tradeoff: not a good idea for largeS or small n.25Summary• Many useful optimizations that can transformcode to make it faster/smaller/...• Whole is greater than sum of parts:optimizations should be applied together,sometimes more than once, at different levels.• Problem: when are optimizations are safe andwhen are they effective?!Dataflow analysis!Control flow analysis!Pointer analysisCS 4120 Introduction to Compilers27CS 4120 Introduction to Compilers26.
Характеристики
Тип файла PDF
PDF-формат наиболее широко используется для просмотра любого типа файлов на любом устройстве. В него можно сохранить документ, таблицы, презентацию, текст, чертежи, вычисления, графики и всё остальное, что можно показать на экране любого устройства. Именно его лучше всего использовать для печати.
Например, если Вам нужно распечатать чертёж из автокада, Вы сохраните чертёж на флешку, но будет ли автокад в пункте печати? А если будет, то нужная версия с нужными библиотеками? Именно для этого и нужен формат PDF - в нём точно будет показано верно вне зависимости от того, в какой программе создали PDF-файл и есть ли нужная программа для его просмотра.