Type checking: the compiler pass that checks for type-consistent uses of names in the input program

Compilers check for consistency of type; for example, the expression

    a ← a × 2 × b × c × d

might be syntactically well-formed, but if b and d are character strings, the sentence might still be invalid. Compilers also check for consistency of number in specific situations; for example, an array reference should have the same number of dimensions as the array's declared rank and a procedure call should specify the same number of arguments as the procedure's definition. Chapter 4 explores some of the issues that arise in compiler-based type checking and semantic elaboration.

Intermediate Representations

The final issue handled in the front end of a compiler is the generation of an ir form of the code.
Compilers use a variety of different kinds of ir, depending on the source language, the target language, and the specific transformations that the compiler applies. Some irs represent the program as a graph. Others resemble a sequential assembly code program. The code below shows how our example expression might look in a low-level, sequential ir:

    t0 ← a × 2
    t1 ← t0 × b
    t2 ← t1 × c
    t3 ← t2 × d
    a  ← t3

Chapter 5 presents an overview of the variety of kinds of irs that compilers use.

For every source-language construct, the compiler needs a strategy for how it will implement that construct in the ir form of the code. Specific choices affect the compiler's ability to transform and improve the code. Thus, we spend two chapters on the issues that arise in generation of ir for source-code constructs.
Procedure linkages are, at once, a source of inefficiency in the final code and the fundamental glue that pieces together different source files into an application. Thus, we devote Chapter 6 to the issues that surround procedure calls. Chapter 7 presents implementation strategies for most other programming language constructs.

TERMINOLOGY
A careful reader will notice that we use the word code in many places where either program or procedure might naturally fit. Compilers can be invoked to translate fragments of code that range from a single reference through an entire system of programs.
Rather than specify some scope of compilation, we will continue to use the ambiguous, but more general, term, code.

1.3.2 The Optimizer

When the front end emits ir for the input program, it handles the statements one at a time, in the order that they are encountered. Thus, the initial ir program contains general implementation strategies that will work in any surrounding context that the compiler might generate. At runtime, the code will execute in a more constrained and predictable context. The optimizer analyzes the ir form of the code to discover facts about that context and uses that contextual knowledge to rewrite the code so that it computes the same answer in a more efficient way.

Efficiency can have many meanings.
The classic notion of optimization is to reduce the application's running time. In other contexts, the optimizer might try to reduce the size of the compiled code, or other properties such as the energy that the processor consumes evaluating the code. All of these strategies target efficiency.

Returning to our example, consider it in the context shown in Figure 1.2a. The statement occurs inside a loop. Of the values that it uses, only a and d change inside the loop. The values of 2, b, and c are invariant in the loop. If the optimizer discovers this fact, it can rewrite the code as shown in Figure 1.2b.
In this version, the number of multiplications has been reduced from 4·n to 2·n + 2. For n > 1, the rewritten loop should execute faster. This kind of optimization is discussed in Chapters 8, 9, and 10.

Analysis

Data-flow analysis: a form of compile-time reasoning about the runtime flow of values

Most optimizations consist of an analysis and a transformation. The analysis determines where the compiler can safely and profitably apply the technique. Compilers use several kinds of analysis to support transformations.
Data-flow analysis reasons, at compile time, about the flow of values at runtime. Data-flow analyzers typically solve a system of simultaneous set equations that are derived from the structure of the code being translated. Dependence analysis uses number-theoretic tests to reason about the values that can be assumed by subscript expressions. It is used to disambiguate references to array elements. Chapter 9 presents a detailed look at data-flow analysis and its application, along with the construction of static-single-assignment form, an ir that encodes information about the flow of both values and control directly in the ir.

(a) Original Code in Context
    b ← ···
    c ← ···
    a ← 1
    for i = 1 to n
        read d
        a ← a × 2 × b × c × d
    end

(b) Improved Code
    b ← ···
    c ← ···
    a ← 1
    t ← 2 × b × c
    for i = 1 to n
        read d
        a ← a × d × t
    end

FIGURE 1.2 Context Makes a Difference
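To make the notion of "simultaneous set equations" concrete, consider a sketch of one classic data-flow problem, live-variable analysis, in a typical formulation; the set names UEVar (variables used in a block before any redefinition) and VarKill (variables whose values the block overwrites) are shown here only for illustration, and the full development appears in the book's later chapters.

\[
\text{LiveOut}(n) \;=\; \bigcup_{m \,\in\, \mathit{succ}(n)} \Big( \text{UEVar}(m) \;\cup\; \big( \text{LiveOut}(m) \cap \overline{\text{VarKill}(m)} \big) \Big)
\]

An iterative solver initializes each LiveOut set to ∅ and repeatedly re-evaluates the equation at every block until no set changes; the resulting sets describe, for each block, which values may still be used after control leaves it.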
Transformation

To improve the code, the compiler must go beyond analyzing it. The compiler must use the results of analysis to rewrite the code into a more efficient form. Myriad transformations have been invented to improve the time or space requirements of executable code. Some, such as discovering loop-invariant computations and moving them to less frequently executed locations, improve the running time of the program. Others make the code more compact. Transformations vary in their effect, the scope over which they operate, and the analysis required to support them. The literature on transformations is rich; the subject is large enough and deep enough to merit one or more separate books. Chapter 10 covers the subject of scalar transformations, that is, transformations intended to improve the performance of code on a single processor.
It presents a taxonomy for organizing the subject and populates that taxonomy with examples.

1.3.3 The Back End

The compiler's back end traverses the ir form of the code and emits code for the target machine. It selects target-machine operations to implement each ir operation. It chooses an order in which the operations will execute efficiently. It decides which values will reside in registers and which values will reside in memory and inserts code to enforce those decisions.

ABOUT ILOC
Throughout the book, low-level examples are written in a notation that we call ILOC, an acronym derived from "intermediate language for an optimizing compiler." Over the years, this notation has undergone many changes. The version used in this book is described in detail in Appendix A.

Think of ILOC as the assembly language for a simple RISC machine. It has a standard set of operations.
Most operations take arguments that are registers. The memory operations, loads and stores, transfer values between memory and the registers. To simplify the exposition in the text, most examples assume that all data consists of integers.

Each operation has a set of operands and a target. The operation is written in five parts: an operation name, a list of operands, a separator, a list of targets, and an optional comment. Thus, to add registers 1 and 2, leaving the result in register 3, the programmer would write

    add r1, r2 ⇒ r3    // example instruction

The separator, ⇒, precedes the target list. It is a visual reminder that information flows from left to right.
In particular, it disambiguates cases where a person reading the assembly-level text can easily confuse operands and targets. (See loadAI and storeAI in the following table.) The example in Figure 1.3 only uses four ILOC operations:

    ILOC Operation                 Meaning
    loadAI   r1, c2  ⇒ r3          Memory(r1 + c2) → r3
    loadI    c1      ⇒ r2          c1 → r2
    mult     r1, r2  ⇒ r3          r1 × r2 → r3
    storeAI  r1      ⇒ r2, c3      r1 → Memory(r2 + c3)

Appendix A contains a more detailed description of ILOC.
The examples consistently use rarp as a register that contains the start of data storage for the current procedure, also known as the activation record pointer.

Instruction Selection

The first stage of code generation rewrites the ir operations into target-machine operations, a process called instruction selection. Instruction selection maps each ir operation, in its context, into one or more target-machine operations. Consider rewriting our example expression, a ← a × 2 × b × c × d, into code for the ILOC virtual machine to illustrate the process. The low-level ir for the expression, repeated from above, is:

    t0 ← a × 2
    t1 ← t0 × b
    t2 ← t1 × c
    t3 ← t2 × d
    a  ← t3
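To give a feel for the result, here is a sketch of ILOC code that instruction selection might produce for this expression. It is illustrative rather than necessarily identical to the book's Figure 1.3; the labels @a, @b, @c, and @d are assumed names for the offsets of a, b, c, and d from rarp, and the register names are chosen for readability.

    loadAI   rarp, @a  ⇒ ra        // ra ← value of a
    loadI    2         ⇒ r2        // r2 ← constant 2
    loadAI   rarp, @b  ⇒ rb        // rb ← value of b
    loadAI   rarp, @c  ⇒ rc        // rc ← value of c
    loadAI   rarp, @d  ⇒ rd        // rd ← value of d
    mult     ra, r2    ⇒ ra        // ra ← a × 2
    mult     ra, rb    ⇒ ra        // ra ← a × 2 × b
    mult     ra, rc    ⇒ ra        // ra ← a × 2 × b × c
    mult     ra, rd    ⇒ ra        // ra ← a × 2 × b × c × d
    storeAI  ra        ⇒ rarp, @a  // write the result back into a

Each line uses one of the four operations from the table above: the loads bring the named values into registers, the chain of mult operations accumulates the product, and the final storeAI writes the result back into a's location in the current activation record.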