K. Cooper, L. Torczon - Engineering a Compiler (2nd edition, 2011), page 9
Java is compiled from source code into a form called bytecode, a compact representation intended to decrease download times for Java applications. Java applications execute by running the bytecode on the corresponding Java Virtual Machine (JVM), an interpreter for bytecode. To complicate the picture further, many implementations of the JVM include a compiler that executes at runtime, sometimes called a just-in-time compiler, or JIT, that translates heavily used bytecode sequences into native code for the underlying computer.

[Margin note: Virtual machine: a virtual machine is a simulator for some processor. It is an interpreter for that machine's instruction set.]

Interpreters and compilers have much in common. They perform many of the same tasks. Both analyze the input program and determine whether or not it is a valid program.
Both build an internal model of the structure and meaning of the program. Both determine where to store values during execution. However, interpreting the code to produce a result is quite different from emitting a translated program that can be executed to produce the result. This book focuses on the problems that arise in building compilers. However, an implementor of interpreters may find much of the material relevant.

Why Study Compiler Construction?

A compiler is a large, complex program. Compilers often include hundreds of thousands, if not millions, of lines of code, organized into multiple subsystems and components. The various parts of a compiler interact in complex ways. Design decisions made for one part of the compiler have important ramifications for other parts. Thus, the design and implementation of a compiler is a substantial exercise in software engineering.

A good compiler contains a microcosm of computer science.
It makes practical use of greedy algorithms (register allocation), heuristic search techniques (list scheduling), graph algorithms (dead-code elimination), dynamic programming (instruction selection), finite automata and push-down automata (scanning and parsing), and fixed-point algorithms (data-flow analysis). It deals with problems such as dynamic allocation, synchronization, naming, locality, memory hierarchy management, and pipeline scheduling. Few software systems bring together as many complex and diverse components. Working inside a compiler provides practical experience in software engineering that is hard to obtain with smaller, less intricate systems.

Compilers play a fundamental role in the central activity of computer science: preparing problems for solution by computer.
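As a small illustration of the finite-automaton techniques mentioned above in connection with scanning, the following hand-coded scanner walks a simple DFA over its input. It is a sketch of this editor's own, not an example from the book; the token categories and the language they describe are invented for illustration.

```python
# A minimal DFA-style scanner sketch: recognizes identifiers and
# unsigned integers, skipping whitespace. The states (in-identifier,
# in-number) are implicit in the inner loops.

def scan(text):
    """Split `text` into (kind, lexeme) tokens using explicit character scans."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():                      # skip whitespace between tokens
            i += 1
        elif ch.isalpha():                    # state: inside an identifier
            j = i
            while j < len(text) and text[j].isalnum():
                j += 1
            tokens.append(("ident", text[i:j]))
            i = j
        elif ch.isdigit():                    # state: inside a number
            j = i
            while j < len(text) and text[j].isdigit():
                j += 1
            tokens.append(("number", text[i:j]))
            i = j
        else:                                 # any other character: an operator
            tokens.append(("op", ch))
            i += 1
    return tokens
```

For example, `scan("x1 + 42")` yields `[("ident", "x1"), ("op", "+"), ("number", "42")]`. Real scanner generators build such automata automatically from regular expressions, which is exactly the application of formal language theory discussed below.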
Most software is compiled, and the correctness of that process and the efficiency of the resulting code have a direct impact on our ability to build large systems. Most students are not satisfied with reading about these ideas; many of the ideas must be implemented to be appreciated. Thus, the study of compiler construction is an important component of a computer science education.

Compilers demonstrate the successful application of theory to practical problems. The tools that automate the production of scanners and parsers apply results from formal language theory. These same tools are used for text searching, website filtering, word processing, and command-language interpreters. Type checking and static analysis apply results from lattice theory, number theory, and other branches of mathematics to understand and improve programs. Code generators use algorithms for tree-pattern matching, parsing, dynamic programming, and text matching to automate the selection of instructions.

Still, some problems that arise in compiler construction are open problems; that is, the current best solutions have room for improvement.
Attempts to design high-level, universal intermediate representations have foundered on complexity. The dominant method for scheduling instructions is a greedy algorithm with several layers of tie-breaking heuristics. While it is obvious that compilers should use commutativity and associativity to improve the code, most compilers that try to do so simply rearrange the expression into some canonical order.

Building a successful compiler requires expertise in algorithms, engineering, and planning. Good compilers approximate the solutions to hard problems. They emphasize efficiency, in their own implementations and in the code they generate. They have internal data structures and knowledge representations that expose the right level of detail: enough to allow strong optimization, but not enough to force the compiler to wallow in detail. Compiler construction brings together ideas and techniques from across the breadth of computer science and applies them in a constrained setting to solve some truly hard problems.

The Fundamental Principles of Compilation

Compilers are large, complex, carefully engineered objects.
While many issues in compiler design are amenable to multiple solutions and interpretations, there are two fundamental principles that a compiler writer must keep in mind at all times. The first principle is inviolable:

The compiler must preserve the meaning of the program being compiled.

Correctness is a fundamental issue in programming.
The compiler must preserve correctness by faithfully implementing the "meaning" of its input program. This principle lies at the heart of the social contract between the compiler writer and compiler user. If the compiler can take liberties with meaning, then why not simply generate a nop or a return? If an incorrect translation is acceptable, why expend the effort to get it right?

The second principle that a compiler must observe is practical:

The compiler must improve the input program in some discernible way.

A traditional compiler improves the input program by making it directly executable on some target machine.
Other "compilers" improve their input in different ways. For example, tpic is a program that takes the specification for a drawing written in the graphics language pic and converts it into LaTeX; the "improvement" lies in LaTeX's greater availability and generality. A source-to-source translator for C must produce code that is, in some measure, better than the input program; if it is not, why would anyone invoke it?

1.2 COMPILER STRUCTURE

A compiler is a large, complex software system.
The community has been building compilers since 1955, and over the years, we have learned many lessons about how to structure a compiler. Earlier, we depicted a compiler as a simple box that translates a source program into a target program. Reality, of course, is more complex than that simple picture.

As the single-box model suggests, a compiler must both understand the source program that it takes as input and map its functionality to the target machine. The distinct nature of these two tasks suggests a division of labor and leads to a design that decomposes compilation into two major pieces: a front end and a back end.

[Figure: Source Program -> Front End -> IR -> Back End -> Target Program; the whole pipeline forms the compiler.]

The front end focuses on understanding the source-language program.
The back end focuses on mapping programs to the target machine. This separation of concerns has several important implications for the design and implementation of compilers.

[Margin note: IR: a compiler uses some set of data structures to represent the code that it processes. That form is called an intermediate representation, or IR.]

The front end must encode its knowledge of the source program in some structure for later use by the back end.
This intermediate representation (IR) becomes the compiler's definitive representation for the code it is translating. At each point in compilation, the compiler will have a definitive representation. It may, in fact, use several different IRs as compilation progresses, but at each point, one representation will be the definitive IR.
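To make the front end / IR / back end division concrete, here is a toy sketch of this editor's own (not an example from the book): a front end that parses a tiny expression language into a tuple-based IR, and a back end that maps that IR to instructions for a hypothetical stack machine. The grammar, IR shape, and instruction names are all invented for illustration.

```python
# Toy two-phase "compiler" sketch. The front end understands the source;
# the back end knows only the IR and the (invented) target machine.

def front_end(source):
    """Parse 'a + b - c' style expressions (left-associative +/-) into a tuple IR."""
    tokens = source.split()
    ir = ("load", tokens[0])              # leaf node: load an operand
    i = 1
    while i < len(tokens):
        op, operand = tokens[i], tokens[i + 1]
        # interior node: (opcode, left-subtree, right-subtree)
        ir = ({"+": "add", "-": "sub"}[op], ir, ("load", operand))
        i += 2
    return ir

def back_end(ir):
    """Walk the IR and emit postfix code for a hypothetical stack machine."""
    if ir[0] == "load":
        return [f"push {ir[1]}"]
    opcode, left, right = ir
    return back_end(left) + back_end(right) + [opcode]

def compile_expr(source):
    return back_end(front_end(source))
```

Here `compile_expr("a + b - c")` yields `["push a", "push b", "add", "push c", "sub"]`. Note that the IR is the only thing the two phases share: the back end could be replaced with one targeting a different machine without touching the front end, which is exactly the point of the separation.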
We think of the definitive IR as the version of the program passed between independent phases of the compiler, like the IR passed from the front end to the back end in the preceding drawing.

In a two-phase compiler, the front end must ensure that the source program is well formed, and it must map that code into the IR. The back end must map the IR onto the target machine.

MAY YOU STUDY IN INTERESTING TIMES

This is an exciting era in the design and implementation of compilers. In the 1980s, almost all compilers were large, monolithic systems.
They took as input one of a handful of languages and produced assembly code for some particular computer. The assembly code was pasted together with the code produced by other compilations, including system libraries and application libraries, to form an executable. The executable was stored on a disk, and at the appropriate time, the final code was moved from the disk to main memory and executed.

Today, compiler technology is being applied in many different settings.
As computers find applications in diverse places, compilers must cope with new and different constraints. Speed is no longer the sole criterion for judging the compiled code. Today, code might be judged on how small it is, on how much energy it consumes, on how well it compresses, or on how many page faults it generates when it runs.

At the same time, compilation techniques have escaped from the monolithic systems of the 1980s.