978-0-262-31709-2-ch014 (1157536)
Текст из файла
DOI: http://dx.doi.org/10.7551/978-0-262-31709-2-ch014ECAL - General TrackEvolution of Tail-Call Optimization in a Population of Self-Hosting CompilersLance R. Williams11 Departmentof Computer Science, University of New Mexico, Albuquerque, NM 87131williams@cs.unm.eduAbstractchine language program to be interesting if it prints a stringat least as long as itself and halts when executed, and observethat the Kolmogorov complexity of interesting programs islower than that of random strings of similar length. Now,if we were to train an adaptive compression algorithm on alarge set of interesting programs, then the compressed programs which result would look like random strings.
However, by virtue of being shorter, they would be more numerous relative to truly random strings of similar length. Itfollows that compression, which decreases redundancy byreplacing recurring sequences of instructions with inventednames, increases the density of interesting programs.Since both processes increase redundancy and output machine language programs, it is natural to identify decompression with compilation, which increases redundancy by repeatedly generating similar sequences of instructions whiletraversing a parse tree. Viewed this way, programs written in(more expressive) high-level languages are compressed machine language programs, and compiling is the process ofdecompressing source code strings into object code stringswhich can be executed by a CPU.If the density of interesting programs increases with theexpressiveness of the language in which they are encoded (asthe above strongly suggests), then one should use the mostexpressive language possible for any process, like geneticprogramming, which involves searching the space of interesting programs.
However, if the goal is building artificialorganisms, then high-level languages have a very seriousdrawback when compared to machine language. Namely,programs in high-level languages must be compiled into machine language before they can be executed by a CPU or bereified as a distributed virtual machine(Williams, 2012).Given that we want our self-replicating programs to beboth (potentially) reifiable and to evolve into programs ofgreater complexity and efficiency, we must ask: How canthe advantages which derive from the use of a high-level language for genetic programming be reconciled with the factthat only machine language programs can be reified?To address this question, we introduce a new and significantly more complex kind of artificial organism–a ma-We demonstrate the evolution of a more complex and moreefficient self-replicating computer program from a less complex and less efficient ancestor.
Both programs, which employ a novel method of self-replication based on compilingtheir own source code, are significantly more complex thanprograms which reproduce by copying themselves, and whichhave only exhibited evolution of degenerate methods of selfreplication.IntroductionAmong living organisms, which employ many and variedmechanisms in the process of reproduction, examples ofevolved mechanisms which are both more complex andmore efficient than ancestral mechanisms, abound. Yet,nearly twenty years after (Ray, 1994)’s groundbreakingwork on the Tierra system, in which the evolution of manynovel (but degenerate) methods of self-replication was firstdemonstrated, there is still no example of a more complexand more efficient self-replicating computer program evolving from a less complex and less efficient ancestor.This is not to say that there has been no progress in thefield of artificial life since Tierra.
Nor are we suggesting that increased reproductive efficiency is the only evolutionary path to increased complexity. The evolution ofself-replicating programs of increased complexity has beendemonstrated many times(Koza, 1994; Taylor and Hallam,1997; Spector and Robinson, 2002), and perhaps most convincingly in the Avida system(Adami et al., 1994). However, more complex programs evolved in Avida only because complexity was artificially equated with efficiency inthe sense that programs which learned to solve problemsunrelated to self-replication were rewarded with larger rations of CPU time.
No program in Avida (or in any othersystem known to us) has ever evolved a method of selfreplication that is both more complex and more efficient thanthe method employed by its ancestor.A New Kind of Artificial OrganismSelf-replicating programs have been written in both highlevel languages and machine languages. We define a ma-ECAL 201386ECAL - General Trackregistersdaughterdaughtercopycompilecopypc1accumulatorhalt3...argumentsargumentcompilevalconstant+copycompilationcompilecopyvalframe*32environmentframecopyvalframescopymotherconstantcompileargumentmotherFigure 1: Conventional self-replicating program (left)copies itself by exploiting program-data equivalence of vonNeumann architecture. Compiling quine self-replicatingprogram (right) with source code genotype (green) and object code phenotype (red).
Because the shortest correct implementation of copy is optimal, only the compiling quine iscapable of non-degenerate evolution.argument stack2constantargumentargument1constantconstantapplyapply+*valpcargsenvvalpcargsenv...valenvironment stack...pcargsenvframe stackFigure 2: Virtual machine for evaluating compiled Schemeexpressions showing its registers and associated heapallocated data structures(Dybvig, 1987).chine language program which reproduces by compilingits own source-code. See Figure 1. Conventional selfreplicating programs reproduce by copying themselves.
Optimum copiers accomplish this in time proportional to theirlength, and it is not very hard to write a copier which is optimum in this sense (or for one to evolve). It follows thatshorter implementations are always more efficient, whichleads to degenerate evolution, absent factors beyond efficiency. The possible variation in the implementation of acompiler is far larger. Even if the definition of the objectlanguage is stipulated, there is still a huge space of alternative implementations, including the syntax and semantics ofthe source language, the ordering of the decision tree performing syntactic analysis, and the presence (or absence)and effectiveness of any object code optimizing procedures.In this paper we describe a machine language programwhich reproduces by compiling its own source code anduse genetic programming to demonstrate its capacity fornon-degenerate evolution.
In the process we address questions such as: How can a program like a compiler, whichimplements a complex prescribed transformation, evolveimprovements while avoiding non-functional intermediateforms? How can two lexically scoped programs be combined by crossover without breaking the product? How cana more efficient self-replicating program evolve from a lessefficient ancestor when all mutations initially yield higherself-replication cost?ment; user defined functions with more than one argumentmust be written in a curried style.
This simplifies the representation of the lexical environment which is used at runtime by making all variable references integer offsets into aflat environment stack; these are termed de Bruijn indicesand can be used instead of symbols to represent bound variables(De Bruijn, 1972).One feature peculiar to Skeme is the special-form,lambda+. When a closure is created by lambda+, the closure’s address is added to the front of the enclosed environment; the de Bruijn index for this address can then beused for recursive function calls. For example, the following function computes factorial:(lambda+ (if (= %0 0) 1 (* %0 (%1 (- %0 1)))))where %0 is a reference to the closure’s argument and %1 isa reference to the closure’s address.Tail-Call OptimizationBecause the very first self-hosting compiler was written inLisp, it is not surprising that it is possible (by includingprimitive functions which construct bytecode types) to writea very small self-hosting compiler in Skeme.
See Figures 2and 3.The cost of compiling a given source code depends notonly on its size, but also on the complexity of the sourcelanguage, the efficiency of the compiler, and the cost ofany object code optimizations it performs. Common compiler optimizations include constant folding, loop unrolling,function inlining, loop-invariant code motion, elimination ofcommon subexpressions, and dead code elimination. Sincea self-hosting compiler compiles itself, the efficiency of theobject code it generates also affects compilation cost; it follows that minimizing the cost of self-compilation involvesa complex set of tradeoffs. The most important of these isthat object code optimizations must pay for themselves byyielding an increase in object code efficiency large enoughA Simple Programming LanguageBecause a self-hosting compiler compiles the same languageit is written in, it can compile itself.
The language we usedto construct our self-hosting compiler is a pure functionalsubset of Scheme which we call Skeme. Because it is purelyfunctional, define, which associates values with names in aglobal environment using mutation, and letrec, which alsouses mutation, have been excluded. The global environment itself is eliminated by making primitive functions constants. For simplicity, closures are restricted to one argu-87ECAL 2013ECAL - General Trackto offset the additional cost of compiling the source codeimplementing the optimization.Most of the overhead associated with a function call involves the saving and restoration of evaluation contexts. InSkeme, these operations are performed by the frame and return bytecodes which push and pop the frame stack.
However, when one function calls another function in a tail position, there is no need to save an evaluation context, because the restored context will just be discarded when thefirst function returns. A compiler which performs tail-calloptimization recognizes when a function is called in a tailposition and does not generate the code which saves and restores evaluation contexts. This not only saves time, it alsosaves space, since tail recursive function calls will not increase the size of the frame stack at runtime.(lambda+ (lambda+ … ))if(pair? %0)((lambda ((lambda … ) (cdr %1))) (car %0))(eq? %1 if)X((%5 (make-test(%3 (car (cdr %0)))(%3 (car (cdr (cdr %0))))))(car %0))(eq? %1 lambda)if(null? %0)(make-constant %0 %2)(make-refer %0 %2)if(make-close((%5 (make-return))(car %0))%4)(eq? %1 lambda+)(eq? %1 quote)ififX(make-klose((%5 (make-return))(car %0))%4)Z(make-frame((lambda … )%4((%5 (make-argument ((%5 (make-apply)) %1)))((%5 (make-apply))(car %0)))%1))Y(make-constant ((lambda+ … ) (car %0)) %4)if(null? (cdr %1))A Quine which Compiles Itself(make-frame %5 %0)if(pair? %0)A quine is a program which prints itself.
It is possible towrite a quine in any programming language but Skeme’s listbased syntax makes it possible to write especially short andsimple quines. For example, in the following Skeme quine,an expression (lambda (list %0 (list quote %0))) which evaluates to a closure which appends a value to the same valuequoted is applied to the same expression quoted:(copy-atom %0)(cons (%1 (car %0)) (%1 (cdr %0)))(make-frame%5((%6 (make-argument %0))(car (cdr %1))))Figure 3: An expression ϕ for compiling Skeme into objectcode able to compile itself. The X indicates a break in thefigure; the subtree labeled Y copies the Skeme source codeand the subtree labeled Z compiles function applications.a compiling quine, small changes in source code (genotype)are amplified by compilation (development) yielding muchlarger changes in object code (phenotype) and it is objectcode which determines fitness, since its execution consumesthe physical resources of space and time.((lambda (list %0 (list quote %0)))(quote (lambda (list %0 (list quote %0)))))It is possible to define an expression ϕ in Skeme whichcan compile any Skeme expression.
Характеристики
Тип файла PDF
PDF-формат наиболее широко используется для просмотра любого типа файлов на любом устройстве. В него можно сохранить документ, таблицы, презентацию, текст, чертежи, вычисления, графики и всё остальное, что можно показать на экране любого устройства. Именно его лучше всего использовать для печати.
Например, если Вам нужно распечатать чертёж из автокада, Вы сохраните чертёж на флешку, но будет ли автокад в пункте печати? А если будет, то нужная версия с нужными библиотеками? Именно для этого и нужен формат PDF - в нём точно будет показано верно вне зависимости от того, в какой программе создали PDF-файл и есть ли нужная программа для его просмотра.