978-0-262-31709-2-ch014 (1157536), страница 2
Текст из файла (страница 2)
The expression ϕ evaluates to a curried function which takes a compiled expressionand an uncompiled expression as arguments. The compiledexpression is a continuation; the uncompiled expression isthe source code to be compiled; applying the curried function to the halt bytecode yields a function which can compiletop-level expressions. Inserting a copy of (ϕ (make-halt))into the unquoted half of the quine so that it compiles itsresult (and mirroring this change in the quoted half) yieldsRelated Work(Stephenson et al., 2003) described a genetic programmingsystem which learns priority functions for compiler optimizations including hyperblock selection, register allocation, and data prefetching.
(D’haeseleer, 1994) describedand experimentally evaluated a method for context preserving crossover. (Kirshenbaum, 2000) demonstrated a geneticprogramming system where crossover is defined so that itrespects the meaning of statically defined local variables.Several authors have explored the idea of staged or alternating fitness functions.
(Koza et al., 1999) used a staged fitness function as a method for multi-objective optimization.(Pujol, 1999) described a system where the fitness functionis switched after a correct solution is discovered to a function which minimizes solution size. (Zou and Lung, 2004)and (Offman et al., 2008) used alternating fitness functionsto preserve diversity in genetic algorithm derived solutionsto problems in water quality model calibration and proteinmodel selection.((lambda ((ϕ (make-halt))(list %0 (list quote %0))))(quote (lambda ((ϕ (make-halt))(list %0 (list quote %0))))))which, although not a quine itself, returns a quine when evaluated. Significantly, this quine is not a source code fixedpoint of the Skeme interpreter but an object code fixed-pointof Dybvig’s virtual machine.
In effect, it is a quine in alow-level language (phenotype) which reproduces by compiling a compressed self-description written in a high-levellanguage (genotype).In prior work on evolution of self-replicating programsthere has been no distinction between phenotype and genotype; mutations are made on the same representation whichis evaluated for fitness. In contrast, in living organisms,small changes in genotype due to mutation can be amplifiedby a development process and result in large changes in phenotype; it is phenotype which is then evaluated for fitness. InECAL 2013if(index? %0)ifGenetic ProgrammingOur approach to genetic programming is motivated by thefact that gene duplication followed by specialization of oneor both copies is a common route to increased complexityin biological evolution(Finnigan et al., 2012). We introduce88ECAL - General TrackZC((lambda … )((%4 (make-argument ((%5 (make-apply)) %1)))(car %0)))Cp (bloat)q (shrink)DqCq(shrink)(shrink)p (bloat)q (shrink)Dqq(shrink)(shrink)if(null? (cdr %1))if(eq? (make-return) %5)(make-frame%5((%6 (make-argument %0))(car (cdr %1))))((%6 (make-argument %0))(car (cdr %1)))Bpppp(bloat)(bloat)(bloat)(bloat)A(eq? %5 (make-return))p (bloat)q (shrink)BAp (bloat)q (shrink)BFigure 5: Contour plots of fitness landscapes during flush(left) and lean (right) epochs.
Colored arrows point in directions of increased fitness. In lean epochs, the four genotypes A, B, C, and D occupy islands separated by valleysof decreased fitness; the bloat mutations necessary for Ato evolve into any of the other genotypes are harmful sincethey increase the cost of self-replication.
In contrast, theshrink mutations required for A to evolve into any of theother genotypes are beneficial. In flush epochs, the situationis reversed–the bloat mutations are beneficial and the shrinkmutations are harmful since they increase and decrease effective complexity respectively. Alternating between thetwo fitness functions creates paths between the A and Dgenotypes consisting solely of beneficial mutations.if(make-frame %5 %0)%0Figure 4: Evolved subtrees implementing the tail-call optimizations which characterize the B and C genotypes.
The Agenotype performs neither optimization while the D genotype performs both. Both optimizations check to see if thecontinuation is a return bytecode, which performs a framestack pop. If so, the push-pop sequence is not generated,resulting in significant savings in time and space usage.two mutation operators called bloat and shrink which playroles analogous to gene duplication and specialization andemploy these in a genetic programming system where fitnessalternates between object code based definitions of complexity and self-replication efficiency.
In teleological terms, thebloat operator attempts to increase complexity by addingsource code while the shrink operator attempts to increaseself-replication efficiency by removing it.it; it is value-neutral with respect to evaluation. Because (bytheir nature) they increase the cost of self-replication without breaking the compiler, bloat mutations (although neverlethal) are harmful during lean epochs.In contrast, shrink mutations are beneficial when they reverse bloat mutations during lean epochs and can be harmful when they reverse bloat mutations during flush epochs.However, shrink mutations have two different and more pronounced effects. First, a shrink mutation can remove codeand break the compiler, in which case it is lethal.
Second,it can shape the result of a bloat mutation in a way whichdecreases the cost of self-replication, in which case it willbe strongly beneficial during lean epochs and become fixedin the population.Alternating Fitness FunctionTime is divided into ten generation periods termed epochswhich alternate between two types, flush and lean. In flushepochs, fitness is defined as effective complexity while inlean epochs it is defined as self-replication efficiency.A test bytecode is defined to be non-trivial if both of itscontinuations are exercised in the course of self-replication.This will only happen if the predicate expression in theif special-form from which the test bytecode is compiledsometimes evaluates to true and sometimes to false.
Thenumber of non-trivial test bytecodes in the object code isa good measure of the source code’s effective complexity.Consequently, in flush epochs the number of non-trivial testbytecodes in the object code is maximized.Because frame stack pushes and pops are the most expensive operation performed by the virtual machine, theyare an excellent proxy for overall self-replication cost. Consequently, in lean epochs, the number of frame stack pops,which are implemented by the return bytecode, is minimized.Mutations can be classified as beneficial, neutral, harmful, and lethal. The purpose of the bloat operator is to introduce source code which can be shaped by the shrink operator and by crossover.
Significantly, the introduced codedoes not change the value of any expression which containsBloatThe source code for the self-hosting compiler containsboolean-valued expressions with six different syntacticforms. Excluding primitive functions, the source code contains six different expressions of constant value. A randomsyntactic form can be combined with a random de Bruijinindex and (if necessary), a random constant-valued expression, to construct a random boolean-valued expression, φ .The bloat operator is defined by five rules. The first fourrules define a recursive procedure which applies the bloatoperator in selected contexts. The last rule replaces a function application with an i f expression which returns thesame value regardless of whether a random boolean-valuedexpression, φ , evaluates to true or false.
Consequently, the89ECAL 2013ECAL - General Trackvalue of the expression is the same before and after the mutation. The fact that the bloat operator is value-neutral withrespect to evaluation is important because only viable individuals (those which correctly self-replicate) are copiedto the next generation; and although a bloat mutation typically introduces expressions which are not evaluated during self-replication (which greatly reduces the fitness of affected individuals by increasing their self-replication costs)affected individuals always remain viable because bloat mutations cannot actually break the compiler which containsthem. The five rules which define the bloat operator areTable 1: Complexities and self-replication costs.non-trivial testsreturnsB9333C9432D10183where f is a primitive function, id is the identity function, and primes mark expressions which are recursively expanded.
Alternative right hand sides are separated by vertical bars; the alternative to the left of the || (no mutation) ischosen with 95% probability; one of the remaining alternatives (mutation) is chosen otherwise (each with equal probability). Unlike the bloat operator, which is value neutral,the shrink operator changes the object code generated by thecompiler when it modifies an expression which is evaluatedduring self-replication. In the case of the fourth shrink rule,this often reverses a harmful bloat mutation, in which casethe shrink mutation is beneficial.
However, in the case of thelast shrink rule, the mutation most often breaks the compiler.Very rarely, the shrink mutation does not break the compilerbut instead results in a decrease in self-replication cost.1. (lambda[+] e1 ) → (lambda[+] e1 )2. ((lambda[+] e1 ) e2 ) → ((lambda[+] e1 ) e2 )3. (if e1 (id e2 ) e3 ) → (if e1 (id e2 ) e3 )4. (if e1 e2 e3 ) → (if e1 e2 e3 )5. ( f e1 . .
. eN ) → ( f e1 . . . eN ) (if φ (id ( f e1 . . . eN )) ( f e1 . . . eN ))where f is a primitive function, φ is a random booleanvalued expression, id is the identity function, and primesmark expressions which are recursively expanded. Alternative right hand sides are separated by vertical bars; the alternative to the left of the || (no mutation) is chosen with 95%probability; the remaining alternative (mutation) is chosenotherwise. The identity function serves as a value neutraltag in a meta-syntax; because the third rule has the same leftand right hand sides, the recursive procedure which appliesthe bloat operator will not descend into i f subtrees markedwith this tag; this prevents the compounding of bloat mutations.The problem which plagues many genetic programmingsystems, in which code trees grow larger with increasingtime, does not occur for two reasons.