Chapter 8 Theory and Practice of simulated Annealing (1121211), страница 5
Текст из файла (страница 5)
Note that a criticismof simulated annealing is that it is completely memoryless (i.e., simulated annealingdisregards all historical information gathered during the algorithm’s execution). On theother hand, no proofs of convergence exist in the literature for the general tabu searchalgorithm.Faigle and Kern (1992) proposes a particular tabu search algorithm called probabilistic tabu search (Glover, 1989) as a meta-heuristic to help guide simulated annealing.Probabilistic tabu search attempts to capitalize on both the asymptotic optimality ofsimulated annealing and the memory feature of tabu search.
In probabilistic tabu search,the probabilities of generating and accepting each candidate solution are set as functionsof both a temperature parameter (as in simulated annealing) and information gainedin previous iterations (as for tabu search). Faigle and Kern (1992) are then able toprove asymptotic convergence of their particular tabu search algorithm by using methods developed for simulated annealing (Faigle and Kern, 1991; Faigle and Schraeder,1988). Note that the results of Faigle and Kern (1992) build upon Glover (1989) whereprobabilistic tabu search was first introduced and contrasted with simulated annealing.3.4Genetic AlgorithmsGenetic algorithms (Liepens and Hilliard, 1989) emulate the evolutionary behavior ofbiological systems. They generate a sequence of populations of qcandidate solutions tothe underlying optimization problem by using a set of genetically inspired stochasticsolution transition operators to transform each population of candidate solutions into adescendent population.
The three most popular transition operators are reproduction,cross-over, and mutation (Davis, 1991). Davis and Principe (1991) and Rudolph (1994)attempt to use homogeneous finite Markov chain techniques to prove convergence ofgenetic algorithms (Cerf, 1998), but are unable to develop a theory comparable in scopeto that of simulated annealing.Mühlenbein (1997) presents a theoretical analysis of genetic algorithms based onpopulation genetics. He counters the popular notion that models that mimic natural phenomenon are superior to other models. The article argues that evolutionary algorithmscan be inspired by nature, but do not necessarily have to copy a natural phenomenon.He addresses the behavior of transition operators and designs new genetic operatorsthat are not necessarily related to events in nature, yet still perform well in practice.One criticism of simulated annealing is the slow speed at which it sometimes converges.
Delport (1998) combines simulated annealing with evolutionary algorithmsto improve performance in terms of speed and quality of solution. He improves thishybrid system of simulated annealing and evolutionary selection by improving the cooling schedule based on fast recognition of the thermal equilibrium in terms of selectionintensity.
This technique results in much faster convergence of the algorithm.Sullivan and Jacobson (2000) links genetic algorithms with simulated annealingusing generalized hill climbing algorithms (Jacobson et al., 1998). They first linkgenetic algorithms to ordinal hill climbing algorithms, which can then be used, through302D. Henderson et al.its formulation within the generalized hill climbing algorithm framework, to form abridge with simulated annealing. Though genetic algorithms have proven to be effective for addressing intractable discrete optimization problems, and can be classifiedas a type of hill-climbing approach, its link with generalized hill climbing algorithms(through the ordinal hill climbing formulation) provides a means to establish welldefined relationships with other generalized hill climbing algorithms (like simulatedannealing and threshold accepting).
They also present two formulations of genetic algorithms that provide a first step towards developing a bridge between genetic algorithmsand other local search strategies like simulated annealing.4 PRACTICAL GUIDELINESImplementation issues for simulated annealing can follow one of two paths—thatof specifying problem-specific choices (neighborhood, objective function, and constraints), and that of specifying generic choices (generation and acceptance probabilityfunctions, and the cooling schedule) (Eglese, 1990). The principal shortcoming ofsimulated annealing is that it often requires extensive computer time.
Implementationmodifications generally strive to retain simulated annealing’s asymptotic convergencecharacter, but at reduced computer run-time. The methods discussed here are mostlyheuristic.Problem-Specific ChoicesObjective Functions One problem-specific choice involves the objective functionspecification. Stern (1992) recommends a heuristic temperature-dependent penaltyfunction as a substitute for the actual objective function for problems where low costsolutions have neighbors of much higher cost, or in cases of degeneracy (i.e., largeneighborhoods of solutions of equal, but high costs). The original objective functionsurfaces, as the penalty and the temperature are gradually reduced to zero. This technique is similar to the noising method presented by Charon and Hudrey (1993), wherethe penalty function is described as noise and is reduced at each outer loop iteration ofthe algorithm.
One speed-up technique is to evaluate only the difference in objectivefunctions,instead of calculating bothandTovey (1988) suggestsseveral methods of approximatingby using surrogate functions (that are fasterto evaluate thanbut not as accurate) probabilistically for cases when evaluationofis expensive; this technique is referred to as the surrogate function swindle.Straub et al. (1995) improves the performance of simulated annealing on problemsin chemical physics by using the classical density distribution instead of the moleculardynamics of single point particles to describe the potential energy landscape.
Ma andStraub (1994) reports that using this distribution has the effect of smoothing the energylandscape by reducing both the number and depth of local minima.Yan and Mukai (1992) considers the case when a closed-form formula for the objective function is not available. They use a probabilistic simulation (termed the stochasticruler method) to generate a sample objective function value for an input solution, andthen accept the solution if the sample objective function value falls within a predetermined bound. They also provide a proof of asymptotic convergence by extrapolatingthe convergence proofs for simulated annealing, and analyze the rate of convergence.The Theory and Practice of Simulated Annealing 303Generic ChoicesGeneration Probability Functions Generation probability functions are usually chosen as uniform distributions with probabilities proportional to the size of the neighborhood.
The generation probability function is typically not temperature-dependent.Fox (1993) suggests that instead of blindly generating neighbors uniformly, adopt anintelligent generation mechanism that modifies the neighborhood and its probabilitydistribution to accommodate search intensification or diversification, in the same spiritof the tabu search meta-heuristic. Fox (1993) also notes that simulated annealing convergence theory does not preclude this idea.
Tovey (1988) suggests an approach witha similar effect, called the neighborhood prejudice swindle.Acceptance Probability Functions The literature reports considerable experimentation with acceptance probability functions for hill-climbing transitions. The mostpopular is the exponential form (1).
Ogbu and Smith (1990) considers replacing thebasic simulated annealing acceptance functionwith a geometrically decreasing form that is independent of the change in objective function value. They adopt aprobabilistic-exhaustive heuristic technique in which randomly chosen neighbors ofa solution are examined and all solutions that are accepted are noted, but only thelast solution accepted becomes the new incumbent.
The hope is that this scheme willexplore a broader area of the solution space of a problem. Their acceptance probabilityfunction is defined for all solutionsand for k = 1, 2, . . . , K aswhereis the initial acceptance probability value,is a reducing factor, andK is the number of stages (equivalent to a temperature cooling schedule). They alsoexperiment with this method (and a neighborhood of large cardinality) on a permutationflow shop problem, and reports that its approach found comparable solutions to the basicsimulated annealing algorithm in one-third the computation time.4.1 Choice of Cooling ScheduleThe simulated annealing cooling schedule is fully defined by an initial temperature, aschedule for reducing/changing the temperature, and a stopping criterion. Romeo andSangiovanni-Vincentelli (1991) notes that an effective cooling schedule is essential toreducing the amount of time required by the algorithm to find an optimal solution.Therefore much of the literature on cooling schedules (e.g., Cardoso et al., 1994;Fox and Heine, 1993; Nourani and Andersen, 1998; and Cohn and Fielding, 1999) isdevoted to this topic.Homogeneous simulated annealing convergence theory has been used to designeffective cooling schedules.
Romeo and Sangiovanni-Vincentelli (1991) suggests thefollowing procedure for designing a cooling schedule:for which a good approximation of the1. Start with an initial temperaturestationary distributionis quickly reached.304D. Henderson et al.small enough such thatis a good starting point2. Reduce by an amountto approximate3. Fix the temperature at a constant value during the iterations needed for thesolution distribution to approximateRepeat the above process of cooling and iterating until no further improvement seemspossible.Cooling schedules are grouped into two classes: static schedules, which must becompletely specified before the algorithm begins; and adaptive schedules, which adjustthe temperature’s rate of decrease from information obtained during the algorithm’s execution.
Cooling schedules are almost always heuristic; they seek to balance moderateexecution time with simulated annealing’s dependence on asymptotic behavior.Strenski and Kirkpatrick (1991) presents an exact (non-heuristic) characterizationof finite-length annealing schedules. They consider extremely small problems thatrepresent features (local optima and smooth/hilly topologies), and solve for solutionprobabilities after a finite number of iterations to gain insights into some popularassumptions and intuition behind cooling schedules.