The TwoArmed Bandit Problem
The tradeoff between exploration and exploitation can be instructively modeled in a simple scenario the Two-Armed Bandit problem. This problem has been studied extensively in the context of statistical decision theory and adaptive control e.g., see Bellman 1961 . Holland 1975 used it as an as a mathematical model of how a GA allocates samples to schemas. The scenario is as follows. A gambler is given N coins with which to play a slot machine having two arms. A conventional slot machine is...
Sketch of a Solution
The following is a sketch of Holland's solution to the Two-Armed Bandit problem. For the sake of clarity and brevity I make a number of simplifications and leave out some details that make the solution more rigorous and general. The full solution involves mathematical subtleties that go beyond the scope of this book. Interested readers should consult chapter 5 of Holland 1975, and should also see the corrections in chapter 10 of the second edition. Let A1 be the arm with higher average payoff...
Predicting Dynamical Systems
Norman Packard 1990 has developed a form of the GA to address this problem and has applied his method to several data analysis and prediction problems. The general problem can be stated as follows A series of observations from some process e.g., a physical system or a formal dynamical system take the form of a set of pairs, where , are independent variables and y is a dependent variable 1didN . For example, in a weather prediction task, the independent variables might be some set of features of...
Thought Exercises
How many Prisoner's Dilemma strategies with a memory of three games are there that are behaviorally equivalent to TIT FOR TAT What fraction is this of the total number of strategies with a memory of three games What is the total payoff after 10 games of TIT FOR TAT playing against a a strategy that always defects b a strategy that always cooperates c ANTI-TIT-FOR-TAT, a strategy that starts out by defecting and always does the opposite of what its opponent did on the last move d What is the...
Computer Exercises
Asterisks indicate more difficult, longer-term projects. 1. Implement a simple GA with fitness-proportionate selection, roulettewheel sampling, population size 100, single-point crossover rate pc 0.7, and bitwise mutation rate pm 0.001. Try it on the following fitness function f x number of ones in x, where x is a chromosome of length 20. Perform 20 runs, and measure the average generation at which the string of all ones is discovered. Perform the same experiment with crossover turned off...
Computer Exercises Ngq
Implement SUS and use it on the fitness function described in computer exercise 1 in chapter 1. How does this GA differ in behavior from the original one with roulette-wheel selection Measure the spread the range of possible actual number of offspring, given an expected number of offspring of both sampling methods. Implement a GA with inversion and test it on Royal Road function R1. Is the performance improved Design a fitness function on which you think inversion will be helpful, and compare...
Grammatical Encoding
The method of grammatical encoding can be illustrated by the work of Hiroaki Kitano 1990 , who points out that direct-encoding approachs become increasingly difficult to use as the size of the desired network increases. As the network's size grows, the size of the required chromosome increases quickly, which leads to problems both in performance how high a fitness can be obtained and in efficiency how long it takes to obtain high fitness . In addition, since direct-encoding methods explicitly...
Limitations of Static Schema Analysis
A number of recent papers have questioned the relevance of schema analysis to the understanding of real GAs e.g., Grefenstette 1993 Mason 1993 Peck and Dhawan 1993 . Here I will focus on Grefenstette's critique of the Static Building Block Hypothesis. The following qualitative formulation of the Schema Theorem and the Building Block Hypothesis should now be familiar to the reader The simple GA increases the number of instances of low-order, short-defininglength, high-observed-fitness schemas...
Evolving Crossover Hot Spots
A different approach, also inspired by nature, was taken by Schaffer and Morishima 1987 . Their idea was to evolve not the order of bits in the string but rather the positions at which crossover was allowed to occur crossover hot spots . Attached to each candidate solution in the population was a second string a crossover template that had a 1 at each locus at which crossover was to take place and a 0 at each locus at which crossover was not to take place. For example, 10011111 00010010 with...
Rank Selection
Rank selection is an alternative method whose purpose is also to prevent too-quick convergence. In the version proposed by Baker 1985 , the individuals in the population are ranked according to fitness, and the expected value of each individual depends on its rank rather than on its absolute fitness. There is no need to scale fitnesses in this case, since absolute differences in fitness are obscured. This discarding of absolute fitness information can have advantages using absolute fitness can...
Nextascent hill climbing NAHC
Choose a string at random. Call this string current-hilltop. For ifrom 1 to l where lis the length of the string , flip bit i if this results in a fitness increase, keep the new string, otherwise flip bit i back. As soon as a fitness increase is found, set current-hilltop to that increased-fitness string without evaluating any more bit flips of the original string. Go to step 2 with the new current-hilltop, but continue mutating the new string starting immediately after the bit position at...
Sigma Scaling
To address such problems, GA researchers have experimented with several scaling methods methods for mapping raw fitness values to expected values so as to make the GA less susceptible to premature convergence. One example is sigma scaling Forrest 1985 it was called sigma truncation in Goldberg 1989a , which keeps the selection pressure i.e., the degree to which highly fit individuals are allowed many offspring relatively constant over the course of the run rather than depending on the fitness...
Genetic Algorithms And Traditional Search Methods
In the preceding sections I used the word search to describe what GAs do. It is important at this point to contrast this meaning of search with its other meanings in computer science. There are at least three overlapping meanings of search Search for stored data Here the problem is to efficiently retrieve information stored in computer memory. Suppose you have a large database of names and addresses stored in some ordered way. What is the best way to search for the record corresponding to a...
Evolutionary Reinforcement Learning
A second computational demonstration of the Baldwin effect was given by David Ackley and Michael Littman 1992 . Their primary goal was to incorporate reinforcement learning an unsupervised learning method into an evolutionary framework and to see whether evolution could produce individuals that not only behaved appropriately but also could correctly evaluate the situations they encountered as beneficial or dangerous for future survival. In Ackley and Littman's Evolutionary Reinforcement...
FitnessProportionate Selection with Roulette Wheel and Stochastic Universal
Holland's original GA used fitness-proportionate selection, in which the expected value of an individual i.e., the expected number of times an individual will be selected to reproduce is that individual's fitness divided by the average fitness of the population. The most common method for implementing this is roulette wheel sampling, described in chapter 1 each individual is assigned a slice of a circular roulette wheel, the size of the slice being proportional to the individual's fitness. The...
Overview Ksz
As genetic algorithms become more widely used for practical problem solving and for scientific modeling, increasing emphasis is placed on understanding their theoretical foundations. Some major questions in this area are the following What laws describe the macroscopic behavior of GAs In particular, what predictions can be made about the change in fitness over time and about the dynamics of population structures in a particular GA How do the low-level operators selection, crossover, mutation...
Thought Exercises 1
Using the function set AND, OR, NOT and the terminal set s s0, s 1 , construct a parse tree or Lisp expression that encodes the r 1 majority-rule CA, where s1 denotes the state of the neighborhood site i sites away from the central cell with indicating distance to the left and indicating distance to the right . AND and OR each take two arguments, and NOT takes one argument. Figure 2.26 Results of Chalmers's experiments testing the effect of diversity of environment on generalization ability....
Boltzmann Selection
Sigma scaling keeps the selection pressure more constant over a run. But often different amounts of selection pressure are needed at different times in a run for example, early on it might be good to be liberal, allowing less fit individuals to reproduce at close to the rate of fitter individuals, and having selection occur slowly while maintaining a lot of variation in the population. Later it might be good to have selection be stronger in order to strongly emphasize highly fit individuals,...





