Wednesday, May 13, 2009

Revision Notes: Evolutionary Computation

Introduction

Principles of Evolutionary Computing: · Evolution – Problem solving technique · Environment – The problem that is to be solved · Individual – A candidate solution · Fitness – Degree of quality Applications of Evolutionary Computing: · Optimization (Travelling Salesman · Modelling (Stock market prediction) · Simulation (Ecology) Evolutionary Algorithm Fitness Landscape Free Optimization Problem Genetic Programming Genotype Learning Classifier System Local Search Mutation Memetic Evolution No Free Lunch Theorem This addresses the efficiency of problem-solving procedures with respects to the specifics of the problem domain stating that on average two algorithms will perform well over all possible problem domains. Therefore a problem solving technique which incorporates knowledge of a specific problem domain may perform much better than other generic problem solvers, but with this mind, the technique may perform much worse on other problems. Objective Function Parameter Control Although static setting of EA parameters is simple, both the candidate solutions evaluated and the final solution is fairly stochastic in nature and so allowing parameters to change throughout the EA run allows suitable self-tuning to the current fitness landscape. Deterministic Parameter is altered independent of the performance of the EA and often as a function of time or epochs. There is no feedback from EA itself and so there lies the possibility of driving the EA away from an optimal solution. Mutation Control: Is the current generation, is the total number of generations allowed Adaptive Changes in parameters are often a function of both the current and historic performance of the EA i.e. poor performance, high mutation/crossover rates. Self-Adaptive Parameter is encoded into candidate chromosomes and thus is subject to the recombination operators like the solutions they encode.

Evolutionary computation uses principles from biological evolution which occurs by natural selection. Algorithms in this field maintain a population of candidate solutions which are iteratively refined by recombining and mutating existing solutions. By tending to select individuals which maintain a high degree of fitness, optimal characteristics will develop
Individuals can only populate if they survive to adulthood The fitness of an individual is a function of its environment Survival is probabilistic and not guaranteed

Genetic Algorithm

Genotype

The genotype encodes an internal representation of the candidate solution. Phenotypic mapping is required in order to produce a candidate solution. A suitable genotypic representation allows the encoding of all possible candidate solutions although it is possible to remove unfeasible solutions by careful design. Additional attention must be paid with regards to how the mutation and recombination operators interact with the genotype. Poor genotypic representation can lead to phenotypic mapping inefficiencies, unrestricted search space representation and the allowance of invalid candidates to exist. It is therefore important for significant study of the problem domain to take place.




Binary - Useful for representing natural binary or boolean (descision) functions but also for high-level data structures although complex mapping maybe required. Positions along the chromosome may have certain significance with regards to how they encode the phenotype and so it's recommended specific schemes like Gray Coding are employed to minimise the hamming distance.

Integer - Useful for reducing genotype-phenotype abstraction and also simplifying representation for enumerated types (e.g. ordinal sets). Genetic operators for recombination and mutation must be defined which maybe more complex than simply splicing sections of the chromosome together.

Real - Useful for representing continuous values such as vectors. One should respect that computers can only represent a finite range of real values and so not all possible candidate solutions maybe represented. As with integer representations, genetic operators specifically designed for dealing with real numbers are required.

Phenotype

For the purposes of discussion, the specifics of the phenotype don't matter, all that should be known is that the phenotype exists as an 'uncompressed' version of the genotype to which the environment directly interacts with. In other words it is the materialisation of a candidate solution.

Recombination Operators

The purpose of the recombination operators is to produce new offspring with the secondary aim of taking fitter characteristics from two or more parents to produce a child which outperforms them. This process is generally quite probablistic and greatly dependant on the method of population selection employed but has a tendancy to inhibit the reproductive capacity of sub-optimal candidates increasing the population fitness over several generations. Quite simply a child is created by merging the chromosomes of contributing parents using several techniques (although a specific implementation may require adaptions of these) :



N-Point Crossover - N partitions are located along the chromsome of each parent and then regions between partitions are swapped. To maintain chromosome length, the partition points are at the same location of each contributing chromosome.

Cut and Splice - This is a special case of N-Point Crossover where N = 1 and the partion points are not equal across the parental chromosomes. This method may not work for all implementations since it alters the length of the chromosome possibly rendering it invalid.

Uniform Crossover - Rather than swapping regions of the chromosome like the previous two methods, the uniform crossover swaps each position along the chromosome with probability p.

Cut and Splice - This is a special case of N-Point Crossover where N = 1 and the partion points are not equal across the parental chromosomes. This method may not work for all implementations since it alters the length of the chromosome possibly rendering it invalid.
Population Selection

Evolutionary Programming

Genetic Programming

Hybrid Algorithms I: Ant Colony Optimization
Hybrid Algorithms II: Swarm Intelligence

Motivated by the behaviour of insects, fish and birds where each individual dynamically optimizes its position within the swarm and the environment. Although the natural occurrence of swarm intelligence use positions and velocities rooted in two dimensional space, for practical usage they can represent any n-dimensional parameter space. Bi-directional communication between each individual and the swarm as a whole, allows the propagation of optimal solutions (positions). Candidate solutions are comprised of the following: · individuals current position and velocity · best solution it has seen so far by the individual · best solution in the individual’s neighbourhood · best solution in the entire swarm It should be noted that a random component helps the exploration of solution space remains stochastic. for k = 1 to number of iterations to dofor I = 1 to number of particles n dofor J = 1 to number of dimensions m doR1 = uniform random numberR2 = uniform random numberV[I][J] = w*V[I][J]+ C1*R1*(X_lbest[I][J] - X[I][J])+ C2*R2*(X_gbest[J] - X[I][J])X[I][J] = X[I][J] + V[I][J]enddoenddoenddo Case Study: Boids Boid was an application used to simulate flocking behaviour in birds which comprised of three basic rules for optimization: · Flock Centralization – Each Boid seeks to position itself in the middle of neighbouring Boids. · Collision Avoidance – Each Boid seeks to distance its self from all other Boids. · Velocity Consistency – Each Boid seeks to match the velocities of neighbouring Boids. Further advanced rules were also proposed to address the following conditions: · Wind · Positional Affinity · Velocity Constraints · Search Space Bounding · Perching · Anti-Flocking

Hybrid Algorithms III: Learning Classifier Systems

Zeroth Level Classifier
XCS Classifier

Wednesday, February 4, 2009

Snowdays

Those who live on the south of the England, will no doubt be aware of the abundance of white stuff which deposited itself at the begining of this week.

I thought I'd share with you a few fantastic photos taken by my good friend Phil around the University of Reading.




Sudoku Genetic Algorithm: Introduction

What Is Sudoku?
Sudoku is a recently introduced, Japanese number puzzle where the objective is to fill a 9x9 grid so that each row, column, and 3x3 sub grid, contains numbers 1 through 9. The grids are usually partially filled and so it's the job of the solver to complete the remaining empty cells, without violating these. Sudoku problems range from easy to hard, where easy grids have more than one possible solution, and hard, where the grids have only one unique solution.


Sudoku puzzle with solution numbers marked in red. Source: Wikipedia

Why Use A Genetic Algorithm?

...because curiosity killed the cat, that's why!

Genetic Algorithms to which I'll now abbreviate to GA, is an interesting means to solving a problem. Though in this domain, they certainly don't provide a performance increase, they're often used in engineering to solve problems to which the solution space is so vast, there exists no algorithmic approach and Monte Carlo methods simply skim the surface of all possible solutions.

Gross Over-Simplification:
We maintain a population of ideally diverse solutions, to which each individuals fitness (effectiveness of solving a given problem) is evaluated. The highest ranking individuals are chosen to seed the next generation, with low ranking individuals discarded. This process repeats until some given threshold has been breached. This process effectively refines good solutions in an attempt to increase their fitness, so much so that they solve the problem. Note that this is a Gross Over-Simplification and much work goes into the degree of seeding, mutating solutions and even the evaluation function... GAs are not a one-stop-shop to solve world hunger.

In general, we use a GA to solve Sudoku puzzles because we can and it provides a more holistic approach to solving a problem.

In this series of posts, we'll explore the various aspects of implementing a Sudoku GA including the pitfalls and misconceptions of GAs in general.