Applications of Genetic Algorithms
A Genetic Algorithm may find a good enough answer for a variety of search problems that are too large to be solved in a reasonable time. The algorithm does this by generating candidate solutions, which an application evaluates and scores for acceptability. This is called determining their fitness. The GA crosses and mutates the fittest solutions in an attempt to better them and so builds a population of the best answers to the problem.
Here are three such applications.
The Travelling Salesman
The Environment
This is the simplest of all ordering, or combinatorial, problems. A salesman must visit some cities in a journey, or tour, of the shortest distance. This is similar to commercial software that finds the shortest or fastest route between two locations. There are many standard problems in the literature: those in Karg and Thompson [1964] are often investigated, such as their Ten City Problem. More ambitious problems contain 33, 42 and 57 cities in the United States.
Another good test is a lattice, reported in Suh and Van Gucht [1987]. This can have any number of cities forming a rectangle, and the rough length of the shortest tour is easy to work out. If either side of the rectangle has an even number of cities, the shortest length is the number of cities in the lattice multiplied by the distance between them. A tour can always be constructed. The 25-city problem is not so easy to calculate, although the route puts an easily constructed upper limit on the shortest distance.
The Problem
The GA must produce a list of cities such that each city appears once and once only. This is an important constraint, and if it is violated, the list is illegal. Desirable constraints are rarely considered in this problem, but may include the time of the journey, priorities for cities and so on. The time may not be solely dependent on the length for many reasons: if public transport is used; if delays are known to occur in certain cities, such as at rush hours; etc.
The Ten City Problem has a search space of about 180,000 permutations from a fixed city of departure. Tours that differ only in their direction are treated as the same solution, although any extra constraints may discriminate between them. An enumerative search on even a modest computer should find a solution to this one, and it is more often used as a test of techniques. The 25-City Lattice however has roughly 3×1023 possible tours. More sophistication is needed here.
Web Search Problems
Search, for text and increasingly for image or video, on Google and other search engines is a problem that a GA could tackle. Finding pages that match words or phrases (collectively keywords) is not an issue. Sifting quality pages from the morass of possible candidates is. Too often directories, shopping sites and plain old link farms occupy the pole positions. We want well-written, solid information.
This is a fuzzy problem that only a human can judge, at the moment. So the job of the algorithm is to produce candidate strings of keywords for input to the search engine. The human guinea-pig then scores the strings based on the relevance of the resulting pages. This fitness function for the GA is the subjective opinion of the searcher rather than a hard-coded routine. (We can see how easy it is to write such an automatic function for the Travelling Salesman, for instance.)
An interesting corollary of this search is the teasing out of keywords to place in a website to boost its likelihood of being found, especially by the right people. The heuristics that the engines use are closely guarded secrets but they are just that: hard-coded heuristics, and the GA can at least explore the keyword part of them. It can’t do anything about other parts such as page rank, back-links and so on.
Searching for keywords is an instance of what I term a manual genetic algorithm. In such a system the user defines the variables that constitute the problem and then responds to each solution the GA tries. So the variables here would be sets of keywords and the solutions, combinations of them.
The toolkit as it stands has a menu entry for defining integer variables. Another one is needed for… the temptation is to call them strings but that’ll get confusing with the algorithm’s own internal strings, so let’s call them phrases.