Evolutionary algorithms are heuristic search and optimization algorithms that imitate the natural evolutionary process. This article puts evolutionary algorithms into context and explains their basic functionality. To that end, we first examine why we often prefer heuristic algorithms over exact algorithms, and then look at how a basic evolutionary algorithm works.
Search and optimization problems involve finding an optimal solution within a search space. If the search space is small enough, regular search algorithms such as breadth-first and depth-first search will suffice. However, when the search space grows larger, such exact algorithms cannot find an optimal solution within an acceptable amount of time. In such cases, we use heuristic algorithms, which employ intelligent guesses to get close to the optimal solution in a large search space.
For example, let’s assume there are more than a million different routes from city A to city B via different cities. To find the fastest route between the two cities, an exact search algorithm will try to compute the time taken via each route. However, given that there are more than a million routes, this might not be a very practical way of solving the problem. Instead, a heuristic algorithm will employ an intelligent guess to find a near-optimal solution within an acceptable time.
A greedy algorithm is a good example of a heuristic algorithm. A greedy algorithm simply takes the fastest available road at each step until it reaches the destination. For example, from city A, the algorithm will pick the neighboring city it can reach fastest, and so on until it reaches city B. However, we cannot guarantee that this will find the fastest route between the two cities. For instance, let’s look at the example in Figure 1. From city A, the fastest road leads to city D. However, if you choose the road to city D, you may end up taking 7 hours longer to reach city B than you would if you chose the road to city C.
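As a concrete illustration, here is a minimal sketch of the greedy approach. The travel times below are hypothetical numbers chosen to reproduce the 7-hour gap described above, not the actual values from Figure 1.

```python
# Hypothetical road network: city -> {neighbor: travel time in hours}.
graph = {
    "A": {"C": 3, "D": 1},
    "C": {"B": 2},
    "D": {"B": 11},
}

def greedy_route(graph, start, goal):
    """Always take the fastest available next hop (no backtracking)."""
    route, total = [start], 0
    current = start
    while current != goal:
        nxt, cost = min(graph[current].items(), key=lambda kv: kv[1])
        route.append(nxt)
        total += cost
        current = nxt
    return route, total

# Greedy picks D first (1 hour), then is stuck with the 11-hour road:
print(greedy_route(graph, "A", "B"))  # (['A', 'D', 'B'], 12), vs. 5 hours via C
```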
Consequently, heuristic algorithms cannot guarantee the optimal solution to a complex problem. However, their strength is their speed: an exact algorithm that takes an unacceptable amount of time is as good as not solving the problem at all. Accordingly, we consider the loss of some accuracy an acceptable compromise.
Evolution in nature
Evolutionary algorithms are a type of heuristic algorithm that uses the concept of evolution found in nature to find near-optimal solutions to complex optimization problems. Let’s see how these algorithms work in general. But before that, let’s understand how evolution works in nature.
In nature, a population comprises several individuals belonging to a species. Different combinations of genes determine an individual’s characteristics. Some of these characteristics are beneficial while others are detrimental. Individuals with more beneficial characteristics survive and pass on their beneficial genes to the next generation. This way the beneficial genes become more dominant in a population. It is the natural environment that decides which genes are beneficial. In other words, nature selects the fittest genes.
As the natural environment changes, individuals may need new characteristics to survive. To produce new characteristics, you need new gene combinations. Nature produces new gene combinations through mutation and sexual reproduction. Mutation is any random ‘error’ that occurs in genes when cells multiply. Sexual reproduction is when genes from two individuals combine to produce new characteristics. Accordingly, mutation and combinations of genes from individuals produce variations in the population. Nature then selects the fittest variation. We call this whole process evolution.
For instance, a genetic mutation may have made the neck of some giraffes in a population longer. In an environment where leaves are found in very tall branches, this group of giraffes would have had a better chance of survival, resulting in their mutated genes becoming dominant in the population. Thus, giraffes evolved to have longer necks.
Now, let’s see how we can use this knowledge to develop an evolutionary algorithm. We initially start with a random set of routes from city A to city B. We call each of these random routes an individual. These individuals together form a population. In our example, we can consider every city in a route as a gene. For instance, a random individual in the population can have the following genes: A-E-H-G-B. As we can see, a sequence of genes makes up a candidate solution to the problem. Our population will have several such genetic sequences, and our goal is to find the optimal sequence.
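For illustration, one way to encode such individuals and initialize a random population might look like this; the list of intermediate cities is a hypothetical assumption:

```python
import random

# Hypothetical intermediate cities between A and B.
cities = ["C", "D", "E", "F", "G", "H"]

def random_individual(rng):
    """A random route: A, some random intermediate cities, then B."""
    vias = rng.sample(cities, k=rng.randint(1, len(cities)))
    return ["A"] + vias + ["B"]

rng = random.Random(42)
population = [random_individual(rng) for _ in range(6)]
```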
To that end, we perform what we see in nature. First, we try to produce new individuals, and then we select the fittest individuals. We can use both mutation and sexual reproduction to produce new individuals. In evolutionary algorithms, we perform sexual reproduction through crossovers. But before performing crossovers, we decide which individuals should undergo this process. We make this decision based on how fit these individuals are: we consider an individual fitter if it represents a better solution. In our case, a better solution is a faster route, so we can decide the fitness of individuals by computing the total time of the journey. We call the function that computes this fitness score the fitness function.
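A minimal sketch of such a fitness function, assuming hypothetical travel times for the legs of the A-E-H-G-B route (pairs with no direct road get a large penalty so infeasible routes score poorly):

```python
# Hypothetical travel times (hours) for adjacent city pairs.
times = {("A", "E"): 2, ("E", "H"): 3, ("H", "G"): 1, ("G", "B"): 4,
         ("A", "C"): 3, ("C", "B"): 2}

def journey_time(route, times, penalty=100):
    """Total travel time of a route, summed leg by leg."""
    return sum(times.get(leg, penalty) for leg in zip(route, route[1:]))

def fitness(route, times):
    """Fitter means faster, so invert the journey time."""
    return 1.0 / journey_time(route, times)

print(journey_time(["A", "E", "H", "G", "B"], times))  # 2 + 3 + 1 + 4 = 10
```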
Once we find the fitness score of every individual, it’s time to select the parent individuals who would engage in crossovers from the population. Intuitively, selecting the most fit individuals should result in offspring with high fitness. However, as we saw with the greedy algorithm example, we may end up with a suboptimal solution if we get too greedy.
To get a better understanding, let’s look at these two individuals from our example: A-E-F-K-B and A-D-H-I-J-B. Let’s assume individual 1 takes 10 hours and individual 2 takes only 8 hours. We may be tempted to pick individual 2 over individual 1. However, note that the F-K-B genes of individual 1 take only 2 hours. These genes, if they crossover to another individual, may produce a fitter individual. Hence, by rejecting this individual, we are losing out on good genes. This can result in a loss of genetic diversity in the population, and we may end up with a local optimum.
At the same time, we also can’t afford to select individuals randomly, since that essentially means we are performing a random search. So, we need effective parent selection strategies that balance conserving diversity against moving toward the solution faster. Some of the popular strategies are tournament selection, roulette-wheel selection, and random selection. Roulette-wheel selection, for instance, randomly selects individuals after assigning each individual a probability proportional to its fitness. It thus ensures fitter individuals have a higher probability of getting selected, while not rejecting less fit individuals entirely.
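Roulette-wheel selection can be sketched in a few lines; the two individuals and their fitness scores below are hypothetical:

```python
import random

def roulette_select(population, fitnesses, k, rng):
    """Draw k parents, each with probability proportional to its fitness."""
    return rng.choices(population, weights=fitnesses, k=k)

rng = random.Random(0)
# Hypothetical scores: "fast" is nine times fitter than "slow".
parents = roulette_select(["fast", "slow"], [9.0, 1.0], 1000, rng)
# The fitter individual dominates the draws, but the less fit one
# still appears occasionally, preserving diversity.
```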
Crossovers and mutations
Once we select the parents, we perform crossovers. As the name implies, during a crossover, paired individuals exchange parts of their genetic sequences. Several different crossover strategies exist, such as one-point crossover and uniform crossover. In one-point crossover, each of the two individuals splits at a crossover point, and the two exchange parts to produce offspring.
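Here is a sketch of one-point crossover on the two route individuals from the earlier example. Since our routes have variable length, this sketch cuts each parent at its own random point; a fixed-length encoding would use the same index for both.

```python
import random

def one_point_crossover(p1, p2, rng):
    """Cut each parent at a random interior point and swap the tails.
    Cutting at an interior point keeps A and B at the endpoints."""
    cut1 = rng.randint(1, len(p1) - 1)
    cut2 = rng.randint(1, len(p2) - 1)
    return p1[:cut1] + p2[cut2:], p2[:cut2] + p1[cut1:]

rng = random.Random(1)
c1, c2 = one_point_crossover(["A", "E", "F", "K", "B"],
                             ["A", "D", "H", "I", "J", "B"], rng)
```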
Following crossovers, we mutate individuals by randomly modifying a gene of some of the individuals. This increases the diversity of the population and allows us to produce new solutions. However, too many mutations can once again veer us in the direction of a random search, so we need to perform mutations in moderation.
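A minimal mutation operator along these lines might look as follows; the replacement city list and the mutation rate are illustrative assumptions:

```python
import random

def mutate(route, cities, rate, rng):
    """With probability `rate`, replace one intermediate city at random."""
    route = list(route)
    if len(route) > 2 and rng.random() < rate:
        i = rng.randrange(1, len(route) - 1)  # never touch the endpoints A and B
        route[i] = rng.choice(cities)
    return route

rng = random.Random(3)
# rate=1.0 forces a mutation here, purely for demonstration; in practice
# a low rate (say 0.1) keeps the search from degenerating into random search.
mutant = mutate(["A", "E", "F", "B"], ["C", "D"], 1.0, rng)
```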
Subsequently, after we produce new individuals, we need to decide who will survive into the next generation. We do this to make sure the number of individuals across generations remains constant. Several strategies exist. One is to allow all offspring (new individuals) to pass to the next generation. Another is to select the fittest individuals from the entire population (new and old individuals), which means some of the offspring may die. Once we complete this, we iterate over the whole process again. We continue to iterate until either the average fitness of the population stops improving significantly or we reach a specified number of iterations.
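The second strategy, keeping only the fittest individuals from the combined old and new populations, can be sketched as follows; the routes and hour values are hypothetical:

```python
def next_generation(old, offspring, time_fn, size):
    """Elitist survivor selection: keep the `size` fittest individuals
    from the combined old and offspring populations."""
    combined = old + offspring
    combined.sort(key=time_fn)  # fastest (fittest) routes first
    return combined[:size]

# Hypothetical journey times for four candidate routes.
hours = {"A-C-B": 5, "A-D-B": 12, "A-E-B": 9, "A-D-E-B": 10}
survivors = next_generation(["A-C-B", "A-D-B"], ["A-E-B", "A-D-E-B"],
                            hours.get, 2)
# The slowest old route and the slowest offspring both die.
```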
In evolutionary algorithms, we start from many different random places in the search space and slowly evolve toward the solution in the search space. With every iteration, what begins as a random search starts to gain a sense of direction before finally converging close to the optimal solution.
For better understanding, we summarize a generic evolutionary algorithm using the pseudocode below.
Initialize the population p_g (g denotes the generation; g = 0 at the start)
Repeat
    For each individual i in p_g
        Calculate the fitness f_i
    Select parents q from p_g based on the fitness scores
    Perform crossovers and mutations on q to produce the offspring population o
    Select who will make it to the next generation p_g+1 from o, q, and (p_g - q)
    g = g + 1
Until the termination condition is met
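Putting the pieces together, the pseudocode above can be turned into a runnable sketch. The road network, travel times, and parameter values below are illustrative assumptions, not part of the original example:

```python
import random

# Hypothetical road network: (city, city) -> hours.
times = {
    ("A", "C"): 3, ("C", "B"): 2,
    ("A", "D"): 1, ("D", "B"): 11,
    ("A", "E"): 2, ("E", "B"): 7, ("E", "H"): 3, ("H", "B"): 6,
    ("C", "E"): 1, ("D", "E"): 2,
}
cities = ["C", "D", "E", "H"]

def journey_time(route, penalty=100):
    """Total hours; legs with no direct road incur a large penalty."""
    return sum(times.get(leg, penalty) for leg in zip(route, route[1:]))

def random_route(rng):
    vias = rng.sample(cities, k=rng.randint(1, len(cities)))
    return ["A"] + vias + ["B"]

def crossover(p1, p2, rng):
    c1, c2 = rng.randint(1, len(p1) - 1), rng.randint(1, len(p2) - 1)
    return p1[:c1] + p2[c2:], p2[:c2] + p1[c1:]

def mutate(route, rng, rate=0.2):
    route = list(route)
    if len(route) > 2 and rng.random() < rate:
        route[rng.randrange(1, len(route) - 1)] = rng.choice(cities)
    return route

def evolve(generations=50, size=20, seed=0):
    rng = random.Random(seed)
    population = [random_route(rng) for _ in range(size)]
    for _ in range(generations):
        # Roulette-wheel parent selection: weight by inverted journey time.
        weights = [1.0 / journey_time(r) for r in population]
        offspring = []
        while len(offspring) < size:
            p1, p2 = rng.choices(population, weights=weights, k=2)
            offspring.extend(mutate(c, rng) for c in crossover(p1, p2, rng))
        # Elitist survivor selection over old and new individuals combined.
        population = sorted(population + offspring, key=journey_time)[:size]
    return population[0]

best = evolve()
print(best, journey_time(best))
```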
In this article, we looked at the relevance of heuristic algorithms, as exact algorithms cannot solve complex problems within an acceptable time. Then, we briefly discussed how evolution works in nature before examining how evolutionary algorithms work. Evolutionary algorithms start with a random set of individuals, select individuals to become parents via strategies such as roulette-wheel selection and tournament selection, perform crossovers and mutations to produce offspring, and select the individuals who will make it to the next generation. This process is iterated until the algorithm meets a termination criterion. Although evolutionary algorithms are capable optimizers, they do have their limitations. Let’s see how we address these limitations in subsequent articles.