Data-driven meme selection

We explored the idea of memetic computation briefly in the previous article. We learned that Canonical Memetic Algorithms (CMA) need a human expert to select an appropriate meme, and that this reliance is a major drawback in our pursuit of Artificial General Intelligence (AGI). In this article, we will discuss how we can use data to select memes automatically.

As we saw in the previous article, the role of the human expert is to select the most suitable meme from a pool of memes. The expert makes the selection based on their knowledge, acquired both through their own experience and through that of other experts. Experience is essentially the data one gathers through observation. Consequently, by collecting data on the performance of different memes, we can automatically pick the best meme. We call this data-driven meme selection.

There are two common ways of doing this. One is sub-problem decomposition, and the other is reward-proportionate roulette wheel selection.

Sub-problem decomposition

In sub-problem decomposition, we begin by randomly choosing a meme and using it on an individual. Then we compute the difference between the fitness score of the individual before and after applying the meme. We call this difference a reward. We store this reward value along with the individual and the meme in a database.

As we apply this process to several individuals across multiple generations, we accumulate a substantial amount of data. Then, instead of randomly selecting a meme, we can select the most appropriate meme based on this data. To that end, when we select a meme to apply to an individual, we first need to find individuals in the database that are close to the individual in question. In other words, we need to find individuals in its neighborhood in the search space. We can do this using any appropriate distance measure, such as Euclidean distance.

Once we find the neighbors of the individual, we pick the neighbor with the highest reward value and apply the meme that was applied to that neighbor. The intuition here is that the meme that brought about the highest fitness improvement in the neighborhood is likely to have a similar effect on the individual in question. The number of neighbors we consider is a tunable parameter.
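To make this concrete, here is a minimal Python sketch of the selection step. The `database` list, the `record` and `select_meme` helpers, and the flat-list storage are illustrative choices, not part of the method itself:

```python
import numpy as np

# Each record stores an individual's genome, the meme applied to it, and the
# resulting reward (fitness after the meme minus fitness before).
database = []  # list of (genome, meme_id, reward) tuples

def record(genome, meme_id, fitness_before, fitness_after):
    database.append((genome, meme_id, fitness_after - fitness_before))

def select_meme(genome, k=5):
    """Return the meme that earned the highest reward among the k nearest
    stored individuals (Euclidean distance)."""
    dists = [np.linalg.norm(genome - g) for g, _, _ in database]
    neighbors = np.argsort(dists)[:k]                    # k closest records
    best = max(neighbors, key=lambda i: database[i][2])  # highest reward
    return database[best][1]
```

In a real implementation, a spatial index such as a k-d tree would keep the neighbor search fast as the database grows.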

Reward-proportionate roulette wheel selection

Reward-proportionate roulette wheel selection also makes use of the reward data. However, in contrast to sub-problem decomposition, here we store the reward data against the meme alone. Every time we apply a meme, we add the resulting reward to the meme's running total, so a meme's reward accumulates over time. Once we have enough data, we select a meme with a probability proportional to its accumulated reward: the higher a meme's reward, the higher the probability of selecting it.
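A minimal sketch of the wheel itself, assuming non-negative accumulated rewards (the meme names and values are made up):

```python
import random

# Accumulated reward per meme (illustrative values).
meme_rewards = {"meme_a": 4.2, "meme_b": 1.3, "meme_c": 2.5}

def roulette_select(rewards):
    """Select a meme with probability proportional to its accumulated reward."""
    pick = random.uniform(0, sum(rewards.values()))  # spin the wheel
    running = 0.0
    for meme, reward in rewards.items():
        running += reward  # each meme occupies a slice the size of its reward
        if pick <= running:
            return meme
    return meme  # floating-point edge case: fall back to the last meme
```

Python's `random.choices(list(rewards), weights=list(rewards.values()), k=1)[0]` achieves the same thing in one call.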

Thus, we engage in simultaneous learning and optimization during a run. As we progress, we not only get closer to the solution, but we also learn to do it better. This is more in line with the idea of AI.

Evolvability measure

As we discussed in the previous article, different combinations of memes and EA strategies, such as crossover and mutation strategies, can produce different outcomes. For example, we can perform mutation by randomly drawing from different probability distributions, and then apply a meme to the mutated individual in the next iteration. Both the meme and the nature of the mutation therefore influence the eventual fitness improvement. To find the right combination, we use the evolvability measure.

To keep this article simple, let me gloss over the mathematics behind the evolvability measure. But if you are interested in the detailed mathematics, please refer to section 3.2.1 of the book Memetic Computation. What we need to know is that for every combination of memes and EA strategies, we calculate the evolvability measure and apply the combination that produces the highest value.
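To make the procedure concrete without reproducing the book's formula, here is a toy sketch that scores each mutation-meme combination with a crude proxy, the average fitness gain over a few trial applications, and picks the best. The proxy, the toy fitness function, and all the names here are assumptions; they are not the actual evolvability measure:

```python
import itertools
import random

# Toy problem: maximize f(x) = -x^2 over real-valued individuals.
def fitness(x):
    return -x * x

# Two example mutation strategies drawing from different distributions,
# and two example memes (simple local searches).
mutations = {
    "gaussian": lambda x: x + random.gauss(0, 1.0),
    "uniform":  lambda x: x + random.uniform(-1.0, 1.0),
}
memes = {
    "small_step": lambda x: max(x - 0.1, x + 0.1, key=fitness),
    "large_step": lambda x: max(x - 1.0, x + 1.0, key=fitness),
}

def estimated_evolvability(mutate, meme, population, trials=20):
    """Crude stand-in for the evolvability measure: average fitness gain
    of mutate-then-meme over a few trial applications."""
    gains = []
    for _ in range(trials):
        parent = random.choice(population)
        child = meme(mutate(parent))
        gains.append(fitness(child) - fitness(parent))
    return sum(gains) / trials

population = [random.uniform(-5.0, 5.0) for _ in range(30)]
best = max(itertools.product(mutations, memes),
           key=lambda mm: estimated_evolvability(mutations[mm[0]],
                                                 memes[mm[1]], population))
print(best)  # e.g. ('gaussian', 'large_step')
```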

Meme complexes

Until now, we have only been focusing on applying one meme to an individual. In contrast, meme complexes, or memeplexes, apply a chain of memes to an individual: we first apply one meme, then another, and so on. We choose the first meme and each subsequent meme using weight values.

We first form a memetic network topology by linking memes. The chain follows the links; that is, after selecting a meme, we select the next meme from the pool of memes that have links to it. Every meme and every link between memes carries a weight value, and our choices are proportional to these weights. For example, we select the first meme with a probability proportional to its weight, and then choose the next meme with a probability proportional to the weight of the link from the first meme.
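A small sketch of this weighted chain construction, with made-up meme names, weights, and topology:

```python
import random

# Node weights (per meme) and link weights (per directed meme pair).
meme_weights = {"m1": 2.0, "m2": 1.0, "m3": 1.5}
link_weights = {("m1", "m2"): 1.0, ("m1", "m3"): 3.0, ("m2", "m3"): 0.5}

def weighted_choice(options):
    """Pick a key from a {name: weight} dict, proportionally to the weights."""
    return random.choices(list(options), weights=list(options.values()), k=1)[0]

def build_chain(length=3):
    """Sample a memeplex: first meme by node weight, then follow weighted links."""
    chain = [weighted_choice(meme_weights)]
    for _ in range(length - 1):
        outgoing = {dst: w for (src, dst), w in link_weights.items()
                    if src == chain[-1]}
        if not outgoing:  # dead end in the topology
            break
        chain.append(weighted_choice(outgoing))
    return chain
```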

We learn these weight values starting from an unbiased initialization. To calculate the weight of a meme, we accumulate the immediate reward upon applying the meme to an individual. To calculate the link weight, we calculate the reward after executing the whole chain of memes, divide it by the length of the chain, and accumulate this value over generations.
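Continuing the sketch above, the updates might look like this; the function name and the reward bookkeeping are illustrative:

```python
def update_weights(chain, immediate_rewards, chain_reward):
    """Accumulate node weights from per-meme rewards, and link weights from
    the whole-chain reward divided by the chain length."""
    for meme, r in zip(chain, immediate_rewards):
        meme_weights[meme] = meme_weights.get(meme, 0.0) + r
    share = chain_reward / len(chain)
    for src, dst in zip(chain, chain[1:]):  # consecutive meme pairs
        link_weights[(src, dst)] = link_weights.get((src, dst), 0.0) + share
```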

Multi-surrogacy

In certain cases, the fitness function of an EA can be expensive and time-consuming. For instance, in some engineering problems, we calculate the fitness of an individual by simulation, which can take several hours. In such cases, we replace the expensive fitness function with a surrogate: a cheaper fitness function that can predict the fitness of an individual. Since this involves prediction, we generally use Machine Learning (ML) or regression models.

We can initially run the expensive fitness function for a few generations to generate data, train the ML model on that data, use the model to predict the fitness of new individuals, select the most fit individuals according to the predictions, and evaluate their real fitness using the expensive fitness function. This way, we can greatly conserve resources.
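A compact sketch of one such cycle, using a random forest regressor from scikit-learn purely as an example surrogate; the toy `expensive_fitness` stands in for an hours-long simulation:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def expensive_fitness(x):
    # Stand-in for a costly simulation; higher is better.
    return -np.sum(x ** 2)

rng = np.random.default_rng(0)
X = rng.uniform(-5, 5, size=(50, 3))             # a few generations of individuals
y = np.array([expensive_fitness(x) for x in X])  # real (expensive) evaluations

surrogate = RandomForestRegressor().fit(X, y)    # train the cheap predictor

offspring = rng.uniform(-5, 5, size=(200, 3))    # new candidate individuals
predicted = surrogate.predict(offspring)
elite = offspring[np.argsort(predicted)[-5:]]    # most promising by prediction
real = [expensive_fitness(x) for x in elite]     # verify only the elite
```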

However, there can be many different ML models, and choosing a model among several others is akin to choosing one meme among several others. Instead of choosing one model, we can assign a weight value to each model and learn the weights over time so that the most suitable ones will automatically end up with higher weight values. We set the weight to be the inverse of the average generalization error over the dataset. So, the larger the error, the smaller the weight would be. As a result, models that perform better will eventually start exercising more influence on the predicted fitness score.
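A minimal sketch of this weighting scheme, assuming each model's average generalization error (for example, a cross-validated error) has already been measured; normalizing the weights so they sum to one is a convenience choice here:

```python
# Measured average generalization error per surrogate model (made-up values).
errors = {"random_forest": 0.8, "gaussian_process": 0.4, "linear": 2.0}

weights = {name: 1.0 / err for name, err in errors.items()}  # inverse error
total = sum(weights.values())
weights = {name: w / total for name, w in weights.items()}   # normalize

def ensemble_predict(predictions):
    """Combine per-model fitness predictions for one individual,
    weighted by each model's (normalized) inverse error."""
    return sum(weights[name] * p for name, p in predictions.items())

print(ensemble_predict({"random_forest": 1.2, "gaussian_process": 1.0,
                        "linear": 0.7}))
```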

Even though we have now adopted a data-driven approach to choosing a meme, in all the above cases we start our search for the solution from scratch. Although we learn as we proceed, this accumulated knowledge applies only to the problem at hand. However, this is not how humans work. We solve problems and then use that experience to solve subsequent problems better and faster. So, to realize the idea of a meme completely, we should be able to transfer our experience from one problem to another. This leads us to memetic automatons. In the next article, we will discuss what memetic automatons are and how they work.
