In evolutionary algorithms, noise means that the fitness function does not return the same value when the same individual is evaluated twice. It is a mainstay of games, but it also shows up in industrial processes and in things like neural nets. We have worked around it many times, usually by running several evaluations and averaging them, but this is not really the best way of dealing with it. Since the shape of the noise is not known in advance, in the paper presented at the ECTA 2014 conference we proposed a new method for dealing with it: using statistically sound comparisons that do not assume a particular distribution, namely the Wilcoxon test. The paper is entitled “Studying and Tackling Noisy Fitness in Evolutionary Design of Game Characters”, and here’s the abstract.
In most computer games as in life, the outcome of a match is uncertain due to several reasons: the characters or assets appear in different initial positions or the response of the player, even if programmed, is not deterministic; different matches will yield different scores. That is a problem when optimizing a game-playing engine: its fitness will be noisy, and if we use an evolutionary algorithm it will have to deal with it. This is not straightforward since there is an inherent uncertainty in the true value of the fitness of an individual, or rather whether one chromosome is better than another, thus making it preferable for selection. Several methods based on implicit or explicit average or changes in the selection of individuals for the next generation have been proposed in the past, but they involve a substantial redesign of the algorithm and the software used to solve the problem. In this paper we propose new methods based on incremental computation (memory-based) or fitness average or, additionally, using statistical tests to impose a partial order on the population; this partial order is considered to assign a fitness value to every individual which can be used straightforwardly in any selection function. Tests using several hard combinatorial optimization problems show that, despite an increased computation time with respect to the other methods, both memory-based methods have a higher success rate than implicit averaging methods that do not use memory; however, there is not a clear advantage in success rate or algorithmic terms of one method over the other.
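To make the partial-order idea a bit more concrete, here is a minimal sketch in Python. It is not the code from the paper, and several things in it are assumptions made for illustration: it uses the unpaired (rank-sum) variant of the Wilcoxon test with a normal approximation, it scores every individual as "significant wins minus significant losses", it assumes higher fitness is better, and the names `rank_sum_test`, `partial_order_fitness` and the `alpha` threshold are hypothetical.

```python
import math
from itertools import combinations


def rank_sum_test(a, b):
    """Wilcoxon rank-sum test of sample a against sample b.

    Returns (z, p): z > 0 means a tends to hold larger values than b,
    p is the two-sided p-value from the normal approximation.
    Assumption: no tie correction is applied, which makes the test
    slightly conservative when many values are repeated.
    Both samples must be non-empty.
    """
    n1, n2 = len(a), len(b)
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b],
                    key=lambda t: t[0])
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average.
        avg_rank = (i + 1 + j) / 2
        rank_sum_a += avg_rank * sum(1 for k in range(i, j)
                                     if pooled[k][1] == 0)
        i = j
    u = rank_sum_a - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


def partial_order_fitness(samples, alpha=0.05):
    """Turn noisy fitness samples into one score per individual.

    samples[i] holds the fitness values recorded so far for individual i.
    Each pair is compared; only a statistically significant difference
    changes the scores (+1 for the better one, -1 for the worse one).
    Pairs that cannot be told apart leave both scores untouched, which
    is what makes the resulting order partial.
    """
    scores = [0] * len(samples)
    for i, j in combinations(range(len(samples)), 2):
        z, p = rank_sum_test(samples[i], samples[j])
        if p < alpha:
            winner, loser = (i, j) if z > 0 else (j, i)
            scores[winner] += 1
            scores[loser] -= 1
    return scores
```

The returned scores are plain integers, so they can be handed to any selection operator (tournament, roulette wheel after shifting, and so on) without touching the rest of the algorithm. Note that comparing every pair costs a number of tests that grows quadratically with population size, which fits the increased computation time mentioned in the abstract.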