Differential Evolution

Differential Evolution (DE) [3] is an Evolutionary Algorithm (EA) originally designed for solving optimization problems over continuous domains. It has a simple implementation yet a great problem-solving quality, which makes it one of the most popular population-based algorithms, with several successful applications reported.

From its original conception, DE was designed to fulfill some requirements that have made it particularly useful:

Ability to handle non-differentiable, nonlinear, and multimodal cost functions.
Parallelizability to cope with computationally intensive cost functions.
Ease of use: few control variables to steer the minimization. These variables should also be robust and easy to choose.
Good convergence properties: consistent convergence to the global minimum in consecutive independent trials.

Those interested in more detail can refer to the book by Price et al. [4].

Overview

The basic structure of a DE algorithm is represented below, of which the main operators and their respective control parameters will be described next.

general_de

Initialization

The algorithm starts by initializing a population based on a user-specified number of individuals \(N\) and the boundaries of each decision variable of the problem. In pymoo, the population size is parsed to algorithms when initialized in the argument pop_size, whereas boundaries are parsed when defining the problems. Each individual corresponds to a vector of optimization variables. A choice of \(N\) between 5 and 10 times the number of decision variables might be a good start.

How the individuals are sampled by pymoo algorithms is defined by the sampling argument parsed to the algorithms. By default pymoode adopts the Latin Hypercube Sampling implemented in pymoo. Random sampling is also usual in DE algorithms.

from pymoode.algorithms import DE
from pymoo.operators.sampling.lhs import LHS

algorithm = DE(pop_size=20, sampling=LHS())

Fitness assignemnt

Individuals are assigned a fitness value, based on their corresponding objective function values and possibly constraint values. Originally DE has no rule for constraint handling, which was later the focus of several articles. In pymoo, by default, fitness assigment considers that every feasible solution dominates infeasible ones and infeasible solutions are sorted by overall constraint violation.

Notice that the fitness assignment stage is performed by the algorithm’s survival operator in most pymoo algorithms. Until pymoode version 0.2.4, one could not choose the survival operator of single-objective DE and this stage would necessarily occur using the criteria described above.

Iterations

The population then iterates through successive generations until some stopping criteria are met. In each of these iterations, new trial vectors are produced by the algorithm’s mating operator from the class VariantDE using operations usually denoted in DE literature as mutation and crossover. A survival operator is necessary to truncate the population into its original size using criteria which usually differ between algorithms. Stopping criteria are usually based on improvements in the objective function and the number of generations.

Details on the survival mechanisms of each algorithm can be found in the Survival section.

Nomenclature remark

Notice that the operation defined in DE as mutation differs from the usual mechanisms of pymoo’s mutation operators. For compatibility purposes, one might add some pymoo genetic mutation operator to DE by includind the genetic_mutation argument when instantiating the algorithm. It will occurr right after the crossover operation. Possibly a repair operator might be also included which occurs after the genetic_mutation.

Variants

Different variants are proposed in the literature for DE operations. They are usually described as DE/x/y/z. In which x corresponds to the parent selection scheme, y to the number of difference vectors, and z to the crossover scheme.

In pymoode, the variant used is parsed as a string when instantiating the algorithms to the variant argument. By default, the usual ‘DE/rand/1/bin’ variant is adopted.

More details about the variants are described in the Mutation and Crossover.

algorithm = DE(pop_size=20, variant="DE/rand/1/bin")

Mutation

Mutation in DE is an operation that produces mutant vectors \(v_{i}\) to take part in the crossover stage with each parent of the original population \(x_i\).

Different possibilities of how to sample vectors from the original population to perform this operation and how many difference vectors are created exist in the literature, of which the usual DE/rand/1 is probably the most popular.

\[\begin{equation} v_{i} = x_{r1} + F (x_{r2} - x_{r3}) \end{equation}\]

In which F is a scalar usually denoted scale factor or mutation parameter.

In pymoode, due its nature, DE mutation is implemented as a pymoo Crossover operator, performed by instances of the class pymoode.operators.dex.DEM. The parent selection is performed by pymoode.operators.des.DES instances.

Mutation variant

The parent selection schemes available are:

‘ranked’
‘rand’
‘best’
‘current-to-best’
‘current-to-best’
‘current-to-rand’
‘rand-to-best’

Variants that include best should be used only in single-objective DE, as the definition of the best solution in a population compromises diversity in multi-objective optimization. The variants DE/best/n strongly reinforce elitism, and some operator to help create diversity of solutions might be helpful. For instance, parsing a polynomial mutation operator in the genetic_mutation argument can improve performance.

from pymoo.operators.mutation.pm import PM

algorithm = DE(pop_size=20, variant="DE/best/1/bin", genetic_mutation=PM())

The variants rand and ranked start sampling parents in a similar manner. However, in ranked the parents selected are sorted by dominance criterion such that the one with best berformance is the base vector, and the worst performance is the origin of the difference vector.

Any number of difference vectors might be used, although 1 or 2 are more frequently used.

The scale factor

The diversity produced by the mutation is governed by the parameter \(F\). To reinforce exploration, use higher values; whereas for exploitation, use lower values.

In pymoode one might parse the argument F as a tuple when instantiating operators. In this case, random uniform distribution dither is used. It means that to each mutant vector \(v_i\) a value of \(F\) is sampled from a random uniform distribution with limits defined by the tuple.

Although defined to be in the range (0, 2], usually one would sample F in the range [0, 1]. For single-objective DE, some authors recommend avoiding too low values (< 0.2), which would reduce diversity. However in case of multi-objective optimization the diversity is usually naturally preserved by the conflicting nature of objectives and smaller values might be adopted.

algorithm = DE(pop_size=20, F=(0.3, 0.7))

Jitter

Jitter is an operation that adds rotation to difference vectors, controled by the parameter gamma. It is defined in the range [0, 1], although one would hardly choose values greater than 1e-1.

Each component of difference vectors is multiplied by the following equation.

\[\begin{equation} 1 + \gamma (\text{rand}[0, 1] - 0.5) \end{equation}\]

Jitter can be useful to increase diversity when small populations are adopted. But using small values such as 1e-4 or 1e-3 is often adopted.

algorithm = DE(pop_size=20, gamma=1e-3)

DE Repair

From the mutation equation, one might notice that mutant vectors produced might lie outside problem boundaries. Therefore, after mutation, a repair strategy is adopted that considers the mutant vectors, base vectors of the mutation (\(x_{r1}\) in case of DE/rand), and the problem boundaries. It is defined by the de_repair argument.

The possibilities are:

‘bounce-back’: the repaired components lie in a random value sampled between the base vector and problem boundaries.
‘midway’: the repaired components lie in the middle of the segment between the base vector and problem boundaries.
‘rand-init’: the repaired components are randomly samples between problem boundaries.
‘to-bounds’: the repaired components are forced to the violated problem boundaries.

Callables might also be parsed in the de_repair argument. In this case, it has the form fun(X, Xb, xl, xu) in which X contains mutated vectors including violations, Xb contains reference vectors for repair in feasible space, xl is a 1d vector of lower bounds, and xu a 1d vector of upper bounds.

algorithm = DE(pop_size=20, de_repair="midway")

Crossover

The crossover in DE is an operation performed between individuals in the original parent population \(x_i\) and the corresponding mutant vectors \(v_i\). It is governed by the CR parameter, defined in the range [0, 1].

The two alternatives available in pymoode are:

“bin”: binomial crossover
“exp”: exponential crossover

To reinforce mutation, use higher values for CR. To control convergence speed, use lower values.

Naturally, when adopting “exp” mutation fewer attributed of child vectors are inherited from mutant vectors, which is enphasized according to the number of decision variables. Therefore it might be useful to use larger CR values to obtain similar results.

Price et al. [4] analyzed several aspects of CR choice. Usually, objective functions that perform well with low values are decomposable - can be written as a sum of one-dimensional functions, while those that require values close to one are not. Choices of CR close to one are thus associated with the rotational invariance required for the algorithm and depend on the objective functions.

Those interested in a more detailed analysis on the impact of crossover in DE might refer to the article of Zaharie [15].

algorithm = DE(pop_size=20, CR=0.5)

Usage

Please refer to the Usage section for examples.