Comparison of Two Interval Models for Interval-valued Genetic Algorithm

The author previously proposed an extension of the genetic algorithm (GA). The proposed method extends the processes of GA to handle interval numbers as genotype values so that GA can be applied to interval-valued optimization problems. This interval-valued GA (IGA) can employ either of two interval models for specifying genotype values: the lower and upper (LU) model or the center and width (CW) model. The ability of the IGA to find solutions may depend on the model. In this paper, the author compares the two models to investigate which one better helps the IGA find better solutions more efficiently. The application of the IGA is evolutionary training of interval-valued neural networks. A preliminary study shows that the LU model contributes better than the CW model. In the final paper, the author will fully report experimental results and compare the two models based on the results.


Introduction
The genetic algorithm (GA) [1], which is an instance of evolutionary algorithms [2], employs real numbers (or bit strings) as genotype values for solving real-valued optimization problems. The author previously proposed an extension of GA. The proposed method [3] extends the processes of GA to handle interval numbers as genotype values so that GA can be applied to interval-valued optimization problems. We have applied the extended interval-valued GA (IGA) to the evolution of interval-valued neural networks (INNs) [4] and showed that the IGA could evolve INNs which model interval target functions well, even though the training (evolution) of the INNs was not supervised [3].
An interval value can be specified by its lower and upper limit values or by its center and width values, and thus our IGA can employ either of two interval models for specifying genotype values: the lower and upper (LU) model or the center and width (CW) model. The ability of our IGA to search for solutions may depend on the model. In this paper, the author compares the two models to investigate which one better helps our IGA find better solutions more efficiently. The application of our IGA is the same as that in our previous paper [3], i.e., evolutionary training of INNs [5].

Neural Networks with Interval Weights and Biases
The INN employed in our research is the same as in the literature [4]: a three-layered feed-forward NN with interval weights and biases. Fig. 1 shows its structure. An INN receives a real input vector $x$ and calculates its output interval value $O$ (for simplicity, the output layer includes a single unit) as follows [4]:
Input layer: $o_i = x_i$. (1)
Hidden layer: $O_j = f(Net_j)$, (2)
$Net_j = \sum_{i=1}^{n} W_{j,i} o_i + \Theta_j$. (3)
Output layer: $O = f(Net)$, (4)
$Net = \sum_{j=1}^{m} W_j O_j + \Theta$. (5)
In Eqs. (1)-(5), $x_i$ and $o_i$ are real values, while $Net_j$, $Net$, $W_{j,i}$, $W_j$, $\Theta_j$, $\Theta$, $O_j$ and $O$ are interval values. $f(x)$ is the unit activation function, which is typically the sigmoid: $f(x) = 1/(1+e^{-x})$. $f(x)$ maps an interval input to an interval output as illustrated in Fig. 2.
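The feed-forward calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's implementation: intervals are represented as (lower, upper) tuples, and the names `inn_forward`, `W_hidden`, `theta_hidden`, `W_out` and `theta_out` are hypothetical. It relies on standard interval arithmetic and on the fact that the sigmoid is monotonically increasing, so an interval input maps to the interval between the images of its endpoints.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def add(a, b):
    """Sum of two intervals given as (lower, upper) tuples."""
    return (a[0] + b[0], a[1] + b[1])

def scale(a, r):
    """Product of an interval a and a real number r."""
    lo, hi = a[0] * r, a[1] * r
    return (lo, hi) if lo <= hi else (hi, lo)

def mul(a, b):
    """Product of two intervals: min and max over the endpoint products."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))

def f(a):
    """Sigmoid applied to an interval (valid because sigmoid is increasing)."""
    return (sigmoid(a[0]), sigmoid(a[1]))

def inn_forward(x, W_hidden, theta_hidden, W_out, theta_out):
    """Map a real input vector x to the output interval O of the INN."""
    hidden = []
    for w_row, theta_j in zip(W_hidden, theta_hidden):
        net_j = theta_j
        for w_ji, x_i in zip(w_row, x):      # real input times interval weight
            net_j = add(net_j, scale(w_ji, x_i))
        hidden.append(f(net_j))              # O_j = f(Net_j)
    net = theta_out
    for w_j, o_j in zip(W_out, hidden):      # interval weight times interval O_j
        net = add(net, mul(w_j, o_j))
    return f(net)                            # O = f(Net)
```

When every weight and bias is a degenerate interval (lower equal to upper), this sketch reduces to an ordinary real-valued feed-forward network.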
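For the feed-forward calculation of the INN, interval arithmetic [6] is utilized. Let us denote two closed intervals as $A$ and $B$, where $A = [a^L, a^U]$ and $B = [b^L, b^U]$. Their sum and product are computed from the endpoints: $A + B = [a^L + b^L, a^U + b^U]$ and $A \cdot B = [\min(a^L b^L, a^L b^U, a^U b^L, a^U b^U), \max(a^L b^L, a^L b^U, a^U b^L, a^U b^U)]$.
The INN includes $nm+m$ weights (i.e., $nm$ weights between the $n$ input units and the $m$ hidden units, and $m$ weights between the $m$ hidden units and the output unit) and $m+1$ biases (= the total number of units in the hidden and output layers). Thus, the INN includes $nm+2m+1$ interval variables in total. Our IGA handles these interval variables as a genotype $V = (V_1, V_2, \ldots, V_D)$, where $V_i$ is an interval and $D = nm+2m+1$. Each $V_i$ can be specified by its upper and lower real values or by its center and width: $V_i = [v_i^L, v_i^U]$ or $V_i = (v_i^c, v_i^w)$, where $v_i^U$, $v_i^L$, $v_i^c$ and $v_i^w$ denote the upper, lower, center and width of $V_i$, respectively.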
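As a small illustration of the genotype size and of the two interval models, the sketch below computes $D$ and converts one interval between the LU and CW forms. The conversion rests on an assumption that is not stated in the text: the center is taken as the midpoint and the width as the full width (upper minus lower). If the width were defined as a half-width, the factors of 2 would change accordingly. The function names are hypothetical.

```python
def genotype_length(n, m):
    """D = nm + 2m + 1 interval variables for n inputs and m hidden units."""
    return n * m + 2 * m + 1

def lu_to_cw(lower, upper):
    """LU -> CW, assuming center = midpoint and width = upper - lower."""
    return ((lower + upper) / 2.0, upper - lower)

def cw_to_lu(center, width):
    """CW -> LU under the same assumption (width is the full width)."""
    return (center - width / 2.0, center + width / 2.0)
```

Under this assumption the LU constraint (lower not larger than upper) and the CW constraint (width not negative) describe the same set of valid intervals, but they are enforced on different parameters.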

Genetic Algorithm with Interval-valued Genotypes
Our IGA [3] includes the same processes as those in the ordinary GA (Fig. 3). The processes of population initialization, fitness evaluation and reproduction are extended so that they can handle interval-valued genotypes.

Initialization of Population
In the initialization process, the genotypes $V_1, V_2, \ldots, V_P$ are randomly initialized. Because the elements of $V_j$ (i.e., $V_{j,1}, V_{j,2}, \ldots, V_{j,D}$) are weights and biases in an INN in this research, smaller absolute values are preferable as initial values of the $V_{j,i}$ parameters. Thus, the initial values for $V_{j,i}$ are randomly determined as $N(0,p)$ or $U(-p,p)$, where $p$ denotes a small non-negative number, $N(0,p)$ denotes a random number which follows the normal distribution, and $U(-p,p)$ denotes a uniformly random number within the interval $[-p,p]$. In the case of the LU model, two values are sampled per $V_{j,i}$: the smaller one is set to $v_{j,i}^L$ and the larger one to $v_{j,i}^U$. In the case of the CW model, two values are sampled per $V_{j,i}$: one of the two values is set to $v_{j,i}^c$, and the absolute value of the other is set to $v_{j,i}^w$.
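The initialization rule for both models can be sketched as follows. This is a minimal sketch that uses only the uniform variant $U(-p,p)$; the function names and the `seed` parameter are hypothetical and are not part of the original method description.

```python
import random

def init_lu(D, p, rng):
    """One LU genotype: smaller sample -> lower, larger sample -> upper."""
    genotype = []
    for _ in range(D):
        a, b = rng.uniform(-p, p), rng.uniform(-p, p)
        genotype.append((min(a, b), max(a, b)))
    return genotype

def init_cw(D, p, rng):
    """One CW genotype: one sample -> center, |other sample| -> width."""
    genotype = []
    for _ in range(D):
        c, w = rng.uniform(-p, p), rng.uniform(-p, p)
        genotype.append((c, abs(w)))
    return genotype

def init_population(P, D, p, model, seed=0):
    """P genotypes of D intervals each, for model 'LU' or 'CW'."""
    rng = random.Random(seed)
    init = init_lu if model == "LU" else init_cw
    return [init(D, p, rng) for _ in range(P)]
```

Both initializers produce valid intervals by construction, so no repair is needed at this stage.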

Fitness Evaluation
To evaluate the fitness of an INN as the phenotype of the corresponding genotype $V$, the INN is run on sample inputs and scored on its interval outputs; the scoring method depends on the problem to which the INN is applied. For example, in a case where the INN is applied to controlling an automated robot system, some performance measure of the system can be used as the fitness score of the genotype corresponding to the INN. Unlike the back propagation algorithm, EAs for training NNs do not need errors between output values of the NN and their corresponding target values, but simply need $\{V_1, V_2, \ldots, V_P\}$ to be ranked. Thus, the $P$ genotypes are ranked based on their fitness scores.
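Because only a ranking of the genotypes is required, the fitness step can be kept independent of how the score is computed. The sketch below assumes a user-supplied `fitness_fn` (a hypothetical name) that maps a genotype to a single score; it is an illustration, not the author's code.

```python
def rank_population(population, fitness_fn, higher_is_better=True):
    """Return the genotypes sorted from best to worst by fitness score."""
    scored = [(fitness_fn(genotype), index)
              for index, genotype in enumerate(population)]
    # Sort on the score only; ties keep their original population order.
    scored.sort(key=lambda pair: pair[0], reverse=higher_is_better)
    return [population[index] for _, index in scored]
```

For an error-type score, where smaller is better, `higher_is_better=False` gives the same best-to-worst ordering.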

Crossover
Let us denote two parent genotypes as $V_a$ and $V_b$, and an offspring genotype as $V_z$. $V_a$ and $V_b$ can be sampled from the population in the same manner as in the ordinary GA.
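In the case of employing the LU model, $V_{z,i} = [v_{z,i}^L, v_{z,i}^U]$, and $v_{z,i}^L$ and $v_{z,i}^U$ can be determined by applying the blend crossover [7] for the real GA: $v_{z,i}^L$ is randomly sampled from the blend-crossover interval determined by $v_{a,i}^L$ and $v_{b,i}^L$, and $v_{z,i}^U$ is sampled in the same manner from $v_{a,i}^U$ and $v_{b,i}^U$. If $v_{z,i}^L > v_{z,i}^U$, then $V_{z,i}$ must be repaired so that $v_{z,i}^L \le v_{z,i}^U$.
In the case of employing the CW model, $V_{z,i} = (v_{z,i}^c, v_{z,i}^w)$, and $v_{z,i}^c$ and $v_{z,i}^w$ can be determined by applying the blend crossover for the real GA: $v_{z,i}^c$ is randomly sampled from the blend-crossover interval determined by $v_{a,i}^c$ and $v_{b,i}^c$, and $v_{z,i}^w$ is sampled in the same manner from $v_{a,i}^w$ and $v_{b,i}^w$. If $v_{z,i}^w < 0$, then $V_{z,i}$ must be repaired so that $v_{z,i}^w \ge 0$. A simple repair method is: $v_{z,i}^w \leftarrow 0$.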
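A minimal sketch of the crossover with repair for both models is given below. It uses the usual BLX-α form of the blend crossover, in which a child value is drawn uniformly from the parents' range extended by α times its length on each side. Two points are assumptions rather than statements from the text: the LU repair is implemented as a swap of the two limits (the text only requires that the lower limit not exceed the upper one), and the function names are hypothetical. The CW repair sets a negative width to 0 as described above. Each function also returns the number of repairs it applied.

```python
import random

def blx(a, b, alpha, rng):
    """BLX-alpha: uniform sample from [min - alpha*d, max + alpha*d]."""
    lo, hi = min(a, b), max(a, b)
    d = hi - lo
    return rng.uniform(lo - alpha * d, hi + alpha * d)

def crossover_lu(parent_a, parent_b, alpha, rng):
    """Offspring for the LU model; returns (child, number_of_repairs)."""
    child, repairs = [], 0
    for (a_lo, a_up), (b_lo, b_up) in zip(parent_a, parent_b):
        z_lo = blx(a_lo, b_lo, alpha, rng)
        z_up = blx(a_up, b_up, alpha, rng)
        if z_lo > z_up:
            z_lo, z_up = z_up, z_lo   # assumed repair: swap the limits
            repairs += 1
        child.append((z_lo, z_up))
    return child, repairs

def crossover_cw(parent_a, parent_b, alpha, rng):
    """Offspring for the CW model; returns (child, number_of_repairs)."""
    child, repairs = [], 0
    for (a_c, a_w), (b_c, b_w) in zip(parent_a, parent_b):
        z_c = blx(a_c, b_c, alpha, rng)
        z_w = blx(a_w, b_w, alpha, rng)
        if z_w < 0:
            z_w = 0.0                 # repair from the text: width <- 0
            repairs += 1
        child.append((z_c, z_w))
    return child, repairs
```

In the LU model both sampled values can trigger a violation, whereas in the CW model only the width can, which is one way to see why the two models may need different numbers of repairs.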

Mutation
Values in the offspring genotypes are mutated under the specified mutation probability. In our IGA, each offspring $V$ is an interval vector $(V_1, V_2, \ldots, V_D)$, where $V_i$ is an interval specified by two real parameters, [lower, upper] or (center, width). Each element (the lower, the upper, the center or the width) that is selected under the mutation probability is mutated by adding a random real value $r$ to its current value (or by replacing the current value with $r$), where $r$ is sampled from $N(0,1)$ or from $[-p, p]$. After the mutation of $V_i$, $v_i^L$ may become larger than $v_i^U$, or $v_i^w$ may become negative. Such invalid interval values are repaired by the same method as in the crossover process.
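The mutation step with repair can be sketched as below. This is an illustrative sketch under stated assumptions: each of the two real parameters is selected independently with probability `prob`, the additive variant with $r$ drawn uniformly from $[-p, p]$ is used, and the LU repair is a swap of the two limits (an assumption; the CW repair sets a negative width to 0). The name `mutate` and its parameters are hypothetical.

```python
import random

def mutate(genotype, model, prob, p, rng):
    """Mutate a genotype of (first, second) pairs for model 'LU' or 'CW'."""
    mutated = []
    for first, second in genotype:
        # Each real parameter is perturbed independently with probability prob.
        if rng.random() < prob:
            first += rng.uniform(-p, p)
        if rng.random() < prob:
            second += rng.uniform(-p, p)
        if model == "LU" and first > second:
            first, second = second, first   # assumed repair: swap the limits
        elif model == "CW" and second < 0:
            second = 0.0                    # repair: width <- 0
        mutated.append((first, second))
    return mutated
```

The function returns a new list and leaves the input genotype unchanged, so the parent population is not modified.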

Comparison of LU/CW Models For Interval Genotype Values In IGA
As described in 3.3, the constraints on the two interval parameters (i.e., the lower and upper values or the center and width values) are different, and thus the methods for modifying constraint-violating values also differ between the LU and CW models. This difference may affect the performance of the IGA in searching for solutions. To compare the performance of the two models, the IGA with each of the two models is applied to the same problem. As the application problem, the evolution of INNs is employed: the IGA is challenged to evolve INNs which better model a target interval function. The target interval function in this study is $F(x) = [F(x)^L, F(x)^U]$, where $F(x)^L$ and $F(x)^U$ denote the lower and upper limits of the interval function $F(x)$. Fig. 4 shows the shape of the target interval function $F(x)$.
INN and IGA are designed as follows.

INN:
• #hidden units: 10

IGA:
• #Total INNs evolved: 1,000,000
• Population size and #generations: (100 and 10,000) or (500 and 2,000)
• Selection: the best 10 elite genotypes are copied to the next generation
• Tournament size for sampling parent genotypes: 5% of the population size
• $\alpha$ for the blend crossover: 0.5
• Mutation probability: 0.01 for each of the intervals $V_{i,1}$, …

The total number of INNs evolved is set to the same value to compare the two models fairly. The number of generations is 10,000 (or 2,000) for the IGA with 100 (or 500) solutions, so that the total number of INNs evolved is constantly 1,000,000.

For fitness evaluation, the INN receives input vectors $x_1, x_2, \ldots, x_S$ and calculates output interval values $Y_1, Y_2, \ldots, Y_S$. $x_1, x_2, \ldots, x_S$ are sampled within the variable domain of the application problem. The fitness of the genotype $V$ is evaluated based on $Y_1, \ldots, Y_S$. The method for calculating the fitness depends on the problem to which the INN is applied.

Figs. 8-10 show the results with the target function $F(x)$. Fig. 8 (9) shows the output interval function of the best INN among the total 10,000,000 INNs (= [1,000,000 INNs in each run] × [5 runs] × [population size = 100 or 500]) evolved by the IGA with the LU (CW) model. Figs. 8 and 9 reveal that the best INNs evolved with both the LU and CW models fit the target function $F(x)$ very well. Fig. 10 shows the error value of the best INN among each number of INNs evolved (e.g., with the population size of 100, 500,000 INNs are evolved in total at the 5,000th generation). In Fig. 10, "LU (100)" denotes the result with the LU model and the population size of 100; "LU (500)", "CW (100)" and "CW (500)" are labeled in the same manner. The error values are averaged over the 5 runs. Fig. 10 reveals that the CW model was better than the LU model for the IGA with population sizes of both 100 and 500.

To investigate why better INNs could be evolved with the CW model, the author counted the number of repairs applied to the interval weights and biases to meet the interval constraints. As described in 3.3 and 3.4, $v_{z,i}^L$ must not be larger than $v_{z,i}^U$ for the LU model, and $v_{z,i}^w$ must not be negative for the CW model. In the crossover and mutation processes, if new values of $v_{z,i}^L$, $v_{z,i}^U$ or $v_{z,i}^w$ violate the constraints, then the new values are repaired. Such repairs may interfere with the evolution of INNs because the repairs restrict independent changes of genotype values. Thus, a smaller number of repairs will be better in evolving INNs. Table 1 shows the mean and the standard deviation of the number of repairs in the 5 runs. For example, the IGA with the LU model and the population size of 100 required 1.64E+6 repairs, while the IGA with the CW model and the population size of 100 required 3.65E+5 repairs. Table 1 clearly shows that the CW model required fewer repairs than the LU model did, which may be a reason why the CW model could contribute to evolving better INNs.

Fig. 2. Input-output relation of each unit in the hidden and output layers [4].
Fig. 3. Processes in our interval-valued GA [3].
Fig. 7. Error between the target interval $Y_r = [y_r^L, y_r^U]$ and …