An Analysis of Geometric Semantic Crossover: A Computational Geometry Approach

Mauro Castelli, Luca Manzoni, Ivo Gonçalves (1,5), Leonardo Vanneschi, Leonardo Trujillo and Sara Silva (4,5)

1 NOVA IMS, Universidade Nova de Lisboa, 1070-312 Lisboa, Portugal

2 DISCo, Università degli Studi di Milano Bicocca, 20126 Milano, Italy

3 Posgrado en Ciencias de la Ingeniería, Instituto Tecnológico de Tijuana, Tijuana, Mexico

4 BioISI, Faculty of Sciences, University of Lisbon, Campo Grande, 1749-016 Lisbon, Portugal

5 CISUC, Department of Informatics Engineering, University of Coimbra, 3030-290 Coimbra, Portugal

Keywords:

Genetic Programming, Semantics, Convex Hull.

Abstract:

Geometric semantic operators have recently shown their ability to outperform standard genetic operators on several complex real-world problems. Nonetheless, they are affected by drawbacks. In this paper, we focus on one of these drawbacks: geometric semantic crossover often has a poor impact on the evolution. Geometric semantic crossover creates an offspring whose semantics lies on the segment joining the parents (in the semantic space). So, it is intuitive that it is not able to find, nor reasonably approximate, a globally optimal solution, unless the semantics of the individuals in the population “contains” the target. In this paper, we introduce the concept of convex hull of a genetic programming population and we present a method to calculate the distance from the target point to the convex hull. Then, we give experimental evidence that, in four different real-life test cases, the target is always outside the convex hull. As a consequence, we show that geometric semantic crossover is not helpful in those cases, and that it is not even able to move the population closer to the target. Finally, in the last part of the paper, we propose ideas for future work on how to improve geometric semantic crossover.

1 INTRODUCTION

Methods to integrate semantic awareness have gained great popularity in the Genetic Programming (GP) (Koza, 1992) community in the last few years (Vanneschi et al., 2014a). In particular, in the last three years, considerable attention has been dedicated to Geometric Semantic GP (GSGP), a version of GP introduced by Moraglio and coauthors in 2012 (Moraglio et al., 2012) that uses so-called Geometric Semantic Operators (GSOs) instead of the traditional crossover and mutation.

Even though the term semantics can have several different interpretations, it is a common trend in the GP community (and this is also the definition we adopt here) to identify the semantics of a solution with the vector of its output values on the training data (Vanneschi et al., 2014a; Moraglio et al., 2012). Under this perspective, a GP individual can be identified with a point (its semantics) in a multidimensional space that we call semantic space, which has a number of dimensions equal to the number of training instances. The objective of GSOs is to define transformations on the syntax of individuals that correspond to precise operators of Genetic Algorithms (GAs) in the semantic space. More specifically, the GSOs are Geometric Semantic Crossover (GSXO) and Geometric Semantic Mutation (GSM). GSXO corresponds to geometric crossover in the semantic space, in the sense that it generates an offspring whose semantics lies on the segment joining the semantics of the parents. GSM corresponds to geometric mutation (also called ball or box mutation (Vanneschi et al., 2013)) in the semantic space, in the sense that if we mutate an individual x, we obtain an individual y such that the semantics of y is a weak perturbation of the semantics of x.

Castelli, M., Manzoni, L., Gonçalves, I., Vanneschi, L., Trujillo, L. and Silva, S.
An Analysis of Geometric Semantic Crossover: A Computational Geometry Approach.
DOI: 10.5220/0006056402010208
In Proceedings of the 8th International Joint Conference on Computational Intelligence (IJCCI 2016) - Volume 1: ECTA, pages 201-208
ISBN: 978-989-758-201-1

One of the reasons for the success of GSGP probably lies in the fact that GSOs induce a unimodal fitness landscape on any supervised learning problem (for instance classification or regression), thus favoring GP evolvability. Also thanks to an efficient implementation of GSGP that was defined in 2013 (Vanneschi et al., 2013; Castelli et al., 2015a), it was possible to successfully apply GSGP to several complex real-life applications (see for instance (Castelli et al., 2014, 2015b, 2013a)). However, GSGP has at least the following recognized drawbacks: (1) GSOs generate individuals that are larger than their parents, and this causes a rapid growth in the size of the individuals in the population; (2) GSXO was shown to be quite ineffective on a large set of applications.

The former problem is widely discussed in the literature. The implementation proposed in (Vanneschi et al., 2013; Castelli et al., 2015a) is a workaround to this problem, in the sense that, although it does not limit code growth, it makes the system not only usable in practice, but even more efficient than standard GP.

This paper focuses on the latter drawback, already pointed out in the literature, for instance in (Moraglio and Mambrini, 2013), where a purely mutation-based GSGP was proposed after recognizing the uselessness of GSXO. We believe that one of the reasons for the poor performance of GSXO lies in its geometric property. In fact, as we said above, GSXO generates an offspring whose semantics lies on the segment joining the semantics of the parents. From this perspective, if we imagine a GP population as a cloud of points in the semantic space, we could informally say that crossover is only able to generate points that are “inside” the cloud. So, if the target (which is also a known point in the semantic space) is not contained inside the cloud, GSXO will never be able to generate it. Also, if the target is quite far from the cloud, GSXO will not even be able to reasonably approximate it.

The main objective of this paper is to confirm this hypothesis by means of a set of experiments. To achieve this objective, we need a formal tool that allows us to “capture” our idea of a cloud of individuals in the semantic space. More specifically, it would be useful to have a formal method to indicate what we could informally call the “border” of a cloud. In this way, we could use this tool both for understanding whether a given point is “inside” or “outside” the cloud and for calculating the distance from a point to the cloud. The contributions of this paper are: (1) the introduction of the concept of convex hull, as a tool to represent the “border” of a set of points in the semantic space; (2) the introduction of a method to determine whether a point is contained in the convex hull or not; (3) the introduction of a method to calculate the distance from a point to the convex hull.

The first contribution has already been considered in (Moraglio, 2011), where the author showed that all evolutionary algorithms using geometric crossover with no mutation perform the same form of convex search regardless of the underlying representation, the specific selection mechanism, the specific offspring distribution, the specific search space, and the problem at hand.

With the contributions provided in our study, we are able to monitor the convex hull of the points representing the semantics of all the individuals in the population during the GP evolution. In particular, we are able to study the evolution of the distance from the target to the convex hull during a GP run.

In this paper, we compare two GSGP systems: the first one uses both GSXO and GSM; the second one uses only GSXO. The different behaviour of the latter, compared to the former, should allow us to shed light on the limitations of GSXO. As test cases for this experimental study, we have decided to use four real-life symbolic regression problems from the UCI repository (Lichman, 2013).

2 GEOMETRIC SEMANTIC OPERATORS

GSOs are becoming more and more popular in the GP community (Vanneschi et al., 2014a), probably because of their property of inducing a unimodal fitness landscape on any problem consisting in mapping sets of input data into known targets (for instance supervised learning problems, such as regression and classification). To get an intuition of this property (whose proof can be found in (Moraglio et al., 2012)), let us first consider a Genetic Algorithm (GA) problem in which the unique global optimum is known and the fitness of each individual (to be minimized) corresponds to its distance to the global optimum (our reasoning holds for any employed distance). In this problem, if we use, for instance, ball mutation (Krawiec and Lichocki, 2009) (i.e. a variation operator that slightly perturbs some of the coordinates of a solution), then any possible individual different from the global optimum has at least one fitter neighbor (an individual resulting from its mutation). So, there are no local optima. In other words, the fitness landscape is unimodal, and consequently the problem is characterized by good evolvability. Similar considerations hold for many types of crossover, including various kinds of geometric crossover (Krawiec and Lichocki, 2009).

ECTA 2016 - 8th International Conference on Evolutionary Computation Theory and Applications

Now, let us consider the typical GP problem of finding a function that maps sets of input data into known target values (as we said, regression and classification are particular cases). The fitness of an individual for this problem is typically a distance between its predicted output values and the target ones (error measure). GSOs simply define transformations on the syntax of the individuals that correspond to geometric crossover and ball mutation in the semantic space, thus allowing us to map the considered GP problem into the previously discussed GA problem.

Geometric semantic crossover (GSXO) generates, as the unique offspring of parents $T_1, T_2 : \mathbb{R}^n \to \mathbb{R}$, the expression:

$$T_{XO} = (T_1 \cdot T_R) + ((1 - T_R) \cdot T_2)$$

where $T_R$ is a random real function whose output values range in the interval $[0, 1]$.

Analogously, geometric semantic mutation (GSM) returns, as the result of the mutation of an individual $T : \mathbb{R}^n \to \mathbb{R}$, the expression:

$$T_M = T + ms \cdot (T_{R1} - T_{R2})$$

where $T_{R1}$ and $T_{R2}$ are random real functions with codomain in $[0, 1]$ and $ms$ is a parameter called mutation step.

Moraglio et al. (Moraglio et al., 2012) show that GSXO corresponds to geometric crossover in the semantic space (i.e. the point representing the offspring lies on the segment joining the points representing the parents) and that GSM corresponds to ball mutation in the semantic space (the semantics of the individual generated by mutation is a weak perturbation of the semantics of the individual to which mutation is applied), and thus GSM induces a unimodal fitness landscape on the above-mentioned types of problem.
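The effect of the two operators on the semantics can be illustrated directly on output vectors. The following is a minimal sketch, not the authors' implementation: it works on semantics only (no syntax trees), it assumes that the random function of GSXO is a constant `r` in [0, 1] so that the offspring lies exactly on the Euclidean segment between the parents, and the function names `gsxo_semantics` and `gsm_semantics` are hypothetical.

```python
import random

def gsxo_semantics(s1, s2, r):
    """Semantics of the GSXO offspring when the random function is the constant r in [0, 1]."""
    return [r * a + (1.0 - r) * b for a, b in zip(s1, s2)]

def gsm_semantics(s, ms, rng):
    """Semantics of the GSM offspring: each coordinate moves by ms * (r1 - r2), r1, r2 in [0, 1)."""
    return [v + ms * (rng.random() - rng.random()) for v in s]

rng = random.Random(0)
parent1 = [1.0, 4.0, -2.0]   # outputs of the first parent on three training instances
parent2 = [3.0, 0.0, 2.0]    # outputs of the second parent on the same instances
child = gsxo_semantics(parent1, parent2, rng.random())
mutant = gsm_semantics(parent1, ms=0.5, rng=rng)
```

Every coordinate of `child` lies between the corresponding coordinates of the parents, while every coordinate of `mutant` differs from `parent1` by less than the mutation step.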

Here we report the definition of the geometric semantic operators as given by Moraglio et al. for real function domains, since these are the operators we will use in the experimental phase. For applications that consider other types of data, the reader is referred to (Moraglio et al., 2012).

3 CONVEX HULL

This section reports simple computational geometry concepts that will be used in the following sections to analyze the performance of GSXO. The following definitions are taken from (de Berg et al., 2008). A subset S of the plane is called convex if and only if for any pair of points p, q ∈ S the line segment pq is completely contained in S. The convex hull CH(S) of a set S is the smallest convex set that contains S. In other terms, it is the intersection of all convex sets that contain S. To simplify, it is possible to visualize what the convex hull looks like by a thought experiment (taken from (de Berg et al., 2008)). Imagine that the points are nails sticking out of the plane. Take an elastic rubber band, hold it around the nails, and let it go. It will snap around the nails, minimizing its length. The area enclosed by the rubber band is the convex hull of the set of points. This leads to an alternative definition of the convex hull of a finite set P of points in the plane: it is the unique convex polygon whose vertices are points from P and that contains all points of P. It is possible to prove that this definition is equivalent to the one given earlier (de Berg et al., 2008).
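The definitions above also have a well-known algebraic form, valid in any number of dimensions, which expresses the convex hull of a finite set through weighted averages of its points. It is recalled here as a reminder; the symbols $p_i$ and $a_i$ are our notation.

```latex
\mathrm{CH}(P) = \Bigl\{ \sum_{i=1}^{n} a_i\, p_i \;:\; a_i \ge 0 \ \ \forall i, \quad \sum_{i=1}^{n} a_i = 1 \Bigr\},
\qquad P = \{p_1, \ldots, p_n\}.
```

For two points this reduces to the segment $\{a\,p_1 + (1-a)\,p_2 : a \in [0,1]\}$ joining them.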

While several algorithms have been proposed to efficiently determine the convex hull of a set of points, the large majority of them consider only a 2-dimensional or 3-dimensional space. In this work, we need to find the convex hull in an n-dimensional space, where the size of the space n is determined by the independent variables that characterize the particular application at hand.

For this reason, in our work we follow a different approach. Instead of directly building the convex hull, we try to determine whether a point (which in our experiments will be the target) is inside the convex hull formed by a set of points (i.e., the semantics of the candidate solutions). To do that, we follow the method reported in Chapter 11 of (Boyd and Vandenberghe, 2004). The basic idea is to solve a system of linear equations subject to some constraints. If a solution to the system exists, then it is possible to conclude that the given point is inside the convex hull.

The method, described in detail in (Boyd and Vandenberghe, 2004) and adapted by us for GP, is the following: let $n$ be the number of individuals in a population and $m$ the number of training samples. Let $x_{i,j}$ be the signed error of the $i$-th individual on the $j$-th training sample. We can then build the following system of linear equations in the $n$ variables $a_1, \ldots, a_n$:

$$
\begin{aligned}
a_1 x_{1,1} + a_2 x_{2,1} + \ldots + a_n x_{n,1} &= 0 \\
a_1 x_{1,2} + a_2 x_{2,2} + \ldots + a_n x_{n,2} &= 0 \\
&\;\;\vdots \\
a_1 x_{1,m} + a_2 x_{2,m} + \ldots + a_n x_{n,m} &= 0 \\
a_1 + a_2 + \ldots + a_n &= 1 \\
a_i &\ge 0 \quad \forall i \in \{1, \ldots, n\}
\end{aligned}
$$

The previous system has a solution (i.e., a vector $a_1, a_2, \ldots, a_n$) if and only if the optimal individual (i.e., the one that has zero error on the training set) can be expressed as a linear combination of the existing individuals where the coefficients are $a_1, a_2, \ldots, a_n$. Since the coefficients are non-negative and sum to one, this is equivalent to saying that the optimum resides in the convex hull given by the points $\mathbf{x}_i = (x_{i,1}, x_{i,2}, \ldots, x_{i,m})$ (i.e., the signed error vectors of the individuals). This is a powerful tool that allows us to build the optimal solution by combining existing individuals. However, this is possible only when the above-mentioned system of linear equations has a solution (i.e., the optimum is in the convex hull).
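The membership test described above can be sketched with a general-purpose linear programming routine. This is an illustration under stated assumptions rather than the code used in the experiments: it relies on SciPy's `linprog` with the HiGHS backend, uses a zero objective so that only feasibility matters, and the function name `optimum_in_hull` is hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def optimum_in_hull(errors):
    """errors: (n, m) array whose row i is the signed error vector of individual i.

    Returns (True, coefficients) if the zero vector is a convex combination
    of the rows, and (False, None) otherwise.
    """
    X = np.asarray(errors, dtype=float)
    n, m = X.shape
    # m equations sum_i a_i * x_{i,j} = 0, plus one equation sum_i a_i = 1.
    A_eq = np.vstack([X.T, np.ones((1, n))])
    b_eq = np.concatenate([np.zeros(m), [1.0]])
    # Zero objective: we only ask whether a feasible point exists (a_i >= 0).
    res = linprog(np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n, method="highs")
    if res.status == 0:
        return True, res.x
    return False, None

# Four individuals whose error vectors surround the origin: the optimum is inside.
inside, coeffs = optimum_in_hull([[1, 0], [-1, 0], [0, 1], [0, -1]])
# Three individuals that all over-predict: the optimum is outside.
outside, _ = optimum_in_hull([[1, 1], [2, 1], [1, 2]])
```

When the test succeeds, the returned coefficients give one way of combining the existing individuals into a zero-error solution.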

Interestingly, it is also possible to change the previous problem, hence obtaining more information, by finding the distance from the optimum to the convex hull, instead of simply asking whether the optimum is inside it. This can be performed by solving the following linear programming problem in the variables $a_1, \ldots, a_n$, $e_1, \ldots, e_m$, and $\bar{e}_1, \ldots, \bar{e}_m$:

$$
\begin{aligned}
\text{minimize}\quad & e_1 + \bar{e}_1 + e_2 + \bar{e}_2 + \ldots + e_m + \bar{e}_m \\
\text{with constraints:}\quad
& a_1 x_{1,1} + a_2 x_{2,1} + \ldots + a_n x_{n,1} + e_1 - \bar{e}_1 = 0 \\
& a_1 x_{1,2} + a_2 x_{2,2} + \ldots + a_n x_{n,2} + e_2 - \bar{e}_2 = 0 \\
& \qquad\vdots \\
& a_1 x_{1,m} + a_2 x_{2,m} + \ldots + a_n x_{n,m} + e_m - \bar{e}_m = 0 \\
& a_1 + a_2 + \ldots + a_n = 1 \\
& a_i \ge 0, \quad \forall i \in \{1, \ldots, n\} \\
& e_i \ge 0, \quad \forall i \in \{1, \ldots, m\} \\
& \bar{e}_i \ge 0, \quad \forall i \in \{1, \ldots, m\}
\end{aligned}
$$

First of all, notice that the previous problem always has a solution and that at most one of $e_i$ and $\bar{e}_i$ can be non-zero when $\sum_{i=1}^{m} (e_i + \bar{e}_i)$ is minimized. The term $e_i + \bar{e}_i$ is always non-negative and represents the distance (along the $i$-th coordinate) from the convex hull to the global optimum. Since all points are in an $m$-dimensional space, $\sum_{i=1}^{m} (e_i + \bar{e}_i)$ represents a distance, commonly called the taxicab distance, from the convex hull to the global optimum. We have decided not to use the more common Euclidean distance since it would have required a non-linear objective function, making the problem non-linear. When the distance is zero, the global optimum is inside the convex hull and, as before, the values of $a_1, \ldots, a_n$ give us a way to combine existing solutions to build an optimal solution. To illustrate these facts, we can observe in Figure 1 the convex hull generated by a population of 10 individuals (represented as points) and the distance from the optimum (the vector of all zeros, representing no error on the training set) to the convex hull. The point inside the convex hull and closest to the optimum is (c), which is a linear combination of two other individuals, (a) and (b).
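The distance computation can be sketched in the same way as a small linear program. Again this is only an illustration: SciPy's `linprog` is our choice of solver, the variables are ordered as the coefficients first, then the positive and negative slack terms, and the function name `taxicab_distance_to_hull` is hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def taxicab_distance_to_hull(errors):
    """errors: (n, m) array whose row i is the signed error vector of individual i.

    Returns the taxicab distance from the zero vector (the optimum) to the
    convex hull of the rows, and the coefficients of the closest point found.
    """
    X = np.asarray(errors, dtype=float)
    n, m = X.shape
    # Variable order: a_1..a_n, e_1..e_m, ebar_1..ebar_m (all non-negative).
    cost = np.concatenate([np.zeros(n), np.ones(2 * m)])
    # Row j: sum_i a_i * x_{i,j} + e_j - ebar_j = 0.
    coordinate_rows = np.hstack([X.T, np.eye(m), -np.eye(m)])
    # Last row: sum_i a_i = 1.
    sum_row = np.concatenate([np.ones(n), np.zeros(2 * m)])
    A_eq = np.vstack([coordinate_rows, sum_row])
    b_eq = np.concatenate([np.zeros(m), [1.0]])
    res = linprog(cost, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (n + 2 * m), method="highs")
    return res.fun, res.x[:n]

# All individuals over-predict: the closest hull point is (1, 1), at distance 2.
d_triangle, a_triangle = taxicab_distance_to_hull([[1, 1], [2, 1], [1, 2]])
# The closest point is the midpoint (3, 0) of the two individuals, at distance 3.
d_segment, a_segment = taxicab_distance_to_hull([[3, -1], [3, 1]])
```

In the second example the closest point is not an existing individual but a combination of two of them, which mirrors the role of point (c) in Figure 1.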

Since linear programming problems can be efficiently solved by the interior point method (Bonnans et al., 2006) or by the simplex method (even if, contrary to the former, the latter can have an exponential runtime in the worst case), it is feasible to compute the distance from the optimum to the convex hull generated by the current population at each generation.

4 EXPERIMENTAL SETTINGS

As test problems we used four symbolic regression

problems from the UCI repository (Lichman, 2013).

The problems were chosen to have a low number of

features to reduce the number of variables when com-

puting the distance from the convex hull:

• Airfoil Self-Noise (Airfoil), with 1502 instances

and 5 features;

• Concrete Compressive Strength (Concrete), with

1029 instances each with 8 features;

• Concrete Slump Test (Slump), with 102 instances

and 9 features;

• Yacht Hydrodynamics (Yacht), with 307 instances

each with 6 features.

Each dataset was split into 100 pairs of training and test sets, the former containing 70% of the instances (chosen at random with uniform distribution), and the latter the remaining 30% of the instances. For all the considered test problems, a total of 100 runs were performed with each technique, each run using a different partition between training and test data. All the runs used populations of 100 individuals allowed to evolve for 1000 generations. Tree initialization was performed using the Ramped Half-and-Half method (Koza, 1992) with a maximum initial depth equal to 6. The function set contained the four binary arithmetic operators, including protected division as in (Koza, 1992). The terminal set contained a number of variables equal to the number of features in the dataset, plus 100 random constants. These constants were generated randomly with uniform distribution in the range [−100, 100] at the beginning of each run. Survival from one generation to the next was always guaranteed to the best individual of the population (elitism). A random mutation step (generated with uniform distribution in the range [0, 1]) was used in each mutation event. GSXO and GSM probabilities were equal to 0.9 and 0.5 respectively. These rates were selected based on the guidelines reported in (Castelli et al., 2015c).
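For readers who wish to set up a comparable experiment, the settings above can be collected in a small configuration, together with a sketch of the random 70%/30% partition. The key names, the helper `split_indices`, and the rounding of the training-set size are assumptions of this sketch; the text does not state how fractional sizes were handled.

```python
import random

# Experimental settings as stated in the text (key names are ours).
SETTINGS = {
    "runs": 100,
    "population_size": 100,
    "generations": 1000,
    "max_initial_depth": 6,
    "n_random_constants": 100,
    "constant_range": (-100.0, 100.0),
    "p_crossover": 0.9,
    "p_mutation": 0.5,      # set to 0.0 for the crossover-only variant
    "train_fraction": 0.7,
}

def split_indices(n_instances, train_fraction, rng):
    """Shuffle instance indices uniformly and cut them into training and test parts."""
    indices = list(range(n_instances))
    rng.shuffle(indices)
    cut = round(train_fraction * n_instances)  # assumption: round to nearest integer
    return indices[:cut], indices[cut:]

# One partition for a dataset with 102 instances, using a fixed seed for repeatability.
train, test = split_indices(102, SETTINGS["train_fraction"], random.Random(1))
```

Using a different seed for each of the 100 runs gives a different partition per run, as in the protocol described above.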

Since mutation is the only operator that can produce solutions outside the convex hull generated by the current population, we have also explored the effect of a crossover-only evolution by performing the same tests with a mutation rate of zero (i.e., no mutation). Notice that, in this way, the improvements achievable by GSGP are limited by the best possible solution that can be found inside the convex hull; therefore, we expect that the fitness will rapidly reach a plateau from which no further improvement is possible.

[Figure 1 legend: convex hull, optimum, individuals; the distance from the optimum is $\sum_{i=1}^{m} (e_i + \bar{e}_i)$; labelled points (a), (b), (c).]

Figure 1: An example of how the convex hull is related to the optimal solution. Point (c), which is the closest point in the convex hull to the target, is a linear combination of (a) and (b), which are points that belong to the border of the convex hull.

At each generation of each test we computed the distance of the optimum from the convex hull using the lp_solve linear programming solver (http://sourceforge.net/projects/lpsolve/). In total, over $8 \times 10^5$ linear programming problems have been solved.

5 EXPERIMENTAL RESULTS

In the first part of our experimental study, we analyzed the fitness of the best individual in the population at each generation, both on training and test sets. The two systems that have been compared are the “usual” GSGP, which employs both GSXO and GSM, and a GSGP system that uses only GSXO. Results of this analysis are reported in Figure 2. As can be seen, a similar pattern appears for all the studied test problems: when GSXO is the only genetic operator used, the best training and test fitness do not improve during the evolution. In other words, no evolution is taking place and the best individual obtained at the end of the search process has a fitness that is comparable to the one that was found in the very first part of the evolution. The behaviour of training and test fitness is different when the search process uses both GSXO and GSM. In fact, for all the test problems, both training and test fitness keep improving steadily until the end of the run.

For a better understanding of the behaviour just observed, in the second part of the experimental analysis we have taken into account, at each generation, the distance between the global optimum and the convex hull defined by the current population. Results of this analysis (obtained using a GSGP system that uses both GSXO and GSM and a GSGP system that uses GSXO only) are reported in Figure 3.

If we consider GSGP that uses only GSXO, in all the test problems the distance from the convex hull to the target remains practically constant during the whole evolution. Hence, by only using GSXO, GSGP is not able to “push” the search process close to a globally optimal solution. The solutions always remain inside the convex hull defined by the initial population. Looking at Figure 3, it is also possible to see a “jump” in the very first generations (usually the first two or three generations). In fact, the initial population usually contains several semantically very different solutions. After selection takes place in the very first generations, several of these solutions (the ones with poor fitness) do not survive and the convex hull, intuitively, covers a smaller area (i.e., hypervolume) of the semantic space. As we can observe, in every single run we performed, the global optimum is never enclosed in the convex hull, which clearly makes GSXO practically useless. In fact, all the individuals created by GSXO will lie in the convex hull obtained after the first generation of the search process. The situation is different when a combination of GSXO and GSM is used. In this case, in all the studied test problems, the distance from the convex hull to the global optimum steadily decreases for the whole evolution.
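The confinement of a crossover-only search can be reproduced on synthetic data. The sketch below is a toy illustration, not the experimental system: it omits selection, it assumes a constant random coefficient per crossover, and it checks a weaker but easily verifiable consequence of staying inside the convex hull, namely staying inside the coordinate-wise bounding box of the initial population. All names are hypothetical.

```python
import random

def crossover_only(population, generations, rng):
    """Each generation, replace the population with convex combinations of random parents."""
    for _ in range(generations):
        children = []
        for _ in range(len(population)):
            p1, p2 = rng.choice(population), rng.choice(population)
            r = rng.random()
            children.append([r * a + (1.0 - r) * b for a, b in zip(p1, p2)])
        population = children
    return population

def bounds(population):
    """Coordinate-wise minimum and maximum of a list of vectors."""
    columns = list(zip(*population))
    return [min(c) for c in columns], [max(c) for c in columns]

rng = random.Random(42)
# Signed error vectors of 20 individuals on 3 training samples; all errors are
# positive, so the optimum (the zero vector) lies outside the initial hull.
initial = [[rng.uniform(1.0, 5.0) for _ in range(3)] for _ in range(20)]
final = crossover_only(initial, generations=50, rng=rng)
low0, high0 = bounds(initial)
low1, high1 = bounds(final)
best_error = min(sum(abs(v) for v in individual) for individual in final)
```

After any number of generations the final bounds remain within the initial ones, so `best_error` cannot drop below the sum of the initial coordinate-wise minima.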

To conclude the experimental analysis, we study the relation between fitness (Figure 2) and distance (Figure 3). Figure 4 reports the scatter plots of the training fitness with respect to the distance. Looking at these plots, it is clear that, as expected, the two quantities are strongly correlated. It is worth pointing out that fitness is not exactly equal to the distance from the convex hull to the target: in principle, the convex hull changes in position and size at each iteration.
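A short derivation suggests why a strong relation is to be expected, and why the two quantities need not coincide. It assumes that fitness is the training RMSE (as the axis labels of the fitness plots indicate) and that $m$ is the number of training samples; $\mathbf{x}^{best}$ (the signed error vector of the best individual) and $d$ (the taxicab distance from the convex hull to the optimum) are our notation.

```latex
d \;\le\; \|\mathbf{x}^{best}\|_1 \;\le\; \sqrt{m}\,\|\mathbf{x}^{best}\|_2 \;=\; m \cdot \mathrm{RMSE}(\mathbf{x}^{best}),
\qquad
\mathrm{RMSE}(\mathbf{x}) = \frac{\|\mathbf{x}\|_2}{\sqrt{m}}.
```

The first inequality holds because the best individual is itself a point of the convex hull, and the second is the Cauchy-Schwarz inequality. The distance can be strictly smaller whenever some combination of individuals is closer to the target than any single individual.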

[Figure 2 panels (a)-(d): fitness (RMSE) versus generations (0 to 1000); curves: train, train (no mut), test, test (no mut).]

Figure 2: Training and test fitness for the considered test problems. Median calculated over 100 runs. (a) Airfoil dataset, (b) Slump, (c) Concrete, (d) Yacht.

[Figure 3 panels (a)-(d): distance versus generations (0 to 1000); curves: median, median-nomut.]

Figure 3: Distance between the convex hull and the global optimum. Median calculated over 100 runs. (a) Airfoil dataset, (b) Slump, (c) Concrete, (d) Yacht.

[Figure 4 panels (a)-(d): distance versus fitness (RMSE).]

Figure 4: Scatter plot showing the correlation between training fitness and the distance from the convex hull to the global optimum. (a) Airfoil dataset, (b) Slump, (c) Concrete, (d) Yacht.

6 CONCLUSIONS AND FUTURE WORK

This paper contains a study aimed at explaining the poor performance of geometric semantic crossover in geometric semantic genetic programming. We could informally explain our intuition as follows: since it creates offspring that lie on the segment joining the parents in the semantic space, geometric semantic crossover is only able to create individuals that lie inside the area defined by the individuals already existing in the population. If the target is very far from that area, geometric semantic crossover is not able to find it. To corroborate our hypothesis, we introduced a method to check whether a given point is contained in the convex hull or not, and a method to calculate the distance from a point to the convex hull. Using these notions, we have been able to experimentally support our interpretation of the poor performance of geometric semantic crossover. In the first part of our experimental study, we have considered four real-life symbolic regression applications and we have shown that a geometric semantic genetic programming system that uses only geometric semantic crossover was not able to evolve at all on those problems. In other words, the best fitness at the end of a run is comparable to the one that was found in the very first generations, both on the training and on the test set. At the same time, a geometric semantic genetic programming system that uses both geometric semantic crossover and geometric semantic mutation is able to evolve, steadily improving fitness until the end of the run, both on the training and test sets. This confirms a behaviour that had already been observed several times in the literature: geometric semantic crossover gives a practically null contribution to the evolution, while the most useful genetic operator of geometric semantic genetic programming is geometric semantic mutation. As a second step of our experimental analysis, we have studied the evolution of the distance from the convex hull defined by the current population to the target, both when geometric semantic crossover is the only genetic operator used and when it is used together with geometric semantic mutation. The presented results clearly show that when geometric semantic crossover is used in isolation, the distance from the convex hull to the target remains practically constant during the whole evolution, whereas when both operators are used it steadily decreases for the whole duration of the run. This explains the poor usefulness of crossover, corroborating our intuition: if (as in the studied test cases) the convex hull is rather far from the target, crossover is virtually useless, since it will never be able to generate a global optimum.

These findings pave the way for future work. In particular, we identify the possibility of searching for a set of individuals in the population to which geometric semantic mutation could be applied, in such a way that the new convex hull, obtained after this mutation, contains the target. This is a very important objective, but it would still use geometric semantic mutation as a crucial operator, thus confirming the idea that geometric semantic crossover, in isolation, is not effective.


ACKNOWLEDGEMENT

This research was partially supported by CONACYT Basic Science Research Project No. 178323, DGEST (México) Research Projects No. 5149.13-P and TIJ-ING-2012-110, TecNM (México) Research Projects 5414.14-P and 5621.15-P, as well as by the FP7-Marie Curie-IRSES 2013 European Commission program with project ACoBSEC with contract No. 612689, and also by BioISI R&D unit, UID/MULTI/04046/2013, funded by FCT/MCTES/PIDDAC, Portugal.

REFERENCES

Bonnans, J. F., Gilbert, J. C., Lemaréchal, C., and Sagastizábal, C. A. (2006). Numerical Optimization: Theoretical and Practical Aspects (Universitext). Springer-Verlag New York, Inc., Secaucus, NJ, USA.

Boyd, S. and Vandenberghe, L. (2004). Convex Optimization. Cambridge University Press, New York, NY, USA.

Castelli, M., Castaldi, D., Giordani, I., Silva, S., Vanneschi, L., Archetti, F., and Maccagnola, D. (2013a). An efficient implementation of geometric semantic genetic programming for anticoagulation level prediction in pharmacogenetics. In Progress in Artificial Intelligence, pages 78–89. Springer Berlin Heidelberg.

Castelli, M., Silva, S., and Vanneschi, L. (2015a). A C++ framework for geometric semantic genetic programming. Genetic Programming and Evolvable Machines, 16(1):73–81.

Castelli, M., Vanneschi, L., and Felice, M. D. (2015b). Forecasting short-term electricity consumption using a semantics-based genetic programming framework: The South Italy case. Energy Economics, 47:37–41.

Castelli, M., Vanneschi, L., and Popovič, A. (2015c). Parameter evaluation of geometric semantic genetic programming in pharmacokinetics. International Journal of Bio-Inspired Computation, pages 1–10. To appear.

Castelli, M., Vanneschi, L., and Silva, S. (2014). Prediction of the unified Parkinson's disease rating scale assessment using a genetic programming system with geometric semantic genetic operators. Expert Systems with Applications, 41(10):4608–4616.

de Berg, M., Cheong, O., van Kreveld, M., and Overmars, M. (2008). Computational geometry. In Computational Geometry, pages 1–17. Springer Berlin Heidelberg.

Koza, J. R. (1992). Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, USA.

Krawiec, K. and Lichocki, P. (2009). Approximating geometric crossover in semantic space. In GECCO '09: Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation, pages 987–994, Montreal. ACM.

Lichman, M. (2013). UCI machine learning repository.

Moraglio, A. (2011). Abstract convex evolutionary search. In Proceedings of the 11th Workshop Proceedings on Foundations of Genetic Algorithms, FOGA '11, pages 151–162, New York, NY, USA. ACM.

Moraglio, A., Krawiec, K., and Johnson, C. (2012). Geometric semantic genetic programming. In Parallel Problem Solving from Nature - PPSN XII, volume 7491 of Lecture Notes in Computer Science, pages 21–31. Springer Berlin Heidelberg.

Moraglio, A. and Mambrini, A. (2013). Runtime analysis of mutation-based geometric semantic genetic programming for basis functions regression. In Proceedings of the Annual International Conference on Genetic and Evolutionary Computation, GECCO '13, pages 989–996, New York, NY, USA. ACM.

Vanneschi, L., Castelli, M., Manzoni, L., and Silva, S. (2013). A new implementation of geometric semantic GP and its application to problems in pharmacokinetics. In Proceedings of the 16th European Conference on Genetic Programming, EuroGP 2013, volume 7831 of LNCS, pages 205–216, Vienna, Austria. Springer Verlag.

Vanneschi, L., Castelli, M., and Silva, S. (2014a). A survey of semantic methods in genetic programming. Genetic Programming and Evolvable Machines, 15(2):195–214.
