A QUALITATIVE MODEL OF THE INDEBTEDNESS FOR THE

SPANISH AUTONOMOUS REGIONS

Luis Jimenez, Juan Moreno-Garcia, Jose Jesus Castro-Schez, Victor R. Lopez, Jose Ba

nos

Universidad de Castilla-La Mancha

Departamento de Informatica, Departamento de Economia y Empresa

E.S.I. de Ciudad Real, E.U.I.T.I. de Toledo, F.CC.E. y E. de Albacete

Keywords:

approximate reasoning, linguistic models, economy models, fuzzy logic.

Abstract:

This work shows a fuzzy model of the indebtedness for the Spanish autonomous regions that is obtained using

approximate reasoning and induction methods. So, the algorithm ADRI (M. Delgado, 2001; Jimenez, 1997)

is used to induce a linguistic model composed by a set of fuzzy rules. The quality of this linguistic model will

be checked and its interpretation will be shown.

1 INTRODUCTION

Currently, the technology offers the possibility to

work efﬁciently with a lot of data. We can to use in-

duction methods to detect the relations between sev-

eral variables without previous hypothesis.

In this work we make use of an induction method

to study the indebtedness for the Spanish autonomous

regions from 1986 to 2000. Our aim will be to ﬁnd

the temporal relations among the indebtedness of a

region in a year t and the non ﬁnancial revenues and

the type of interest of that year and the indebtedness

of the region in the year t − 1. To do this, we have

a set of data owning to J. Ba

nos, P. Cillero and V.R.

opez (J. Ba

nos, 2001) that was obtained from several

reliable source, Spanisn National Bank, Ministry of

Finance and Statistic National Institute and Statistic

National Institute (INE).

Our intention is to obtain a model that, in one hand,

will be understandable without difﬁculty, and in the

other hand, can be directly developed from empir-

ical observation. Thus, we are interested in tech-

niques that obtain models given by using fuzzy rules

with linguistic variables (Zadeh, 1975; M. Sugeno,

1991). We suggested make use of the induction algo-

rithm ADRI (M. Delgado, 2001) in order to obtain the

model that describes the run of the Spanish regions in-

debtedness. This algorithm obtains a set of linguistic

rules from data.

The next section gives a background knowledge of

the induction algorithm ADRI. In section 3 the model

of the indebtedness for the Spanish autonomous re-

gions is obtained. After that, an interpretation of this

model is exposed in Section 4. Finally, the conclu-

sions are shown.

2 ADRI: AN INDUCTION

METHOD

ADRI is a generalization of the regression technique.

Firstly, we show some deﬁnitions used in this paper

and, after, we describe brieﬂy the ADRI induction

method.

Let S = {s

, s

. . . s

} be a set of data deﬁned for

the set of values that takes a set of variables X =

, X

. . . X

}, thus, s

= {x

, x

. . . x

, y

Let F be a function that only is known in the set

S, such that, F (s

) = y

. The aim of the regres-

sion methods, expressed by means of a parameterized

function F

, is to minimize the distance between the

real output value y

= F (s

) and its estimated value

using F

The regression methods are different to the classi-

ﬁcation methods. The regression methods consider

continuous output values instead of a set of cate-

gories in the classiﬁcation methods. From this point

of view, the classiﬁcation methods could be consid-

ered a particularization of the regression methods.

Methods based on the successive partitioning of the

problem domain (for example, the decision trees ID3

(Quilan, 1986)) are used like technique of regression

for restrictions in Classiﬁcation and regression trees

275

Jimenez L., Moreno-Garcia J., Jesus Castro-Schez J., R. Lopez V. and Baños J. (2004).

A QUALITATIVE MODEL OF THE INDEBTEDNESS FOR THE SPANISH AUTONOMOUS REGIONS.

In Proceedings of the Sixth International Conference on Enterprise Information Systems, pages 275-280

DOI: 10.5220/0002616502750280

 SciTePress

(CART) (J. Breiman, 1984).

CART is based on a sequence of questions and its

possible answers (structured in tree form) over the

variable values that deﬁne the problem. CART ob-

tains a separate divisions SR = {r

, r

. . . r

} of the

domain of each variable of the problem. It splits the

domains by means of the answers to the questions that

carries out. The algorithm used in this work gener-

alizes the divisions obtained in CART by means of

fuzzy logic (Zadeh, 1965).

In this work, we build the Classiﬁcation and Re-

gression Fuzzy Tree for the Spanish regions indebt-

edness problem and obtain the searched model given

by means of a set of fuzzy rules. It will be the param-

eterized function F

obtained by ADRI.

Now, we expose how the set of rules is obtained.

Let A

be a fuzzy set of the tree node T deﬁned over

the set of data S. The root node contains all data of

the set S, and its membership function is A

: S → 1.

The output associated to the node T is deﬁned using

the membership grade A

) of each data s

and the

output y

(equation 1).

(T ) =

i=1

)

∗ y

i=1

)

(1)

where A

(x) is calculated as min

j=1

)

and A

) is the membership function of a

fuzzy set deﬁned over the domain of the j-th vari-

able.

Equation 2 calculates the estimated error E(T ) of

a node T .

E(T ) =

i=1

(T ) − y)

∗ A

)

i=1

)

(2)

Now, our problem is how to establish a set of ques-

tions to divide the node T . These questions are car-

ried out for each variable, thus, a binary division of

the node T is obtained for each one of the variables.

We suppose binary division for the fuzzy set associ-

ated to the node T (by means of the fuzzy set A

the variable j), that is, p

= {B(x), C(x)} where

= B(x) + C(x). This partition creates two

new nodes and two new fuzzy sets associated to them

(Equations 3 and 4).

) = min(A

), B(x

)) (3)

) = min(A

), C(x

)) (4)

Thus, the obtained rules with this fuzzy sets are:

IF variable

IS A

THEN .....

ELSE

IF variable

IS A

THEN ..

Following the ADRI algorithm, we obtain the pro-

portion of the fuzzy sets B(x) and C(x) in relation to

the fuzzy set A

(Equations 5 and 6).

P (T

) =

i=1

B(x

)

i=1

)

(5)

P (T

) =

i=1

C(x

)

i=1

)

(6)

The quality of this partition is estimated with the

equation 7.

C(T, p

) = (E(T

)P (T

)) + (E(T

)P (T

)) (7)

where p

is the fuzzy partition of the j-th vari-

able.

The selected partition is the one that has the mini-

mum value C(T, p

). This technique generates a hier-

archical fuzzy partition in each one of the variables to

obtain the questions. This process of division obtains

the relevant variables to deﬁne the model. The mech-

anism of division is stopped when some condition is

veriﬁed. In our case, the condition is that the error is

less than a constant c (equation 8).

ERROR = max

T ∈

{E(T )} 6 c (8)

where

T is the set of tree leaves, thus, each

leaves is a fuzzy region of the model.

Equation 9 calculates the ﬁnal output value for each

input value s

(s) =

t∈T

(s)

∗ F

(T )

t∈T

(s)

(9)

Figure 1: Indebtedness for Spanish autonomous regions

from 1986 to 2000

The interested reader is referred to M. Delgado

(M. Delgado, 2001).

3 INDEBTEDNESS MODEL

We want to obtain a model that shows the indebt-

edness for the Spanish autonomous regions from

ICEIS 2004 - ARTIFICIAL INTELLIGENCE AND DECISION SUPPORT SYSTEMS

276

1986 to 2000 (Figure 1). The output variable (the

modelled variable) is the indebtedness for the Span-

ish autonomous region in the year t, and the input

variables are the region indebtedness of the previ-

ous year D/P IB(t − 1), the non ﬁnancial revenues

IN F/P IB(t) and the type of interest T (t). Thus,

the searched function F is shown in Equation 10.

F (D/P IB(t − 1), INF/P IB(t), T (t)) (10)

The fuzzy model F

is developed using the Clas-

siﬁcation and Regression fuzzy tree (ADRI). A fuzzy

rule consist of two components, the ﬁrst one is the an-

tecedent and the second one is the consequent. These

components show the cause-effect relation. The an-

tecedent is formed for aggregation of fuzzy proposi-

tions, such as X is A, where X is the input variable

and A is the fuzzy set deﬁned over its domain. In this

work, the membership function of the fuzzy sets are

trapezoidal functions (see Equation 11). This mem-

bership function is adequate for our problem due to it

is one of the most simple membership function, and

it is easy to construct from the fuzzy sets obtains by

using ADRI (ADRI is deﬁned over trapezoidal sets).

The change to other membership function not implies

substantial modiﬁcations.

A(x, a, b, c, d) =











0 x 6 a

x−a

b−a

a < x < b

1 b 6 x 6 c

d−x

d−c

c < x < d

0 x > d

(11)

Our aim is to deﬁne a system of linguistic rules,

since linguistic labels must be associated to each one

of the fuzzy sets in which is divided the domain of

each input variable (Tables 1, 2 and 3). Each row of

these tables deﬁne a linguistic value. The numbers

which appear in the ﬁrst column of the tables are the

four values that deﬁne the trapezoidal function of the

linguistic value given in its associated second column.

Thereby, we get a set of linguistic variables (Zadeh,

1975).

Table 1: Linguistic Variable D/PIB

[8.00, 8.48, MAX, MAX] Extremely Big

[6.45, 7.20, 8.00, 8.48] Very Big

[3.53, 4.50, 6.45, 7.20] Big

[1.50, 2.10, 3.53, 4.50] Norm

[0.80, 1.10, 1.50, 2.10] Small

[0.40, 0.60, 0.80, 1.10] Very Small

[MIN, MIN, 0.40, 0.60] Extremely Small

The model acquired by ADRI may be observed in

Table 4. Each row is a rule with EB is Extremely Big,

V B is Very Big, B is Big, N is Norm, S is Small,

Table 2: Linguistic Variable INF/PIB

[14.08, 15.62, MAX, MAX] Big

[8.68, 10.84, MAX, MAX] Norm or Big

[8.96, 10.84, 14.02, 15.62] Norm

[MIN, MIN, 8.68, 10.84] Small

Table 3: Linguistic Variable T

[11.05, 12, 49, MAX, MAX] Big

[8.22, 9.65, 11.05, 12.49] Norm

[MIN, MIN, 5.87, 9.65] Small

V S is Very Small and ES is Extremely Small. These

rules are arranged in a decreasing way according to

the percentage of data that are correctly classiﬁed by

the rule. The ﬁrst number indicates the index of the

rule, the following four are the values that the input

variable take, the sixth the rule error and the last per-

centage of instances that has been correctly classiﬁed.

Thus, the rule 10 covers the 7.4% of the data, it has a

error of 0.04%, and it could be read as:

IF D/P IB(t − 1) is Big AND IN F/P IB(t)

is Small AND T (t) is Small THEN

D/P IB(t) is 4.55

Now, let us look at how the model responds to a

situation. Table 5 shows how is carried out the infer-

ence from the data about Andalusia in 1993. Each

row of this table is a rule of the model (see Ta-

ble 4). The ﬁrst number indicates the index of the

rule, the following three are the values that the in-

put variables (D/P IB(t − 1), INF/P (t) and T (t))

take in each rule (value 1), the ﬁfth the minimum

membership grade for the input variables and the last

the value of the output variable. The real output is

D/P IB(t) = 6.6. The data about Andalusia, input

of the model, are 5.3, 18.96 and 10.17 for the input

variables D/P IB(t − 1), INF/P IB and T (t) re-

spectively. The output value obtained by the model is

D/P IB(t) = 6.23.

The cell (1, D/P IB(t − 1)) contains the mem-

bership grade of the value 5.3 to the fuzzy set

[0.80, 1.10, 1.50, 2.00] labelled as Small. The value

into the cell (9, min) represents the minimum value

of the membership of the input values set to the rule

9 (row 9). The cell value (9, D/P (T )) is the output

of the rule 9 multiplied by the cell value (9, min).

The output value is the weighted average of all out-

puts times the min column.

In Figures 2 to 16 we show the predictive perfor-

mance of the model. Each Figure presents the results

of a Spanish region, the real and obtained indebted-

ness (Y

axis) in each year (X axis). The real in-

debtedness is represented by a continuous line. The

A QUALITATIVE MODEL OF THE INDEBTEDNESS FOR THE SPANISH AUTONOMOUS REGIONS

277

Table 4: Obtained Model

N D(t-1) I(t) T(t) D(t) E C

13 ES 0.51 0.03 11.4

5 N N 3.43 0.09 10.7

12 VS 1.06 0.06 10.2

1 S 2.23 0.10 9.4

3 N S S 3.20 0.07 9.3

4 N B 3.18 0.09 8.8

10 B S S 4.55 0.04 7.4

7 EB 8.97 0.02 6.8

11 B N or B S 5.43 0.09 5.4

8 B S N 4.94 0.06 5.3

9 B N or B N 6.23 0.06 4.2

6 VB 7.76 0.05 4.0

14 N N or B S 3.28 0.06 3.7

2 B B 4.96 0.07 2.3

15 N B S 5.10 0.05 0.8

discontinuous line is the obtained indebtedness using

the presented method.

The Spanish regions indebtedness in the year 2001

have been inferred making use of the model (see Ta-

ble 5) and composed with the real values. In Table

6 we show the result obtained, where each row is a

Spanish region (A.R.),D(t − 1), I(t) and T (t) are the

input variables D/P IB(t − 1), INF/P (t) and T (t)

respectively, and Ob. D(t) and R. D(t) are the ob-

tained and real output variable D/P IB(t).

4 INTERPRETATION OF THE

MODEL

The obtained model allows to make an analysis of

the indebtedness for the Spanish autonomous regions

from 1986 to 2000. The aim of our analysis is double:

in one hand, ﬁnd the rules that deﬁne the most of the

study cases; and in the other hand, get the inﬂuence

of each one of the input variables in the indebtedness

prediction problem.

We can observe that the rules 13, 12, 1, 7 and 6

classify correctly the 11.4%, 10.23%, 9.38%, 6.78%

and 3.98% of the whole set of cases respectively, i.e.,

41,77%. On the other hand, we can also observe that

the output of the model (D/P IB(t)) in this rules only

depend on the value of the input variable D/P IB(t−

1). Thus, the behavior of the D/P IB value in time t

is very dependent of the value in time t − 1. It causes

an inertial behavior to the model, that is, when the

variable input D/P IB(t − 1) has small, medium or

big values the output value is small, medium or big.

The rules 5, 4 and 2 (whose antecedent depend

Table 5: Inference process for Andalusia

R D/P(t-1) INF/P(t) T(t) min D/P(T)

1 0 1 1 0 0

2 1 1 0 0 0

3 0 0 0 0 0

4 0 1 0 0 0

5 0 1 1 0 0

6 0 1 1 0 0

7 0 1 1 0 0

8 1 0 1 0 0

9 1 1 1 1 6.23

10 1 0 0 0 0

11 1 1 0 0 0

12 0 1 1 0 0

13 0 1 1 0 0

14 0 0 0 0 0

15 0 1 0 0 0

on the variables D/P IB(t − 1) and T (t)) clas-

sify correctly the 10.74%, 8.83% and 2.30% of the

whole set of cases, i.e., 21,87%. This means that

the D/P IB(t) value is corrected by the T (t) values

when the D/P IB(t) values are Norm or Big.

Finally, the INF/P IB(t) variable is used in the

36.36% of the data.

Thus, we deduce that the indebtedness for the

Spanish autonomous regions in a year t depend on

basically of the indebtedness for the Spanish au-

tonomous regions in a year t − 1, lightly corrected for

the type of interest and minorly for the non ﬁnancial

revenues.

5 CONCLUSION

This work presents a methodology to deﬁne mod-

els. This methodology don’t need knowledge a priori,

except in the variable deﬁnition. This methodology

uses induction technique to obtain the models that are

qualitative and are based on fuzzy logic.

To validate the methodology we obtain a linguis-

tic model of the indebtedness for the Spanish au-

tonomous regions. A brief analysis is doing to show

the behavior of the model. Of the obtained result we

conclude that the methodology obtains a good ﬁrst ap-

proximation to the searched model. The understand-

ing of the model is possible because the model is qual-

itative (linguistic labels).

ICEIS 2004 - ARTIFICIAL INTELLIGENCE AND DECISION SUPPORT SYSTEMS

278

Table 6: Indebtedness values obtained by the model for each

Spanish region in the year 2001

Ob. R.

A.R. D(t-1) I(t) T(t) D(t) D(t)

An. 8.58 20.44 4.5 8.9668 8.8213

Ar. 4.88 10.23 4.5 5.1806 4.8286

As. 4.05 7.10 4.5 3.9534 3.9212

B. 1.85 5.93 4.5 2.8687 1,6450

Cn. 3.30 16.47 4.5 5.1041 3.3405

Ct. 3.08 10.24 4.5 3.2585 2.9857

CL. 2.95 13.12 4.5 3.2837 2.9152

CM. 2.93 12.71 4.5 3.2837 2.8815

Cat. 8.70 11.11 4.5 8.9668 8.7626

CV. 9.58 12.69 4.5 8.9668 9.6902

E. 5.83 15.17 4.5 5.426 5.9000

G. 8.68 19.11 4.5 8.9668 8.8971

M. 4.48 6.36 4.5 4.4296 4.2426

Mu. 4.30 10.38 4.5 4.758 4.3415

R. 2.93 8.91 4.5 3.1953 2.8055

ACKNOWLEDGEMENTS

This work has been funded by the Spanish Ministry of

Science and Technology and Junta de Comunidades

de Castilla-La Mancha under Research Projects ”DI-

MOCLUST” TIC2003-08807-C02-02 and PREDA-

COM PBC-03-004.

REFERENCES

J. Ba

nos, P. Cillero, V. L. (2001). Un modelo de endeu-

damiento economico. Presupuesto y gasto publico, n

26, 2001, 161-172.

J. Breiman, J. Friedman, R. O. S. (1984). Classiﬁcation and

regression tree. Monterey, Ca:Wadsworth.

Jimenez, L. (1997). Modelizacion difusa de sistemas me-

diante tecnicas inductivas. Tesis Doctoral, ETSI Uni-

versidad de Granada.

M. Delgado, A.F. Skarmeta, L. J. (2001). A regression

methodology to induce a fuzzy Model. International

Journal of Intelligent Systems Vol 16, n

2, 169-190,

February.

M. Sugeno, T. Y. (1991). A fuzzy-logic based approach to

qualitative modelling. IEEE Trans Sys Man Cyber,

vol 1, 7-31.

Quilan, J. (1986). Induction of decision tree. Machine

Learning, vol 1, 81-106.

Zadeh, L. (1965). Fuzzy sets. Inform Control, 338-353.

Zadeh, L. (1975). The concept of linguistic variable and its

applications to approximate reasoning part I,II and

III. Inform Sci vol 8 and 9, 199-249 301-357 43-80.

Figure 2: Andalusia Indebtedness (An.)

Figure 3: Aragon indebtedness (Ar.)

Figure 4: Asturias Indebtedness (As.)

Figure 5: Balearic Islands Indebtedness (B.)

Figure 6: Valencian Community Indebtedness (C.V.)

A QUALITATIVE MODEL OF THE INDEBTEDNESS FOR THE SPANISH AUTONOMOUS REGIONS

279

Figure 7: Canary Isles Indebtedness (Cn.)

Figure 8: Cantabria Indebtedness (Ct.)

Figure 9: Castile and Leon Indebtedness (CL.)

Figure 10: Catalonia Indebtedness (Cat.)

Figure 11: Castile and La Mancha Indebtedness (CM.)

Figure 12: Extremadura Indebtedness (E.)

Figure 13: Galicia Indebtedness (G.)

Figure 14: Madrid Indebtedness (M.)

Figure 15: Murcia Indebtedness (Mu.)

Figure 16: La Rioja Indebtedness (R.)

ICEIS 2004 - ARTIFICIAL INTELLIGENCE AND DECISION SUPPORT SYSTEMS

280