Usage of Positional Representations in Tasks of Revealing Logical Regularities

Yury Laptin, Alexander Vinogradov and Yury Zhuravlev

Glushkov Institute of Cybernetics of the Ukrainian National Academy of Sciences

Academician Glushkov pr. 40, 03680 Kiev, Ukraine

Dorodnicyn Computing Centre of the Russian Academy of Sciences

Vavilov str. 40, 119333 Moscow, Russian Federation

Abstract. A new approach to describing clusters as sets of logical regularities is developed. Special attention is paid to detecting steady interrelations in data that hold for considerable shares of the sample. The approach is based on fast bit operations and the positional method of data representation. A new criterion of adjacency is proposed for high-level points of the representation, and it is then used to assemble such points into maximal hyper-parallelepipeds corresponding to the best regularities. The method can be used as a preliminary step in various tasks related to the search for essential logical regularities and the substantive interpretation of data.

1 Introduction

Numerous approaches to extracting knowledge from numerical data samples have come into use recently. Special attention is paid to methods that detect steady interrelations in data that hold for considerable shares of the sample. In the area of image processing (IP), widely used methods fix in advance some hierarchical division of the set of objects into blocks of increasing size. Interrelations of specific types appear at the various levels of the hierarchy and can be processed by specialized procedures. It is supposed that the most important laws can be found at the top levels and used further. Such approaches include various coarse-to-fine transformations, scale-spaces, multiresolution representations, space-filling curves, and quad- and octrees [1] [2]. Positional representation is among them, too [3].

For all the advantages of the methods mentioned, some obvious shortcomings can be noted. Committing to a concrete hierarchy is a rather severe constraint that does not always correspond to the nature of the data. In some cases the shortcomings are also connected with spatial restrictions that originate from the borders of an image and from the direction of the basic axes. As is known, problems of transformation and conformity between different quadtree or octree representations of the same object are among the main obstacles to wider use of these representations. Many of the noted shortcomings are overcome in approaches that use the language of Logical Regularities (LR) to describe clusters in recognition tasks [4] [5]. This work considers some possibilities of using positional data representation to search for a description of the classes and for the decision rule in terms of LR.

Laptin Y., Vinogradov A. and Zhuravlev Y.
Usage of Positional Representations in Tasks of Revealing Logical Regularities.
DOI: 10.5220/0002963501000104
In Proceedings of the Third International Workshop on Image Mining Theory and Applications (VISIGRAPP 2010), page
ISBN: 978-989-674-030-6

2 Logical Regularities and Positional Coordinates

The standard statement of the recognition problem in $\mathbb{R}^N$ for K classes is considered. An exact solution of the problem results in a segmentation of the whole space into K subareas. Let $X_k \subset \mathbb{R}^N$ be one of the segments, corresponding to the class k. A logical regularity $LR_i$ is represented by a conjunction of the kind $LR_i = \bigwedge_j R_{ij}$, where each condition $R_{ij}$ is a pair of inequalities $A_{ij} < x_j < B_{ij}$. To the conjunction $LR_i$ there corresponds a hyper-parallelepiped in $X_k$. Some disjunction $\bigvee_i LR_i$ can provide a covering of the segment of the class k. The covering can possess various properties depending on the choice of $A_{ij}$, $B_{ij}$. The formal representation uses ordinary logic, and the form of the condition $R_{ij}$ has a simple intuitive sense. Automatic search for the representation $\bigvee_i \bigwedge_j R_{ij}$ yields a list of LR that is sufficient for a class. Logical regularities are presented in a form that is easy for a human to grasp, and they can contain important new information. The fewer indices i and j are used in $\bigvee_i \bigwedge_j R_{ij}$, the bigger the average share of the sample covered by each hyper-parallelepiped, and the more important the law it represents. Whatever the origin of an exact solution of a recognition problem, one can try to transform it into an appropriate representation in terms of LR and to use the advantage noted above in substantive interpretation.
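
As an illustration only, the conjunction and disjunction described above can be sketched in Python. The names `satisfies` and `covered`, and the encoding of a feature that takes no part in a regularity by infinite bounds, are assumptions of this sketch and not part of the original method.

```python
from typing import List, Sequence, Tuple

# One (A_j, B_j) pair per feature j; a hypothetical encoding of one LR.
Box = List[Tuple[float, float]]

def satisfies(x: Sequence[float], box: Box) -> bool:
    """Conjunction: every coordinate lies strictly inside its interval."""
    return all(a < xj < b for xj, (a, b) in zip(x, box))

def covered(x: Sequence[float], boxes: Sequence[Box]) -> bool:
    """Disjunction: the point satisfies at least one regularity."""
    return any(satisfies(x, box) for box in boxes)

# A regularity that constrains only the first of two features.
unbounded = (float("-inf"), float("inf"))
example_box = [(0.0, 1.0), unbounded]
print(satisfies((0.5, 100.0), example_box))
```

A regularity that leaves a feature unbounded uses fewer conditions, which is the situation the text associates with a larger covered share of the sample.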

We will build this transformation by means of positional data representation. Positional representation implies setting some discrete grid $D^N \subset \mathbb{R}^N$ that is common for all segments, where $|D| = 2^d$. For grid points $x = (x_1, x_2, \dots, x_N)$, conversion to positional representation is an efficiently performed transformation on bit slices in $D^N$: the m-th bit of the binary representation of the number $x_n$ becomes the p(n)-th bit of the binary representation of the m-th digit in the $2^N$-ary representation of the value that represents the vector as a whole. Here $m \le d$, and the function p(n) defines some permutation on the set $\{1, 2, \dots, N\}$. The result is a linearly ordered scale S of length $2^{dN}$ that represents all points of the grid one-to-one in the form of a curve filling the space $D^N$ densely. We consider Z-scanning of the grid, where p(n) = n. With other types of scanning p(n) a smoother filling of the grid can be achieved, for example in the form of a Peano curve.
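
The bit-slice conversion for Z-scanning, p(n) = n, can be sketched as follows. This is a minimal illustration under stated assumptions: coordinates are non-negative integers below $2^d$, axes and digits are counted from 0 in the code (from 1 in the text), and the function names `to_scale` and `from_scale` are hypothetical.

```python
from typing import Sequence, Tuple

def to_scale(x: Sequence[int], d: int) -> int:
    """Interleave bits: bit m of coordinate n becomes bit n of digit m."""
    N = len(x)
    s = 0
    for m in range(d):              # index of the 2^N-ary digit
        for n in range(N):          # index of the axis
            bit = (x[n] >> m) & 1
            s |= bit << (m * N + n)
    return s

def from_scale(s: int, N: int, d: int) -> Tuple[int, ...]:
    """Inverse conversion: recover the N coordinates from a scale position."""
    x = [0] * N
    for m in range(d):
        for n in range(N):
            x[n] |= ((s >> (m * N + n)) & 1) << m
    return tuple(x)

# Round trip for one point of an 8 x 8 grid (N = 2, d = 3).
position = to_scale((3, 5), 3)
print(position, from_scale(position, 2, 3))
```

The loops are written for clarity; a production version would typically use precomputed masks or lookup tables for the interleaving.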

3 Searching Maximal Hyper-parallelepipeds in Positional Representation of Segments

For a given digitization $D^N$, all segments $X_k \subset \mathbb{R}^N$ corresponding to the classes k are found as solutions of the recognition task, and together they define a K-valued function f on the scale S. As is known, the m-th digit in the $2^N$-ary positional representation corresponds to some N-dimensional cube of size $2^{(m-1)N}$. We will call such a cube an m-point. We will search for situations in which a segment of some class k contains homogeneous cubes that can be united into a hyper-parallelepiped of bigger volume. We will see how this can be done by studying the structure of the function f.

First, let us select all segments of the scale S on which the value of f does not vary. We will check the points of S sequentially and build a list f containing records of the sort $(b_1 \dots b_{d-m+1}, k)$. Here the binary sequence $(b_1 \dots b_{d-m+1})$ consists of d − m + 1 bits, and the last bit ends the positional representation of some m-point as a whole. In this code we will represent in the list f all detected cubes with homogeneous filling k.

Lemma 1. In a linear search over the points of S, the transition from one m-point to another occurs when all low-order digits in the $2^N$-ary positional representation of the current point become zero.

The proof follows immediately from the way the list f is constructed.

Obviously, the list f is filled in one pass over the scale S. For such filling it is enough to watch for the moments when all $2^N$-ary digits lower than m become zero simultaneously.
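
The collection of homogeneous m-points can be sketched as below. Note the substitution: for brevity this sketch uses a recursive subdivision of the scale rather than the single linear pass described in the text, although it emits the records in the same scale order. It assumes f is given as a Python sequence of class labels of length $2^{dN}$ indexed by scale position. Each record stores the start position of the cube instead of its digit prefix (the prefix is `start >> ((m - 1) * N)`). The function name `homogeneous_points` is hypothetical.

```python
from typing import List, Sequence, Tuple

def homogeneous_points(f: Sequence[int], N: int, d: int) -> List[Tuple[int, int, int]]:
    """Return records (start, m, k) of maximal aligned homogeneous cubes."""
    if len(f) != 1 << (d * N):
        raise ValueError("f must list a label for every point of the scale")
    records: List[Tuple[int, int, int]] = []

    def visit(start: int, m: int) -> None:
        size = 1 << ((m - 1) * N)       # volume of an m-point
        block = f[start:start + size]
        if all(v == block[0] for v in block):
            records.append((start, m, block[0]))
            return
        child = size >> N               # m >= 2 here: a 1-point is always homogeneous
        for c in range(1 << N):
            visit(start + c * child, m - 1)

    visit(0, d + 1)                     # the whole grid is a (d + 1)-point
    return records

# One-dimensional example with four grid points (N = 1, d = 2).
print(homogeneous_points([1, 1, 2, 3], N=1, d=2))
```

Every cube boundary visited by the recursion is a scale position whose low-order $2^N$-ary digits are all zero, which is the situation singled out in Lemma 1.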

The description of segments obtained in the list f already has a structure of the type $\bigvee_i \bigwedge_j R_{ij}$. However, this representation is not the best, because it does not yet meet the requirements of Section 1. In particular, it is redundant: some N-dimensional cubes corresponding to different m-points of the positional representation can be united into a common hyper-parallelepiped and can therefore be represented by a single logical regularity $LR_i$. We will call two cubes of identical size adjacent if they adjoin along a common (N − 1)-dimensional edge. An (N − 1)-dimensional edge orthogonal to the axis n will be called an n-edge. We will use the following criterion for uniting a pair of m-points that belong to the segment k.

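Lemma 2. Two m-points C1, C2 are adjacent on an n-edge iff: 1) there is an m′-point C with m′ > m that contains both of them; 2) the record on C1 precedes the record on C2 in the list f; 3) in the binary records of the $2^N$-ary digits m, m + 1, ..., m′ − 1, the bits with number n have in the record for C1 (C2) from the list f the values 1 (respectively, 0), except in the highest of these digits, m′ − 1, which separates the two (m′ − 1)-sub-cubes of C and where, by condition 2, the values are reversed; 4) all other bits in the records for C1, C2 in the list f coincide.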

Proof. If there is a digit m′ > m with the specified properties, then the volumes of the cubes C1, C2 coincide, and in the corresponding m′-cube C all m-points in whose $2^N$-ary digits only the n-th bits can vary make up a uniform hyper-parallelepiped of length $2^{(m'-1)}$. As Z-scanning is used, C1 takes the highest position in its (m′ − 1)-sub-cube, and C2 takes the lowest position in some other (m′ − 1)-sub-cube. But C1 precedes C2 in the list f; therefore, they are located in the middle of the hyper-parallelepiped and consequently are adjacent.

Conversely, any two sub-cubes C1, C2 of equal volume from $D^N$ are located on S without intersections; therefore, one of them is placed in the list f earlier. Let it be C1. If C1, C2 are adjacent on an n-edge, there is a minimum m′ such that some m′-cube C contains C1, C2, but no (m′ − 1)-sub-cube contains them both. Therefore, each of them belongs to its own unique (m′ − 1)-sub-cube. But C1, C2 are adjacent, and so must be their own (m′ − 1)-sub-cubes, since any two sub-cubes of identical dimension in positional representation either are adjacent or do not intersect. Repeating this reasoning for all $2^N$-ary digits down to m allows us to conclude that all bits in the binary representation of the $2^N$-ary digits m, m + 1, ..., m′ − 1 in the records for C1, C2 in the list f coincide, except the bits with number n. It is obvious that changing the value of the n-th binary bit in any $2^N$-ary digit from 0 to 1 moves the corresponding sub-cube along the axis n (and backward when 1 is changed to 0). For C1 within its own (m′ − 1)-sub-cube (and within any sub-cube of smaller size down to m) it is impossible to make such a change. It means that in the record for C1 in the list f all these bits have the value 1 (respectively, 0 for C2). Lemma 2 is proved.
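
A bitwise adjacency test in the spirit of Lemma 2 can be sketched as follows. The sketch rests on one reading of the lemma, suggested by the sub-cube argument of the proof: in the highest digit in which the two records differ, bit n is 0 for C1 and 1 for C2, while in all lower digits down to m it is 1 for C1 and 0 for C2. Cubes are identified by their start positions on the scale, the axis index n is counted from 0, and the function name `adjacent_on_axis` is hypothetical.

```python
def adjacent_on_axis(s1: int, s2: int, m: int, n: int, N: int, d: int) -> bool:
    """True if the m-point at s1 precedes and adjoins the m-point at s2 along axis n."""
    if s1 >= s2:                                   # condition 2: C1 precedes C2
        return False
    digit_mask = (1 << N) - 1
    # 2^N-ary digits m..d of both records, lowest digit first.
    digits1 = [(s1 >> (j * N)) & digit_mask for j in range(m - 1, d)]
    digits2 = [(s2 >> (j * N)) & digit_mask for j in range(m - 1, d)]
    diff = [a ^ b for a, b in zip(digits1, digits2)]
    differing = [j for j, x in enumerate(diff) if x]
    if not differing:
        return False
    bit_n = 1 << n
    if any(x not in (0, bit_n) for x in diff):     # condition 4: only bit n may differ
        return False
    top = differing[-1]                            # digit separating the two sub-cubes
    # Below the top digit, bit n must be 1 for C1 and 0 for C2 (condition 3).
    for j in range(top):
        if not (digits1[j] & bit_n) or (digits2[j] & bit_n):
            return False
    # At the top digit, s1 < s2 already forces bit n to be 0 for C1 and 1 for C2.
    return True

# Two 1-points of an 8 x 8 grid with x = 3 and x = 4 (y = 0): scale positions 5 and 16.
print(adjacent_on_axis(5, 16, m=1, n=0, N=2, d=3))
```

The test uses only shifts, masks and XOR on the scale positions, so no coordinates need to be decoded.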

Fig. 1. An example for the case N = 2, K = 3, d = 3. Adjacency of two gray 1-points (k = 1) is revealed for a 1-edge by considering the left-bottom 3-point; adjacency of two black 1-points (k = 2) is revealed for a 1-edge by considering the whole grid (a 4-point); adjacency of two light-gray 2-points (k = 3) is revealed for a 2-edge by considering the right-bottom 3-point.

Using the given criterion, it is possible to build hyper-parallelepipeds of increasing size. Gathering hyper-parallelepipeds of maximal volume results in an exact description of the segments of the sample in terms of LR, of the sort $\bigvee_i \bigwedge_j R_{ij}$.
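
One assembling step can be sketched as follows: a cube is converted into per-axis index ranges, and two boxes that share a full face are united. This is a sketch under assumptions and not the authors' procedure. Ranges are half-open intervals of grid indices, so turning them into the strict inequalities of an LR depends on the chosen scaling between grid and feature values. Axes are counted from 0, and the names `cube_bounds` and `merge_on_axis` are hypothetical.

```python
from typing import List, Optional, Tuple

Ranges = List[Tuple[int, int]]  # one half-open (lo, hi) index range per axis

def cube_bounds(start: int, m: int, N: int) -> Ranges:
    """Index ranges of the m-point that starts at scale position `start`."""
    side = 1 << (m - 1)
    lo = [0] * N
    s, bit = start, 0
    while s:                        # de-interleave the Z-order position
        for n in range(N):
            lo[n] |= (s & 1) << bit
            s >>= 1
        bit += 1
    return [(l, l + side) for l in lo]

def merge_on_axis(box1: Ranges, box2: Ranges, n: int) -> Optional[Ranges]:
    """Union of two boxes that abut along axis n, or None if they do not."""
    same_elsewhere = all(box1[j] == box2[j] for j in range(len(box1)) if j != n)
    if same_elsewhere and box1[n][1] == box2[n][0]:
        merged = list(box1)
        merged[n] = (box1[n][0], box2[n][1])
        return merged
    return None

# Two adjacent 1-points of a two-dimensional grid become one 2 x 1 box.
print(merge_on_axis(cube_bounds(1, 1, 2), cube_bounds(4, 1, 2), 0))
```

Repeating such merges until no pair of boxes can be united is one simple way to grow boxes; it does not by itself guarantee maximal volume.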

4 Discussion

It should be recalled that this construction depends essentially on the properties of the digitization grid $D^N$. In particular, the value $|D^N| = 2^{dN}$ can serve as a natural estimate of the length of the scale S that makes the approach applicable to data processing on a PC. Though bit conversions are performed efficiently, the representation obtained in this way can differ considerably from the solutions found by more exact (and slower) methods. The reason is that some features of the boundaries of the segments cannot be represented adequately on a discrete grid. Nevertheless, if some LR is found in the exact solution, the presented method reveals some logical regularity LR′ as its counterpart on the grid, and vice versa. On the basis of all that was said above and proved in Lemmas 1 and 2, it is possible to assert that LR and LR′ will differ from each other only in the parameters $A_{ij}$, $B_{ij}$, and by no more than the step of digitization (and cuts of empty regions on the boundaries of the grid $D^N$).

5 Conclusions

A new approach is developed for the task of mining new knowledge from a data sample in the form of logical regularities. The main attention is paid to the search for maximal logical regularities that are valid for essential shares of the sample. The approach is based on fast bit operations that convert data into positional representation and search for blocks in it. In such a representation all sample elements receive hierarchically ordered positions on a common linear scale where the main relations of closeness are preserved. A new method is developed for describing the structure of classes in the form of a restricted list in which only important elements from various levels of the hierarchy are present. A criterion of adjacency for pairs of such elements is developed and then used in the procedure of combining these elements into hyper-parallelepipeds of maximal size, which correspond to the most essential logical regularities.

The approach can be used as a preliminary step in solving various tasks related to the search for logical regularities and the substantive interpretation of data. It can find application in data processing and mining in 2D and 3D, and also in the search for logical regularities in abstract feature spaces of higher dimension.

Acknowledgements

This work was done in the framework of the Joint project of the National Academy of Sciences of Ukraine and the Russian Foundation for Basic Research No 08-01-90427 'Methods of automatic intellectual data analysis in tasks of recognition objects with complex relations'.

References

1. Mark de Berg, Marc van Kreveld, Mark Overmars, and Otfried Schwarzkopf: Computational Geometry (2nd revised ed.). Springer-Verlag (2000) 506 pages.

2. T. Lindeberg: Scale-Space Theory in Computer Vision. Kluwer Academic Publishers (1994) 440 pages.

3. Aleksandrov V.V., Gorskiy N.D.: Algoritmy i programmy strukturnogo metoda obrabotki dannykh. L. Nauka (1983) 208 str. (in Russian)

4. Yu.I. Zhuravlev, V.V. Ryazanov, O.V. Senko: RASPOZNAVANIE. Matematicheskie metody. Programmnaya sistema. Prakticheskie primeneniya. Izdatelstvo "FAZIS", Moscow (2006) 168 str. (in Russian)

5. Ryazanov V.V.: Logicheskie zakonomernosti v zadachakh raspoznavaniya (parametricheskiy podkhod). Zhurnal vychislitelnoy matematiki i matematicheskoy fiziki, T. 47/10 (2007) str. 1793-1809. (in Russian)
