Ontology-based Framework to Image Mining

Sara Colantonio, I. Gurevich, Gabriele Pieri, Ovidio Salvetti and Yulia Trusova

Institute of Information Science and Technologies, Italian National Research Council

via G. Moruzzi 1, 56124 Pisa, Italy

Dorodnicyn Computing Centre of the Russian Academy of Sciences,

40 Vavilov str., 119333 Moscow, Russian Federation

Abstract. A novel knowledge-based approach for supporting image processing and analysis is presented, together with its use within a framework for image mining. Modern approaches to knowledge representation, ontologies and reasoning are combined with techniques for image processing, analysis and understanding within a semantic framework able to support the extraction of novel knowledge from image collections.

1 Introduction

Due to the pervasive diffusion of imagery data and their central role in many key problems of socially and industrially relevant domains, the need for automated applications able to support image analysis tasks has attracted increasing interest and effort from the research community over the last decades. Furthermore, the possibility of using large image collections to extract novel, relevant and significant knowledge for solving specific tasks has been shown to add even more value to image processing applications.

Usually, image processing (IP) specialists address each specific problem they are asked to solve by integrating their knowledge of image processing and analysis techniques with the necessary domain knowledge, acquired by elicitation from domain experts and by analysing all the processes related to image formation, acquisition and interpretation. Once the problem is understood, IP expertise is employed to find the techniques best suited to the kind of images and the problem at hand. This usually results in a multi-step procedure devoted to solving commonly identified sub-problems, which correspond to main IP issues such as image enhancement, extraction and analysis of relevant structures, and content categorization and interpretation. The results of this IP chain can be passed as input to an image mining process and thus feed a virtuous loop of knowledge representation and extraction.

In recent years, considerable effort has been spent on defining general-purpose computerized applications able to interpret image content automatically, but very little has been done to aid, at a high level, the development of IP applications by systematically defining IP processes and recording them for re-use, evaluation and integration. Indeed, a formal and sound description of IP algorithms can help to build applications able to support non-expert users in choosing the correct algorithms and/or procedures to apply to their particular image instance.

Colantonio S., Gurevich I., Pieri G., Salvetti O. and Trusova Y. (2009). Ontology-based Framework to Image Mining. In Proceedings of the 2nd International Workshop on Image Mining Theory and Applications, pages 11-19. DOI: 10.5220/0001962600110019. SciTePress.

There are a number of reasons why a clear, formal description of the processes, algorithms or methods applied to images can be useful, if not necessary. More precisely, clear definitions of algorithms, with explicit references to the problem they solve, the data they manipulate and the parameters they require, can be helpful for building:

• a library or catalogue of IP algorithms suitable for re-use through storage, retrieval and sharing mechanisms, e.g., the formal definition can be easily exploited for automated retrieval of algorithms that satisfy expressed requirements;
• a repository of developed procedures, together with the problems they address, which can be used as references for similar cases, e.g., via case-based reasoning;
• a framework able to support the development of IP applications by suggesting the most suitable algorithms for solving a specific problem. Suggestions can be obtained by reasoning on both the syntactic features (e.g., input types and parameters) and the semantic features (e.g., constraints and requirements or a high-level description of the results);
• a framework for knowledge extraction able to integrate a library of data mining algorithms tuned to image applications.

In the most ambitious vision, algorithms and information about their applications should be maintained in an appropriate Knowledge Base, which formalizes the expertise of the IP domain.

Ontologies have emerged in recent years as a knowledge representation formalism. They specify reusable conceptualizations which can be shared by multiple reasoning components communicating during a problem-solving process.

So far, a variety of methodologies and algorithmic resources have been designed

and developed to solve particular tasks, focusing on the specific application problem,

but attempts to standardize different approaches and methodologies are still rare.

In this paper, an ontology-based framework for image analysis is presented and its extension to image mining tasks is discussed. The approach combines techniques for image processing, analysis and understanding with modern approaches to knowledge representation, ontologies and reasoning, to support intelligent decision making in image understanding tasks. The paper is organized as follows. Section 2 gives an overview of works devoted to the use of ontologies for solving image-based tasks. Section 3 presents the basic ideas of the ontology-based approach. Section 4 describes the ontology on image analysis. Section 5 concludes and discusses directions of future work.

2 Related Works

Ontologies have become very popular in recent years as an effective means of knowledge representation, and several works on the use of ontologies for solving image-based tasks have been reported. For example, in [3] an approach for solving the symbol grounding problem involved in semantic image interpretation is presented. The method uses an image processing ontology to reduce the gap between the image processing level and the visual level. The authors note that the proposed ontology is not complete and should be considered as a basis for further extension. In [5] a platform dedicated to knowledge extraction and management for image processing applications is proposed. It includes a system that automatically generates image processing applications on the basis of goal formulations given by a user who is inexperienced in the image processing domain. The user defines the goal of processing in terms of his/her application domain, and the system translates this information into image processing terms taken from the image processing ontology. The result of this translation is an image processing request, which is sent to the planning system to generate the program that responds to this request.

The main contribution of our work is the development of a sufficiently detailed and well-structured ontology intended to cover all important aspects of image processing, analysis and understanding (main categories of concepts, their properties and relations). The proposed ontology can be used as a basis for the construction of specialized knowledge bases for supporting image analysis and, subsequently, image mining.

3 Ontology-based Image Analysis

In solving problems of image analysis, one must make complex decisions at different

levels of processing. To obtain the required solution, usually, several processes and

stages of processing should be combined. At each stage, the problem of choosing the

most appropriate method and specification of its parameters may arise.

The automation of image analysis assumes that researchers and users of different qualifications have at their disposal not only a standardized technology of automation, but also a system supporting this technology, which accumulates and uses knowledge on image processing, analysis and evaluation and provides adequate structural and functional capabilities for supporting a more intelligent choice and synthesis of methods and algorithms.

The automated system (AS) for image analysis must provide a formal and precise representation of the expertise of the IP specialist and include tools for emulating the choice strategies and applying the known processing methods used by specialists in solving such problems. The AS must combine the capabilities of an instrumental environment for image processing and analysis with those of a knowledge-based system. Therefore, one of its main components is a knowledge base. Knowledge bases usually contain modules of universal knowledge, which are not related to any subject domain (knowledge necessary for scheduling and control of the processing, result mappings, estimation of the processing quality, object recognition, and conflict resolution, as well as knowledge about methods of image processing and analysis) and knowledge modules related to a certain subject domain (segmentation strategies, object descriptions, and specialized strategies for feature extraction and object identification). The AS must provide a software implementation of the hierarchies of classes of the main objects used in image analysis, have a specialized user interface, contain a library of algorithms that allow one to solve the main problems of image analysis and understanding with the help of efficient computational procedures, and provide accumulation and structuring of knowledge and experience in the area of image analysis and understanding.

The need for efficient knowledge representation facilities can be met by using a suite of ontologies and thesauri. Ontology-based knowledge representation provides: 1) an explicit formal description of semantics; 2) a shared understanding of a given domain; 3) re-use of knowledge. Ontologies can be considered as the skeleton of knowledge bases for supporting image analysis. Thesauri can help users to create requests to the AS: they can assist in choosing appropriate keywords for specifying the goal to be achieved, the data to be processed and the results to be obtained (see Fig. 1). A more detailed description of the proposed approach can be found in our previous work [1].

Fig. 1. Ontology-based methodology.

4 Image Analysis Ontology

The Image Analysis Ontology (IAO) is needed for solving the following tasks: 1) construction of a unified description and representation of image-based tasks and of methods for solving these tasks; 2) automation of the combination of image analysis methods on the basis of semantic integration; 3) automation of navigation and retrieval in knowledge bases on image analysis.

The current version of the IAO is described below.

4.1 Scope and Sources

The IAO is aimed at representing domain independent knowledge used for solving

image processing and analysis tasks. The IAO codifies:

• knowledge about general image-based tasks and their decomposition into

sub-tasks;

• knowledge about methods (approaches, algorithms, techniques, operators,

etc.) for image processing, analysis, recognition and understanding.

The first step of the ontology development process is to define the main classes of concepts of a given domain. The Image Analysis Thesaurus (IAT) [2] is used as the main source of information about concepts (including term definitions and basic relationships between terms). The IAT was initially developed at the Scientific Council “Cybernetics” of the Russian Academy of Sciences and later elaborated at the Dorodnicyn Computing Centre of the Russian Academy of Sciences. It contains more than 2000 terms related to image processing, analysis and recognition. The IAT reflects the current state of the domain, and information about new concepts is added regularly.

4.2 Tools and Languages

The Web Ontology Language (OWL) [6] has been chosen to build the IAO. Today OWL is one of the most commonly used formal languages for ontology description. OWL has more facilities for expressing meaning and semantics than XML, RDF, and RDF-S. OWL is intended to provide a language that can be used to:

• formalize a domain by defining classes and properties of those classes,
• define domains and ranges for properties,
• define individuals and assert properties about them,
• reason about these classes and individuals.
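The four capabilities listed above can be illustrated without any OWL tooling. The following Python sketch is only an illustration under stated assumptions: statements are stored as plain (subject, predicate, object) tuples rather than real OWL, the helper functions are hypothetical, the individuals `smooth_scan_1` and `gaussian_smoothing_1` are invented for the example, and placing SmoothingTask and SmoothingMethod directly under Task and Method is an assumption. The class and property names are borrowed from the IAO examples given in Sections 4.3 and 4.4.

```python
# Toy triple store: an illustration only, not real OWL and not an OWL reasoner.
# The individuals and the exact position of the Smoothing* classes are assumptions.

TRIPLES = {
    # 1) formalize a domain by defining classes
    ("Task", "subClassOf", "Thing"),
    ("Method", "subClassOf", "Thing"),
    ("SmoothingTask", "subClassOf", "Task"),
    ("SmoothingMethod", "subClassOf", "Method"),
    # 2) define a domain and a range for a property
    ("is_solved_by", "domain", "SmoothingTask"),
    ("is_solved_by", "range", "SmoothingMethod"),
    # 3) define individuals and assert properties about them
    ("smooth_scan_1", "type", "SmoothingTask"),
    ("gaussian_smoothing_1", "type", "SmoothingMethod"),
    ("smooth_scan_1", "is_solved_by", "gaussian_smoothing_1"),
}


def superclasses(cls, triples):
    """Return cls together with every class reachable through subClassOf."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def inferred_types(individual, triples):
    """4) reason: an individual belongs to its asserted classes and their superclasses."""
    types = set()
    for s, p, o in triples:
        if s == individual and p == "type":
            types |= superclasses(o, triples)
    return types


def is_well_typed(subject, prop, obj, triples):
    """Check a statement against the declared domain and range of its property.

    Simplification: real OWL reasoners use domain and range to infer types,
    not to reject statements.
    """
    domains = {o for s, p, o in triples if s == prop and p == "domain"}
    ranges = {o for s, p, o in triples if s == prop and p == "range"}
    return (domains <= inferred_types(subject, triples)
            and ranges <= inferred_types(obj, triples))


print(sorted(inferred_types("smooth_scan_1", TRIPLES)))
# ['SmoothingTask', 'Task', 'Thing']
print(is_well_typed("smooth_scan_1", "is_solved_by", "gaussian_smoothing_1", TRIPLES))
# True
```

In practice an OWL editor and a dedicated reasoner would replace these helpers; the sketch only shows what kind of statements and inferences the language is meant to support.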

For editing the ontology we are using the Protégé ontology editor (version 3.2.1)

developed by the Stanford Medical Informatics at the Stanford University School of

Medicine [4]. The editor implements a rich set of knowledge-modeling structures and

actions that support the creation, visualization, and manipulation of ontologies in

various representation formats.

4.3 Main Classes and Class Hierarchy

The behavior of an IP expert can be efficiently described in terms of tasks to be

solved and methods for solving these tasks.

In general, image processing and analysis tasks are characterized by a final goal to be reached, input data and requirements on the result. The formal definition of the concept “task” is as follows.

Definition 1. A task $T(G, I, R, C)$ is defined by its goal $G$, input data $I$, requirements $R$ and context $C$, where

• $G$ is the goal, i.e. the desired result;
• $I$ is the description of the input data;
• $R$ denotes the requirements on the final result;
• $C$ is the context, i.e. any useful information.

Definition 2. A method is an algorithmic procedure or a set of algorithmic procedures characterized by the following:

• its competence (tasks that can be solved by this method);
• input and output data;
• a set of subtasks to be solved (in the case of a complex method) or an operator, primitive or composite, to be applied (in the case of a primitive method).

Usually, the same task can be solved by several methods. $M = \{M_1, \dots, M_n\}$ is a set of methods for solving a task $T(G, I, R, C)$ if $M_i : I \Rightarrow G$ for each $M_i$ in $M$.
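Definitions 1 and 2 can be read as simple record types. The following Python sketch is a minimal illustration, not the authors' implementation: the field names, the `methods_for` helper, the string encoding of goals and data types, and the three catalogue entries are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Task T(G, I, R, C) in the sense of Definition 1."""
    goal: str                 # G: the desired result
    input_data: str           # I: description of the input data
    requirements: tuple = ()  # R: requirements on the final result
    context: tuple = ()       # C: any useful information


@dataclass(frozen=True)
class Method:
    """Primitive method in the sense of Definition 2 (field names are assumptions)."""
    name: str
    competence: frozenset     # goals of the tasks this method can solve
    input_type: str           # kind of data the method accepts
    output_type: str          # kind of data the method produces


def methods_for(task, catalogue):
    """Return the methods whose competence covers the goal and whose input type matches."""
    return [m for m in catalogue
            if task.goal in m.competence and m.input_type == task.input_data]


# Hypothetical catalogue entries, used only to exercise the sketch.
catalogue = [
    Method("median_filter", frozenset({"noise reduction"}),
           "grayscale image", "grayscale image"),
    Method("gaussian_filter", frozenset({"noise reduction"}),
           "grayscale image", "grayscale image"),
    Method("global_threshold", frozenset({"binarization"}),
           "grayscale image", "binary image"),
]

task = Task(goal="noise reduction", input_data="grayscale image",
            requirements=("preserve edges",))

print([m.name for m in methods_for(task, catalogue)])
# ['median_filter', 'gaussian_filter']
```

The sketch matches methods on goal and input type only; requirements and context are carried along but not used for selection, which a knowledge-based system would be expected to refine.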

OWL-ontologies consist of the following components: classes (of concepts),

properties of classes and individuals (instances of classes). A class defines a group of

individuals that belong together because they share some properties. Classes can be

organized in a specialization hierarchy using subClassOf. There is a built-in most

general class named Thing that is the class of all individuals and is a superclass of

all OWL classes [6].

In accordance with the definitions presented above, the following IAO classes were defined: Task, Method, Data, Context and Requirements. The hierarchy of subclasses is based on the term relations fixed in the IAT.

The current version of the IAO contains the following subclasses of the class Task: class BinarizationTask, class CompressionTask, class DetectionTask, class EnhancementTask, class InterpolationTask, class MatchingTask, class QuantizationTask, class ReconstructionTask, class RestorationTask and class SegmentationTask. Some of these classes, in turn, also include subclasses. For example, class ContrastEnhancementTask and class NoiseReductionTask are subclasses of the class EnhancementTask. In total, the current version of the IAO contains 24 subclasses of the class Task.
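The part of the Task hierarchy that is named in the text can be mirrored with ordinary Python classes, where inheritance plays the role of subClassOf. This is a sketch under stated assumptions: only the 12 subclasses named above are included (the remaining ones are not listed in the paper), and the `all_subclasses` helper is hypothetical.

```python
class Task:
    """Root class for image-based tasks."""


# Direct subclasses of Task named in the text.
TOP_LEVEL = [
    "BinarizationTask", "CompressionTask", "DetectionTask", "EnhancementTask",
    "InterpolationTask", "MatchingTask", "QuantizationTask",
    "ReconstructionTask", "RestorationTask", "SegmentationTask",
]

classes = {"Task": Task}
for name in TOP_LEVEL:
    classes[name] = type(name, (Task,), {})

# Subclasses of EnhancementTask named in the text.
for name in ("ContrastEnhancementTask", "NoiseReductionTask"):
    classes[name] = type(name, (classes["EnhancementTask"],), {})


def all_subclasses(cls):
    """Collect direct and indirect subclasses of cls."""
    result = []
    for sub in cls.__subclasses__():
        result.append(sub)
        result.extend(all_subclasses(sub))
    return result


print(len(all_subclasses(Task)))
# 12 (only the subclasses named in the text, out of the 24 reported)
print(issubclass(classes["NoiseReductionTask"], Task))
# True: subclass membership is transitive
```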

It should be noted that the proposed task hierarchy is preliminary. It requires more detailed investigation involving experts in every specific subsection of the domain, for example, experts in image compression, image segmentation, etc.

The hierarchy of Method subclasses classifies the different types of methods according to the task they solve. For example, the class SegmentationTask has the corresponding class SegmentationMethod, which describes existing methods for image segmentation.

The class Data includes the following subclasses: class Image, class ImagePart and class ImageSequence.

The class Context includes the following 6 subclasses:

• class AcquisitionContext (context related to image acquisition, for example, camera type and location, acquisition date, etc.);
• class ApplicationContext (context describing the subject domain of a task, for example, biology, medicine, etc.);
• class FunctionalContext (context describing the application of the results, for example, diagnostics in the case of a medical task);
• class ObjectFeaturesContext (context related to the description of image objects, for example, geometrical object characteristics, object location, etc.);
• class PhysicalContext (context describing technical characteristics of the image to be processed, for example, image format, image illumination, image quality, etc.);
• class ProblemTypesContext (context describing the type of a task, for example, analysis, processing, recognition, understanding or reducing an image to a recognizable form).

This list of subclasses of the class Context is open: new subclasses can be added in the future if needed.

The class Requirements includes the following 2 subclasses: class PerformanceCriteria and class QualityRequirements. The class PerformanceCriteria in turn includes 2 subclasses: class AlgorithmPerformanceRequirements (algorithm performance requirements, for example, calculation accuracy, computational complexity, etc.) and class TechnicalRequirements (technical requirements, for example, CPU characteristics, platform type, etc.). The class QualityRequirements describes result quality requirements.

Fig. 2 shows relations between main IAO classes.

Fig. 2. Main IAO classes and relations between them.

4.4 Properties of Classes

OWL properties are characteristics of classes. Properties can be used to state relationships between individuals (owl:ObjectProperty) or from individuals to data values (owl:DatatypeProperty). Property hierarchies may be created by making one or more statements that a property is a subproperty of one or more other properties. A property has a domain (rdfs:domain) and a range (rdfs:range). The domain of a property limits the individuals to which the property can be applied. The range of a property limits the individuals or data values that the property may have as its value. Properties may be of the following types: inverse, transitive, symmetric, functional or inverse functional. Table 1 shows some examples of different IAO properties.

Table 1. Examples of IAO properties.

| rdf:Property | rdfs:domain | rdfs:range | Allowed values | Examples |
|---|---|---|---|---|
| is_solved_by | SmoothingTask | SmoothingMethod | Instances | |
| num_of_bits | Image | Integer | 1, 4, 8, 16, 24, ... | 8-bit image |
| edge_type | Edge | String | “roof”, “step”, ... | step edge |
| linearity | ImageFilter | Boolean | true, false | linear filter |

Let us consider the property is_solved_by (see Table 1). The property is an example of an owl:ObjectProperty. Its domain is the class SmoothingTask and its range is the class SmoothingMethod. The property has an inverse property is_appled_for. The other properties listed in Table 1 are examples of owl:DatatypeProperty.
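The distinction between object and datatype properties in Table 1 can be sketched in Python as follows. This is a hedged illustration, not the IAO itself: the two record types, the `accepts` and `invert` helpers and the individual names are hypothetical, Python types stand in for the OWL datatypes, and the value lists that end with "..." in Table 1 are deliberately left open rather than completed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ObjectProperty:
    """Relates individuals to individuals (cf. owl:ObjectProperty)."""
    name: str
    domain: str
    range: str
    inverse: Optional[str] = None


@dataclass(frozen=True)
class DatatypeProperty:
    """Relates individuals to data values (cf. owl:DatatypeProperty)."""
    name: str
    domain: str
    value_type: type                     # Python stand-in for the datatype
    allowed: Optional[frozenset] = None  # None: any value of value_type

    def accepts(self, subject_class, value):
        """Check the domain class, the value type and, if closed, the allowed values."""
        if subject_class != self.domain:  # exact match; subclassing is ignored here
            return False
        # bool is a subclass of int in Python, so compare types exactly
        if type(value) is not self.value_type:
            return False
        return self.allowed is None or value in self.allowed


# Rows of Table 1. The inverse name is spelled as it appears in the text.
IS_SOLVED_BY = ObjectProperty("is_solved_by", "SmoothingTask", "SmoothingMethod",
                              inverse="is_appled_for")
NUM_OF_BITS = DatatypeProperty("num_of_bits", "Image", int)   # 1, 4, 8, 16, 24, ...
EDGE_TYPE = DatatypeProperty("edge_type", "Edge", str)        # "roof", "step", ...
LINEARITY = DatatypeProperty("linearity", "ImageFilter", bool,
                             frozenset({True, False}))


def invert(statement, prop):
    """Derive the inverse statement (object, inverse property, subject)."""
    subject, _, obj = statement
    return (obj, prop.inverse, subject)


print(NUM_OF_BITS.accepts("Image", 8))     # True
print(NUM_OF_BITS.accepts("Edge", 8))      # False: wrong domain
print(EDGE_TYPE.accepts("Edge", "step"))   # True
print(invert(("smooth_scan_1", "is_solved_by", "gaussian_smoothing_1"), IS_SOLVED_BY))
# ('gaussian_smoothing_1', 'is_appled_for', 'smooth_scan_1')
```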

In addition to specific properties, all defined IAO classes have the standard annotation properties rdfs:comment and rdfs:label. The value of the former is the concept definition extracted from the IAT, while the value of the latter is the concept name (term), also extracted from the IAT.

5 Conclusions and Future Work

An ontology-based approach to image analysis has been presented, together with a description of the Image Analysis Ontology. It is important to note that the ontology is not yet complete: it requires a more detailed investigation of the domain. We plan to revise and refine the proposed ontology and to extend its applicability by introducing more precise information on tasks and methods.

The work on the ontology opens a straightforward direction for the development

of an integrated and advanced framework for mining new information from large

collections of images. By integrating the ontology with algorithms for image

representation and understanding, high-level semantic information can be extracted

from images and data mining algorithms can be applied to it for obtaining novel

knowledge about specific domain problems. Such an integration is under design and

will be the subject of future research.

Acknowledgements

This work was partially supported by the Russian Foundation for Basic Research

(projects nos. 07-07-13545, 08-01-90022), by the Program of the Presidium of the

RAS “Intelligent information technologies, mathematical modeling, system analysis

and automation”, by the Foundation for Assistance to Small Innovative Enterprises

(contract №5639р/8067), and by the European Community, under the Sixth

Framework Programme, Information Society Technology – ICT for Health, within

the STREP project HEARTFAID (IST-2005-027107), 2006-2009.

References

1. Asirelli, P., Colantonio, S., Gurevich, I., Martinelli, M., Salvetti, O., Trusova, Yu.: Ontology Driven Approach to Image Understanding. In: 8th International Conference “Pattern Recognition and Image Analysis: New Information Technologies” (PRIA-8-2007), Vol. 1 (2007) 67-71

2. Beloozerov, V.N., Gurevich, I.B., Gurevich, N.G., Murashov, D.M., Trusova, Yu.O.:

Thesaurus for Image Analysis: Basic Version. Int. J. Pattern Recognition and Image

Analysis: Advances in Mathematical Theory and Applications. MAIK

"Nauka/Interperiodica", Moscow 13 (4) (2003) 556-569

3. Hudelot, C., Maillot, N., Thonnat, M.: Symbol Grounding for Semantic Image

Interpretation: From Image Data to Semantics. In: 10th IEEE International Conference on

Computer Vision (ICCVW'05) (2005) 1875

4. Protégé Ontology Editor. Available at: http://protege.stanford.edu/

5. Renouf, A., Clouard, R., Revenu, M.: A Platform dedicated to knowledge engineering for

the development of image processing applications. In: Proceedings of the ICEIS 2007, Vol.

AIDSS, Funchal, Portugal (2007) 271-276

6. Smith, M.K., Welty, C., McGuinness, D.L. (Eds.): OWL Web Ontology Language Guide,

W3C Recommendation, 10 February 2004, Available at: http://