A NEW SOFTWARE TOOL FOR MANAGING AND QUERYING THE PERSONAL MEDICAL DIGITAL IMAGERY

Liana Stanescu, Dumitru Dan Burdescu, Marius Brezovan, Cosmin Stoica Spahiu and Anca Ion

University of Craiova, Faculty of Automation, Computers and Electronics, Craiova, Romania

Keywords: Database management system, content-based visual query, color and texture features, medical image database.

Abstract: The paper presents an original software tool for creating, updating and querying medium-sized digital multimedia collections. The tool is a relational database management system kernel that uses traditional data types (numbers and character strings) and an Image data type to manage visual information. An element of originality is the graphical interface that allows building content-based visual queries using color and texture characteristics. These characteristics are automatically extracted from images when they are inserted in the database. The color information is represented by color histograms obtained by transforming the RGB color space into the HSV color space and quantizing it to 166 colors. The texture information is represented by a vector of 12 values obtained with a method based on Gabor filters. The software tool is platform independent, has a low cost and is easy for medical personnel to use. It is well suited for managing personal multimedia digital collections in the medical domain.

1 INTRODUCTION

Private medical consulting rooms are important components of the medical system. Many of them use medical devices (echograph, endoscope, MRI) to help establish the correct diagnosis quickly. Every year they produce thousands of images. That is why the problem of storing medical image collections in digital format together with the associated information (patient name, diagnosis, consulting date and treatment), managing the database and executing efficient queries is intensely studied, in search of new and more efficient solutions (Muller et al, 2004, Muller et al, 2005).

A database is created and updated mainly in order to be queried. One type of query is the classical one (for example, a simple text-based query). For digital multimedia collections, however, a different type of query should be used: content-based visual query at image level or at region level (Del Bimbo, 2001, Faloutsos, 2005, Kalipsiz, 2000, Khoshafian and Baker, 1996). In the first case, the doctor selects a medical image (the query image) and searches the database for all the images similar to it, together with the associated information (diagnosis, treatment). In the second case, the doctor selects one or several regions in an image and searches the database for all the images that contain the selected regions. This type of content-based query is built using visual characteristics (color, texture or color regions) that are automatically extracted from medical images when they are inserted in the database (Del Bimbo, 2001, Smith, 1997). Keywords or other alphanumerical information are not used. This query can be very useful in the diagnosis process or in medical research. For example, the doctor can find images similar to the query image or can follow the evolution of the diagnosis for a patient. The doctor can also find images that are similar to the query image but have a different diagnosis (Muller et al, 2004, Muller et al, 2005).

In order to manage content-based retrieval for medical image collections, a series of applications that use traditional database management systems (MS SQL Server, MySQL, Interbase) have been implemented. A complete solution is provided by Oracle: the Oracle 10g database server and the interMedia tool, which can manage all kinds of multimedia data, including DICOM files. This kind of solution involves high costs for buying the database server and for designing and implementing complex applications for content-based visual query (Chigrik, 2007, Kratochvil, 2005, Oracle, 2005).

Stanescu L., Burdescu D., Brezovan M., Stoica Spahiu C. and Ion A. (2009).
A NEW SOFTWARE TOOL FOR MANAGING AND QUERYING THE PERSONAL MEDICAL DIGITAL IMAGERY.
In Proceedings of the International Conference on Health Informatics, pages 199-204
DOI: 10.5220/0001538801990204
SciTePress

This paper presents a less expensive database management system (DBMS) based on the relational model. The DBMS is platform independent and can easily manage medium-sized image collections and alphanumerical information from the medical domain. It has a visual interface for building content-based retrieval queries using color and texture characteristics, and it can be used easily by any person working in this area, even without advanced computer skills.

Section 2 presents the internal organization of data in the database, sections 3, 4, 5 and 6 present the main functions of the DBMS, section 7 presents some experimental results and section 8 presents the conclusions.

2 DATA ORGANIZATION IN THE DATABASE MANAGEMENT SYSTEM

In this section we describe how information is organized in the DBMS.

The application main folder contains a Database folder, which is created automatically. This is where the folder of every new database is created.

Each table in the database is represented by a file with the “.tbl” extension, stored in the corresponding database folder. The file has two components:
− A header, which is created in the design phase
− A data area, which is updated when the traditional Insert, Update and Delete operations are executed

The header consists of:
− The number of records in the table header (the header contains a record for each column of the table, a record for the primary key and a record for each foreign key defined in the table).
− The size of each header record (a header record holds information about a column of the table: name, type and, for char strings, length; it can also hold information about the primary key or the foreign keys).
− The header records.

The DBMS has three types of data: int, char (fixed length strings) and image:
− The information about a char string column is stored as follows:
Table_column_name[blank]char[blank]no_of_characters
− The information about an int column is stored as follows:
Table_column_name[blank]int
− The information about an image column is stored as follows:
Table_column_name[blank]image
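The three record formats above can be illustrated with a short Python sketch. It only shows how a column description could be written to and read back from the blank-separated text form; the names `ColumnDef`, `encode_column` and `decode_column` are hypothetical and are not taken from the described DBMS, and the sketch assumes that column names contain no blanks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnDef:
    name: str
    type: str                     # "int", "char" or "image"
    length: Optional[int] = None  # only meaningful for char columns


def encode_column(col: ColumnDef) -> str:
    """Serialise a column as 'name type' or 'name char length'."""
    if col.type == "char":
        if col.length is None:
            raise ValueError("char columns need a length")
        return f"{col.name} char {col.length}"
    if col.type in ("int", "image"):
        return f"{col.name} {col.type}"
    raise ValueError(f"unsupported type: {col.type}")


def decode_column(record: str) -> ColumnDef:
    """Parse a header record back into a ColumnDef."""
    parts = record.split(" ")
    if len(parts) == 3 and parts[1] == "char":
        return ColumnDef(parts[0], "char", int(parts[2]))
    if len(parts) == 2 and parts[1] in ("int", "image"):
        return ColumnDef(parts[0], parts[1])
    raise ValueError(f"malformed header record: {record!r}")
```

For example, `encode_column(ColumnDef("diagnosis", "char", 40))` yields the text `diagnosis char 40`, and `decode_column` reverses the operation.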

For the Image data type, in the data area the

following attributes are stored:

− Image type (bmp, jpg or gif)

− Image height and width

− Number of bytes needed to store the image

− The image in binary

− 166 integer values, representing the color

histogram

− 12 integer values, representing the texture

vector.

A series of methods frequently used in the medical domain are also implemented for the Image data type: rotating, zooming, pseudo-colors, the similarity distance between two images, a thumbnail representation, etc.

We describe below the methods used for extracting color and texture information from an image and the reason why they were chosen.

The color space used for representing color information in an image is of great importance in content-based image query, so this research direction has been intensely studied (Del Bimbo, 2001).

There is no color system that is universally used, because the notion of color can be modeled and interpreted in different ways (Gevers, 2004). Several color spaces have been created for different purposes (Gevers, 2001, Gevers, 2004). The color systems have been studied with respect to different criteria imposed by content-based visual query (Gevers, 1999). The experiments show that the HSV color system has the following properties (Gevers, 2004): it is close to the human perception of colors; it is intuitive; it is invariant to illumination intensity and camera direction. However, the HSV color space also has several problems (Gevers, 2004): the transformation from RGB to HSV is nonlinear (but still simple); it is device dependent; the H component becomes unstable when S is close to 0; the H component depends on the illumination color.

Studies on nature and medical images that compared the HSV, RGB, l1l2l3 and CIELuv color systems have shown that the HSV color space produces the best results in content-based retrieval (Gevers, 1999, Gevers, 2001, Gevers, 2004, Smith, 1997, Stanescu et al, 2006).

HEALTHINF 2009 - International Conference on Health Informatics

Color space quantization is needed in order to reduce the number of colors used in content-based visual query. This multimedia DBMS uses the quantization of the HSV color space to 166 colors, the solution proposed by J.R. Smith (Smith, 1997). The result is the color histogram, which is stored together with the image in the data area of the file.
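The quantization step can be sketched in Python as follows. The sketch assumes the usual partition behind the 166-color scheme (18 hues × 3 saturations × 3 values, plus 4 gray levels); the gray threshold `GRAY_SATURATION`, the uniform bin boundaries and the function names are illustrative assumptions, not values taken from this paper or from the DBMS implementation.

```python
import colorsys

HUE_BINS, SAT_BINS, VAL_BINS, GRAY_BINS = 18, 3, 3, 4
N_COLORS = HUE_BINS * SAT_BINS * VAL_BINS + GRAY_BINS  # 162 + 4 = 166
GRAY_SATURATION = 0.1  # assumed: below this saturation a pixel counts as gray


def quantize(r, g, b):
    """Map an 8-bit RGB pixel to a bin index in the range 0..165."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    if s < GRAY_SATURATION:
        # Achromatic pixel: pick one of the 4 gray levels from V alone.
        return min(int(v * GRAY_BINS), GRAY_BINS - 1)
    hi = min(int(h * HUE_BINS), HUE_BINS - 1)
    s_rel = (s - GRAY_SATURATION) / (1.0 - GRAY_SATURATION)
    si = min(int(s_rel * SAT_BINS), SAT_BINS - 1)
    vi = min(int(v * VAL_BINS), VAL_BINS - 1)
    return GRAY_BINS + (hi * SAT_BINS + si) * VAL_BINS + vi


def color_histogram(pixels):
    """Count how many pixels fall into each of the 166 bins."""
    hist = [0] * N_COLORS
    for r, g, b in pixels:
        hist[quantize(r, g, b)] += 1
    return hist
```

Calling `color_histogram` on the list of (R, G, B) pixels of an image gives a list of 166 integers, which matches the shape of the histogram stored in the Image field.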

Together with color, texture is a powerful characteristic of an image. It is present in nature and medical images, where a disease can be indicated by changes in the color and texture of a tissue. A series of methods for extracting texture features have been studied (Del Bimbo, 2001), but no single method can be considered the most appropriate, since the choice depends on the application and on the type of images taken into account.

One of the most representative methods of texture detection uses Gabor filters. This is why it is used in this multimedia DBMS for determining the texture vector.

The HSV color space is a non-linear transformation of the RGB color space. The H, S and V components closely correspond to human color perception.

Starting from the representation of the HSV color space, the color can be represented in the complex domain (Palm et al, 2000).

The affix of any point of the cone base can be computed as:

$$b = S(\cos H + i \sin H) \qquad (1)$$

Therefore, the saturation is interpreted as the magnitude and the hue as the phase of the complex value b; the value channel is not included. The advantages of this complex color representation are its simplicity, since the color is now a scalar and not a vector, and the fact that the channels are combined before filtering.

So, for each pixel (x, y) of the image, the color can be represented in the complex domain as (Palm et al, 2000):

$$b(x,y) = S(x,y) \cdot e^{iH(x,y)} \qquad (2)$$

The computation of the Gabor characteristics for the image represented in the HS-complex space is similar to the one for the monochromatic Gabor characteristics, because the color channels are combined before filtering:

$$C_{m,\varphi} = \Big( \sum_{x,y} \big( \mathrm{FFT}^{-1}\{ P(u,v) \cdot M_{m,\varphi}(u,v) \} \big) \Big)^{2} \qquad (3)$$

The Gabor characteristics vector is created using the value $C_{m,\varphi}$ computed for 3 scales and 4 orientations:

$$f = (C_{0,0}, C_{0,1}, \ldots, C_{2,3}) \qquad (4)$$

This texture vector with 12 values is also stored in the Image type field.
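The complex hue/saturation representation of equations (1) and (2) can be illustrated with a few lines of Python. This is only a sketch of the pre-filtering step: the function names are hypothetical, the hue returned by `colorsys` in the range [0, 1) is assumed to map linearly onto the angle range [0, 2π), and the Gabor filtering itself is not shown.

```python
import cmath
import colorsys
import math


def hs_complex(r, g, b):
    """Return S * exp(i*H) for an 8-bit RGB pixel; the V channel is dropped."""
    h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # cmath.rect builds a complex number from magnitude and phase.
    return cmath.rect(s, 2.0 * math.pi * h)


def hs_complex_image(rows):
    """Convert a 2D list of (R, G, B) pixels into a 2D list of complex values."""
    return [[hs_complex(*px) for px in row] for row in rows]
```

In this representation a gray pixel (S = 0) maps to 0 regardless of its hue, and a fully saturated pixel lies on the unit circle at the angle given by its hue.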

3 DATABASE MANAGEMENT

To create a new database, the user must provide its name in the dialog menu. If there is another database with the same name, the operation is cancelled and the user is notified. A new folder is created for each database in the Database folder of the application. After a database has been created, it is listed in the tree on the left side of the main window. This tree is used for viewing all the databases in the system, including their tables. After creating the database, the user can go on to create the tables.

To delete a database, the user first selects it in the tree on the left side of the window and then has to confirm the operation. The entire database is deleted, including the folder and the attached files.
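The folder-based behaviour described in this section can be sketched in Python as below. The function names and the `confirmed` flag are hypothetical; the sketch only assumes, as stated above, that each database is a sub-folder of the Database folder and that each table is a ".tbl" file inside it.

```python
import shutil
from pathlib import Path


def create_database(root: Path, name: str) -> Path:
    """Create <root>/Database/<name>; refuse if the name is already used."""
    db_dir = root / "Database" / name
    if db_dir.exists():
        raise FileExistsError(f"database {name!r} already exists")
    db_dir.mkdir(parents=True)
    return db_dir


def delete_database(root: Path, name: str, confirmed: bool) -> bool:
    """Remove the database folder and its files once the user has confirmed."""
    if not confirmed:
        return False
    shutil.rmtree(root / "Database" / name)
    return True


def list_databases(root: Path) -> dict:
    """Return {database name: [table names]}, the content of the tree view."""
    base = root / "Database"
    if not base.is_dir():
        return {}
    return {d.name: sorted(t.stem for t in d.glob("*.tbl"))
            for d in sorted(base.iterdir()) if d.is_dir()}
```

Raising an error for a duplicate name corresponds to the cancelled operation and the user notification; in a graphical tool the exception would be turned into a message box.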

4 TABLE MANAGEMENT

To create a table, the user must select the database where the table will be placed and specify the name of the table, the columns and the primary and foreign keys, if any. The names of the tables in a database are unique. For each table a new file with a specific structure (a header area and a data area) is created. This file is created in the database folder, with the name of the table and the .tbl extension. The user has to specify the structure of the table: the columns, their data types and sizes, and the applicable constraints. The name of a column also has to be unique within the table; this is enforced by the DBMS.

Three types of data are implemented: int, char and image. For the fixed length char strings the user specifies the maximum size. The database kernel introduces a new type of data, Image, which allows storing in the database an image in one of the following formats: bmp, gif or jpg.

When creating a new table, the user may also specify the primary key, which can include one or several columns.

The user may specify a 1:m connection between two tables: a parent table (on the 1 side of the connection) and a son table (on the m side of the connection). This is done with the Foreign Keys tag, in the same window. The user can easily select the parent and son tables and the foreign key (figure 1). The primary key and the foreign key must have the same type and the same size. If the connection is between two primary keys, it is a 1:1 connection. The structure of the table can be seen at any moment in the main window of the DBMS, using the Components tag.

Once the table is created, new records can be added and existing ones modified or deleted, using the record editor.
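The two rules for connections stated above (matching type and size, and 1:1 when two primary keys are connected) can be expressed as a small Python check. The `Column` class and `check_foreign_key` function are hypothetical names used for illustration; the sketch also assumes that multi-column keys are compared position by position.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Column:
    name: str
    type: str                   # "int", "char" or "image"
    size: Optional[int] = None  # length for char columns


def check_foreign_key(parent_pk: Tuple[Column, ...],
                      child_fk: Tuple[Column, ...],
                      child_pk: Tuple[Column, ...]) -> str:
    """Validate a foreign key and return the connection kind: '1:1' or '1:m'."""
    if len(parent_pk) != len(child_fk):
        raise ValueError("primary key and foreign key have different column counts")
    for pk_col, fk_col in zip(parent_pk, child_fk):
        if (pk_col.type, pk_col.size) != (fk_col.type, fk_col.size):
            raise ValueError(f"{fk_col.name} does not match {pk_col.name}")
    # If the foreign key is also the son table's primary key,
    # the connection links two primary keys and is therefore 1:1.
    return "1:1" if set(child_fk) == set(child_pk) else "1:m"
```

For a son table whose foreign key `patient_id` is an ordinary column the function returns `1:m`; if the same column is also the primary key of the son table it returns `1:1`.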

5 UPDATING TABLE DATA

The user can add a new record only if the previous record was correctly added and saved in the corresponding file of the table. A record is correct if all the fields are filled with information of the type described in the table structure. If one of the fields of the table has the type image, then when the insert operation starts the “Choose the image” dialog window is opened so that the user can choose the image to add.

Figure 1: The window for establishing the relationship

between tables.

At this moment the algorithms for pre-processing the image are called:
− Transforming from RGB space to HSV space
− Quantization to 166 colors
− Extraction of the 12 texture values using Gabor filters
The two characteristic vectors and the image are then stored in binary form.

As can be seen, the main window of the application has two parts: the tree with the database structure is on the left, and the table structure (Components tag) or the table data (Data tag) is on the right. An element of originality is that the images are displayed directly when the records in the database are viewed (figure 2).

Figure 2: The data in the selected table shown by the Data tag.

Figure 3: The window that allows the building of the

content-based image query.

6 CONTENT-BASED VISUAL QUERY

The database management system kernel makes it easy to build content-based visual queries at the image level, using the menus of the window shown in figure 3. The elements of this window are:
− Similar With – opens the window for choosing the query image
− Select – chooses the field or fields that will be presented in the results of the query
− From – the table in the database that will be used for the query
− Where – the image type column used for the content-based image query
− Features – the characteristic used for the content-based visual query: color, texture or a combination of them
− Maximum images – the maximum number of images returned by the query.

Building the query actually produces a modified SQL Select command, adapted for content-based image query. For example:

Select patients.diagnosis, patients.img

From Patients where Patients.img

Similar with Query Image (method:

color, max.images 5)

This modified Select command specifies that the results are obtained from the Patients table, that the diagnosis field and the image are returned for the images similar to the query image with respect to the color characteristic, and that at most 5 images are returned. The result set also presents the dissimilarity distance between the query image and each target image. The modified command is easy to read for the users (medical personnel).
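To make the mapping from the window elements to the command text concrete, the following Python sketch assembles a command of the same shape as the example above. The function name and parameters are hypothetical, and the keyword used for the combined feature is not given in the paper, so only `color` and `texture` are accepted here.

```python
def build_visual_query(fields, table, image_column, feature="color", max_images=5):
    """Assemble the modified Select command from the choices made in the window."""
    if feature not in ("color", "texture"):
        raise ValueError(f"unsupported feature: {feature}")
    if max_images < 1:
        raise ValueError("max_images must be at least 1")
    select_list = ", ".join(f"{table}.{field}" for field in fields)   # Select
    return (f"Select {select_list} "
            f"From {table} "                                          # From
            f"where {table}.{image_column} "                          # Where
            f"Similar with Query Image "                              # Similar With
            f"(method: {feature}, max.images {max_images})")          # Features, Maximum images


query = build_visual_query(["diagnosis", "img"], "Patients", "img", "color", 5)
```

With these arguments the resulting string has the same structure as the example command, apart from the capitalisation of the table name in the field list.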

7 EXPERIMENTS

The database management system kernel was tested from two points of view: the quality of the content-based visual query process and the execution speed, in a private consulting room specialised in internal medicine. To determine the quality of the query process, the experiments were performed under the following conditions. A database was created with 2036 colour images from the digestive area. The images were taken from patients with the following diagnoses: polyps, ulcer, esophagitis, colitis and ulcerous tumour. For each patient there are several images of the same affected area, taken from different angles.

For each query, the relevant images were established. Each of the relevant images became in its turn a query image, and the final results for a query are the average of these individual results (table 1). The experiments considered the colour and texture attributes of the medical images, with equal weights (0.5). The reason for this choice lies in the nature of medical images from the digestive area, where different diagnoses produce changes in both the colour and the texture of the affected tissue.
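The equal weighting of colour and texture can be sketched in Python as a weighted sum of two distances followed by a ranking. The paper does not state which distance is used for each feature, so the histogram intersection and the Euclidean distance below are assumptions, as are the function names and the dictionary layout of a record; the sketch also assumes that the two distances have been scaled to comparable ranges.

```python
import math


def histogram_distance(h1, h2):
    """1 minus the normalised histogram intersection (0 means identical)."""
    total = min(sum(h1), sum(h2))
    if total == 0:
        return 0.0
    return 1.0 - sum(min(a, b) for a, b in zip(h1, h2)) / total


def texture_distance(t1, t2):
    """Euclidean distance between two texture vectors (Python 3.8+)."""
    return math.dist(t1, t2)


def dissimilarity(query, target, w_color=0.5, w_texture=0.5):
    """Weighted sum of the colour and texture distances."""
    return (w_color * histogram_distance(query["hist"], target["hist"])
            + w_texture * texture_distance(query["texture"], target["texture"]))


def most_similar(query, records, max_images=5):
    """Return up to max_images (distance, record) pairs, closest first."""
    scored = [(dissimilarity(query, r), r) for r in records]
    scored.sort(key=lambda pair: pair[0])
    return scored[:max_images]
```

Returning the distance together with each record mirrors the result set described in section 6, which also shows the dissimilarity between the query image and each target image.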

Table 1: The content-based visual query experimental results.

Query             Nr. of relevant images   Nr. of relevant images retrieved in the first 5
Polyps            1008                     4
Colitis           288                      3
Ulcer             540                      3
Ulcerous tumour   80                       2
Esophagitis       120                      3
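As a worked reading of Table 1, the figures can be turned into the fraction of relevant images among the first 5 retrieved (often called precision at 5). The short Python computation below uses only the numbers from the table; the variable names are illustrative, and the unweighted mean over the five queries is an additional summary that is not reported in the paper.

```python
# Values copied from Table 1.
relevant_total = {"Polyps": 1008, "Colitis": 288, "Ulcer": 540,
                  "Ulcerous tumour": 80, "Esophagitis": 120}
relevant_in_first_5 = {"Polyps": 4, "Colitis": 3, "Ulcer": 3,
                       "Ulcerous tumour": 2, "Esophagitis": 3}

# Fraction of the first 5 retrieved images that are relevant, per query.
precision_at_5 = {q: hits / 5 for q, hits in relevant_in_first_5.items()}

# Unweighted mean over the five query categories.
mean_precision_at_5 = sum(precision_at_5.values()) / len(precision_at_5)

# The per-category image counts add up to the size of the test database.
database_size = sum(relevant_total.values())
```

The per-query values are 0.8 for polyps, 0.6 for colitis, ulcer and esophagitis, and 0.4 for ulcerous tumour, and the category counts add up to the 2036 images of the test database.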

Figure 4 shows two examples of content-based image query, for images categorized as colitis (first column) and ulcer (second column). In both examples there are 4 relevant images among the first 5 images retrieved by the content-based visual query process.

Figure 4: Some results of the content-based visual query process (retrieved images that are not relevant are labelled "irrelevant").


8 CONCLUSIONS AND FUTURE WORK

The paper presents the organization and the functions of the implemented multimedia, relational, single-user DBMS kernel. It was created for managing and querying medium-sized personal digital collections that contain both alphanumerical information and digital images (for example, the ones used in private medical consulting rooms). The software tool allows creating and deleting databases, creating and deleting tables in databases, updating data in tables and querying. The user can use three types of data: int, char and image. The two constraints of the relational model, primary key and referential integrity, are also implemented.

The software tool can execute both simple text-based queries, using one or several criteria connected with logical operators (and, or, not), and content-based visual queries at image level, taking into consideration the color and texture characteristics. These characteristics are automatically extracted when the images are inserted in the database. The visual manner of building this type of query, specific to multimedia data, and the modified Select command that is sent to the DBMS for execution give originality to the software product. All the functions of the software tool can be easily used by persons who work in domains not linked to computers (for example, the medical domain, which uses visual information). The implementation uses Java technology, which makes the tool platform independent.

REFERENCES

Chigrik, A., 2007. SQL Server 2000 vs Oracle 9i. http://www.mssqlcity.com/Articles/Compare/sql_server_vs_oracle.htm

Del Bimbo, A., 2001. Visual Information Retrieval,

Morgan Kaufmann Publishers. San Francisco USA

Faloutsos, C., 2005. Searching Multimedia Databases by

Content. Springer

Gevers, T., Smeulders, W.M., 1999. Color-based object

recognition. Pattern Recognition. 32, 453-464

Gevers, T., 2001. Color in Image Search Engines.

Principles of Visual Information Retrieval. Springer-

Verlag, London

Gevers, T., 2004. Image Search Engines: An Overview.

Emerging Topics in Computer Vision. Prentice Hall

Kalipsiz, O., 2000. Multimedia databases. In: Proceedings

of the IEEE International Conference on Information

Visualization

Khoshafian, S., Baker, A.B., 1996. Multimedia and

Imaging Databases. Morgan Kaufmann Publishers,

Inc. San Francisco California

Kratochvil, M., 2005. The Move to Store Images In the Database. http://www.oracle.com/technology/products/intermedia/pdf/why_images_in_database.pdf

Muller, H., Michoux, N., Bandon, D., Geissbuhler, A., 2004. A Review of Content-based Image Retrieval Systems in Medical Applications – Clinical Benefits and Future Directions. Int J Med Inform. 73

Muller, H., Rosset, A., Garcia, A., Vallee, J.P., Geissbuhler, A., 2005. Benefits of Content-based Visual Data Access in Radiology. RadioGraphics. 25, 849-858

Oracle, 2005. Oracle InterMedia: Managing Multimedia Content. http://www.oracle.com/technology/products/intermedia/pdf/10gr2_collateral/imedia_twp_10gr2.pdf

Palm, C., Keysers, D., Lehmann, T., Spitzer, K., 2000. Gabor Filtering of Complex Hue/Saturation Images For Color Texture Classification. In: Proc. 5th Joint Conference on Information Science (JCIS2000) 2. Atlantic City, USA 45-49

Smith, J.R., 1997. Integrated Spatial and Feature Image

Systems: Retrieval, Compression and Analysis. Ph.D.

thesis, Graduate School of Arts and Sciences.

Columbia University

Stanescu, L., Burdescu, D.D., Ion, A., Brezovan, M.,

2006. Content-Based Image Query on Color Feature in

the Image Databases Obtained from DICOM Files. In:

International Multi-Conference on Computing in the

Global Information Technology. Bucharest. Romania
