A NEW SOFTWARE TOOL FOR MANAGING AND QUERYING
THE PERSONAL MEDICAL DIGITAL IMAGERY
Liana Stanescu, Dumitru Dan Burdescu, Marius Brezovan, Cosmin Stoica Spahiu and Anca Ion
University of Craiova, Faculty of Automation, Computers and Electronics, Craiova, Romania
Keywords: Database management system, content-based visual query, color and texture features, medical image
database.
Abstract: The paper presents an original software tool for creating, updating and querying medium-sized digital
multimedia collections. The tool is a relational database management system kernel that supports
traditional data types (numbers and character strings) and an Image data type for managing visual
information. An element of originality is the graphical interface that allows building content-based visual
queries using color and texture characteristics. These characteristics are automatically extracted from
images when they are inserted in the database. The color information is represented by color histograms
obtained by transforming the RGB color space into the HSV color space and quantizing it to 166 colors. The
texture information is represented by a vector of 12 values computed with a method based on Gabor
filters. The software tool is platform independent, inexpensive and easy for medical personnel to
use. It is well suited to managing personal digital multimedia collections in the medical
domain.
1 INTRODUCTION
Private medical consulting rooms are important components of the medical
system. Many of them use medical devices (echograph, endoscope, MRI) that
help establish the correct diagnosis quickly. Every year they produce
thousands of images. That is why storing medical image collections in
digital format together with the associated information (patient name,
diagnosis, consulting date and treatment), managing the database and
executing efficient queries are intensely studied problems for which new
and more efficient solutions are being sought (Muller et al, 2004, Muller
et al, 2005).
A database is created and updated mainly so that it can be queried. One
type of query is the classical one (for example, a simple text-based
query). For a digital multimedia collection, however, a different type of
query should be used: content-based visual query at image level or region
level (Del Bimbo, 2001, Faloutsos, 2005, Kalipsiz, 2000, Khoshafian and
Baker, 1996). In the first case, the doctor selects a medical image (the
query image) and searches the database for all the images similar to it,
together with the associated information (diagnosis, treatment). In the
second case, the doctor selects one or several regions in an image and
searches the database for all the images that contain the selected
regions. This type of content-based query is built using visual
characteristics (color, texture or color regions) that are automatically
extracted from medical images when they are inserted in the database (Del
Bimbo, 2001, Smith, 1997). Keywords or other alphanumerical information
are not used. Such a query can be very useful in the diagnosis process or
in medical research. For example, the doctor can find images similar to
the query image or follow the evolution of the diagnosis for a patient.
The doctor can also find images that are similar to the query image but
have a different diagnosis (Muller et al, 2004, Muller et al, 2005).
In order to manage content-based retrieval for medical image collections,
a series of applications that use traditional database management systems
(MS SQL Server, MySQL, Interbase) have been implemented. A complete
solution is provided by Oracle: the Oracle 10g database server and the
InterMedia tool, which can manage all kinds of multimedia data, including
DICOM files. This kind of solution involves high costs for buying the
database server and for designing and implementing
complex applications for content-based visual query
(Chigrik, 2007, Kratochvil, 2005, Oracle, 2005).
Stanescu L., Burdescu D., Brezovan M., Stoica Spahiu C. and Ion A. (2009).
A NEW SOFTWARE TOOL FOR MANAGING AND QUERYING THE PERSONAL MEDICAL DIGITAL IMAGERY.
In Proceedings of the International Conference on Health Informatics, pages 199-204
DOI: 10.5220/0001538801990204
Copyright © SciTePress
This paper presents a less expensive database management system (DBMS)
based on the relational model. The DBMS is platform independent and can
easily manage medium-sized image collections and alphanumerical
information from the medical domain. It has a visual interface for
building content-based queries using color and texture characteristics,
and it can be used easily by anyone working in this area, even without
advanced computer skills.
Section 2 presents the internal organization of data in the database,
sections 3, 4, 5 and 6 present the main functions of the DBMS, section 7
presents some experimental results and section 8 presents the
conclusions.
2 DATA ORGANIZATION IN THE DATABASE MANAGEMENT SYSTEM
This section describes how information is organized in the DBMS.
The application main folder contains an automatically created Database
folder. This is the place where the folder of every new database is
created.
Each table in the database is represented by a file with the “.tbl”
extension, stored in the corresponding database folder. The file has two
components:
- A header, which is created in the design phase
- A data area, which is updated when executing the traditional Insert,
  Update and Delete operations
The header is made of:
- The number of records in the table header (the header contains one
  record for each column of the table, one record for the primary key, and
  one record for each foreign key defined in the table).
- The size of each header record (a header record holds information about
  a column of the table: name, type and, for char strings, length; it can
  also store information about the primary key or the foreign keys).
- The header records.
The DBMS has three types of data: int, char (fixed-length strings) and
image:
- The information about a char column is stored as follows:
  Table_column_name[blank]char[blank]no_of_characters
- The information about an int column is stored as follows:
  Table_column_name[blank]int
- The information about an image column is stored as follows:
  Table_column_name[blank]image
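The three column descriptors above can be illustrated with a short Python sketch. It assumes that [blank] stands for a single space character and that column names contain no spaces; the names ColumnDescriptor, format_descriptor and parse_descriptor are hypothetical and are not taken from the DBMS.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ColumnDescriptor:
    """One column entry of a table header (hypothetical helper type)."""
    name: str
    type: str                      # "int", "char" or "image"
    length: Optional[int] = None   # number of characters, char columns only


def format_descriptor(col: ColumnDescriptor) -> str:
    """Render a column as 'name type [length]' with single-space separators."""
    if col.type == "char":
        if col.length is None or col.length <= 0:
            raise ValueError("char columns need a positive length")
        return f"{col.name} char {col.length}"
    if col.type in ("int", "image"):
        return f"{col.name} {col.type}"
    raise ValueError(f"unsupported column type: {col.type!r}")


def parse_descriptor(text: str) -> ColumnDescriptor:
    """Parse the textual form produced by format_descriptor."""
    parts = text.split(" ")
    if len(parts) == 3 and parts[1] == "char":
        return ColumnDescriptor(parts[0], "char", int(parts[2]))
    if len(parts) == 2 and parts[1] in ("int", "image"):
        return ColumnDescriptor(parts[0], parts[1])
    raise ValueError(f"malformed column descriptor: {text!r}")
```

For instance, a 40-character diagnosis column would be written as "diagnosis char 40", and parsing that string gives back the same descriptor.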
For the Image data type, the following attributes are stored in the data
area:
- Image type (bmp, jpg or gif)
- Image height and width
- Number of bytes needed to store the image
- The image in binary form
- 166 integer values, representing the color histogram
- 12 integer values, representing the texture vector.
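One possible byte layout for such an Image field is sketched below in Python. The paper lists the stored attributes but not their encoding, so the field widths (a 3-byte ASCII type tag, 4-byte little-endian integers) and the function names are assumptions made only for illustration.

```python
import struct

HISTOGRAM_BINS = 166   # color histogram length given in the paper
TEXTURE_VALUES = 12    # texture vector length given in the paper

# Assumed fixed part: 3-byte type tag, height, width, image size in bytes.
_FIXED = struct.Struct("<3sIII")
_FEATURES = struct.Struct(f"<{HISTOGRAM_BINS + TEXTURE_VALUES}i")


def pack_image_field(img_type, height, width, data, histogram, texture):
    """Serialize one Image value in the attribute order listed above."""
    if img_type not in ("bmp", "jpg", "gif"):
        raise ValueError("image type must be bmp, jpg or gif")
    if len(histogram) != HISTOGRAM_BINS or len(texture) != TEXTURE_VALUES:
        raise ValueError("expected 166 histogram values and 12 texture values")
    head = _FIXED.pack(img_type.encode("ascii"), height, width, len(data))
    return head + bytes(data) + _FEATURES.pack(*histogram, *texture)


def unpack_image_field(blob):
    """Inverse of pack_image_field; returns a tuple in the same order."""
    tag, height, width, size = _FIXED.unpack_from(blob, 0)
    start = _FIXED.size
    data = blob[start:start + size]
    values = _FEATURES.unpack_from(blob, start + size)
    return (tag.decode("ascii"), height, width, data,
            list(values[:HISTOGRAM_BINS]), list(values[HISTOGRAM_BINS:]))
```

Keeping the two feature vectors next to the image bytes means that a similarity query can read them without decoding the image again, which matches the paper's choice of extracting the features once, at insert time.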
A series of methods frequently used in the medical domain are also
implemented for the Image data type: rotating, zooming, pseudo-colors, the
similarity distance between two images, a thumbnail representation, etc.
The methods used for extracting color and texture information from an
image, and the reasons why they were chosen, are described below.
The color space used for representing color information in an image is
very important in content-based image query, so this research direction
has been intensely studied (Del Bimbo, 2001).
No color system is universally used, because the notion of color can be
modeled and interpreted in different ways (Gevers, 2004). Several color
spaces were created for different purposes (Gevers, 2001, Gevers, 2004).
The color systems were studied taking into consideration different
criteria imposed by content-based visual query (Gevers, 1999). The
experiments show that the HSV color system has the following properties
(Gevers, 2004): it is close to the human perception of colors; it is
intuitive; it is invariant to illumination intensity and camera direction.
However, the HSV color space has several drawbacks (Gevers, 2004): the
transformation from RGB to HSV is nonlinear (but still simple); it is
device dependent; the H component becomes unstable when S is close to 0;
the H component depends on the illumination color.
Studies on natural and medical images have shown that, among the HSV,
RGB, l1l2l3 and CIELUV color systems, the HSV color space produces the
best results in content-based retrieval (Gevers, 1999, Gevers, 2001,
Gevers, 2004, Smith, 1997, Stanescu et al, 2006).
HEALTHINF 2009 - International Conference on Health Informatics
Color space quantization is needed in order to reduce the number of
colors used in content-based visual query. This multimedia DBMS uses the
quantization of the HSV color space to 166 colors proposed by J.R. Smith
(Smith, 1997). The result is the color histogram, which is stored together
with the image in the data area of the file.
Together with color, texture is a powerful characteristic of an image,
present in natural and medical images, where a disease can be indicated by
changes in the color and texture of a tissue. A series of methods for
extracting texture features have been studied (Del Bimbo, 2001), but no
single method can be considered the most appropriate, since the choice
depends on the application and on the type of images taken into account.
One of the most representative methods of texture detection uses Gabor
filters. This is why it is used in this multimedia DBMS for determining
the texture vector.
The HSV color space is a non-linear transformation of the RGB color
space. The H, S and V components correspond closely to human color
perception.
Starting from the representation of the HSV color space, the color can be
represented in the complex domain (Palm et al, 2000).
The affix of any point from the cone base can be computed as:
$$ z_M = S(\cos H + i \sin H) \tag{1} $$
Therefore, the saturation is interpreted as the magnitude and the hue as
the phase of the complex value b; the value channel is not included. The
advantages of this complex representation of color are its simplicity,
because the color is now a scalar and not a vector, and the fact that the
channels are combined before filtering.
So, the color can be represented in the complex domain (Palm et al, 2000):
$$ b(x, y) = S(x, y)\, e^{i H(x, y)} \tag{2} $$
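A minimal Python sketch of this complex hue/saturation value is given below. It assumes 8-bit RGB input and a hue angle expressed in radians over a full turn; the function names are hypothetical.

```python
import cmath
import colorsys
import math


def hs_complex(r, g, b):
    """Return S * exp(i * H) for one 8-bit RGB pixel; V is discarded."""
    h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # colorsys returns the hue in [0, 1); scale it to an angle in radians.
    return cmath.rect(s, 2.0 * math.pi * h)


def hs_image(rows):
    """Convert an image given as rows of RGB triples to rows of complex values."""
    return [[hs_complex(*pixel) for pixel in row] for row in rows]
```

A fully saturated red pixel maps to a value of magnitude 1 and phase 0, while any gray pixel maps to 0 because its saturation is 0.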
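The computation of the Gabor characteristics for the image represented in
the HS-complex space is similar to the one for the monochromatic Gabor
characteristics, because the combination of color channels is done before
filtering:
$$ C_{f,\varphi} = \Big( \sum_{x,y} \big( \mathrm{FFT}^{-1}\{ P(u,v) \cdot M_{f,\varphi}(u,v) \} \big) \Big)^{2} \tag{3} $$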
The Gabor characteristics vector is created using the value
$C_{f,\varphi}$ computed for 3 scales and 4 orientations:
$$ f = (C_{0,0}, C_{0,1}, \ldots, C_{2,3}) \tag{4} $$
This texture vector with 12 values is also stored in the Image type field.
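The construction of the 12-value vector can be illustrated with the pure-Python sketch below. It is not the DBMS implementation: it filters in the spatial domain with a circular convolution instead of multiplying in the frequency domain, it sums the magnitudes of the complex responses so that the features are real, and the frequencies, orientations, kernel radius and sigma are all assumed values. The input is a small image of complex hue/saturation values, given as a list of rows.

```python
import cmath
import math

SCALES = 3         # number of scales, as in the paper
ORIENTATIONS = 4   # number of orientations, as in the paper


def gabor_kernel(freq, theta, radius=2, sigma=1.5):
    """Complex Gabor kernel: Gaussian envelope times a complex sinusoid."""
    kernel = {}
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            along = dx * math.cos(theta) + dy * math.sin(theta)
            envelope = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            kernel[(dx, dy)] = envelope * cmath.exp(2j * math.pi * freq * along)
    return kernel


def gabor_energy(image, kernel):
    """Square of the summed response magnitudes over all pixels."""
    rows, cols = len(image), len(image[0])
    total = 0.0
    for y in range(rows):
        for x in range(cols):
            response = sum(weight * image[(y - dy) % rows][(x - dx) % cols]
                           for (dx, dy), weight in kernel.items())
            total += abs(response)
    return total ** 2


def texture_vector(image):
    """Features ordered as C(0,0), C(0,1), ..., C(2,3): scale, then orientation."""
    features = []
    for scale in range(SCALES):
        freq = 0.25 / (2 ** scale)                 # assumed: 1/4, 1/8, 1/16
        for k in range(ORIENTATIONS):
            theta = k * math.pi / ORIENTATIONS     # assumed: 0, 45, 90, 135 degrees
            features.append(gabor_energy(image, gabor_kernel(freq, theta)))
    return features
```

The sketch returns floating-point values; the paper states that the DBMS stores 12 integers but does not say how the features are scaled or rounded, so that step is left out.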
3 DATABASE MANAGEMENT
To create a new database, the user must provide its name in the dialog
menu. If a database with the same name already exists, the operation is
cancelled and the user is notified. A new folder is created for each
database in the Database folder of the application. After a database is
created, it is listed in the tree on the left side of the main window.
This tree is used for viewing all the databases in the system, including
their tables. After creating the database, the user can go on to create
the tables.
To delete a database, the user first selects it in the tree on the left
side of the window and then has to confirm the operation. The entire
database is deleted, including the folder and the files it contains.
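The folder-based behaviour described in this section can be sketched in Python as follows. The function names and the boolean return convention are hypothetical; only the Database folder, the one-folder-per-database rule and the .tbl extension come from the paper.

```python
import shutil
from pathlib import Path


def create_database(app_root, name):
    """Create the database folder; return False if the name is already taken."""
    path = Path(app_root) / "Database" / name
    if path.exists():
        return False          # the caller notifies the user and cancels
    path.mkdir(parents=True)
    return True


def list_databases(app_root):
    """Map each database name to its table names, as shown in the tree view."""
    base = Path(app_root) / "Database"
    if not base.is_dir():
        return {}
    return {folder.name: sorted(table.stem for table in folder.glob("*.tbl"))
            for folder in sorted(base.iterdir()) if folder.is_dir()}


def delete_database(app_root, name, confirmed):
    """Remove the database folder and its files once the user has confirmed."""
    path = Path(app_root) / "Database" / name
    if not confirmed or not path.is_dir():
        return False
    shutil.rmtree(path)
    return True
```

Because a database is just a folder, listing and deleting reduce to ordinary file system operations, which fits the single-user setting the tool targets.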
4 TABLE MANAGEMENT
To create a table, the user must select the database in which to place it
and specify the name of the table, the columns, and the primary and
foreign keys, if any. The names of the tables in a database are unique.
For each table a new file with a specific structure (a header area and a
data area) is created. This file is created in the database folder and
has the name of the table and the .tbl extension. The user has to specify
the structure of the table: the columns, their data types and sizes, and
the applicable constraints. The name of a column also has to be unique;
the DBMS enforces this.
Three types of data are implemented: int, char and image. For fixed-length
char strings the user specifies the maximum size. The database kernel
introduces a new type of data, Image. It allows storing in the database an
image in one of the following formats: bmp, gif or jpg.
When creating a new table, the user may also specify the primary key. It
can include one or several columns.
The user may specify a 1:m relationship between two tables: a parent
table (on the 1 side of the relationship) and a child table (on the m
side of the relationship). This is done in the Foreign Keys tab of the
same window, where the user can easily select the parent table, the child
table and the foreign key (figure 1). The primary key and the foreign key
must have the same type and the same size. If the relationship links two
primary keys, it is 1:1. The structure of the table can be viewed at any
moment in the main window of the DBMS, using the Components tab.
Once the table is created, new records can be added and existing ones
modified or deleted using the record editor.
5 UPDATING TABLE DATA
The user can add a new record only if the previous record was correctly
added and saved in the corresponding file of the table. A record is
correct if all its fields are filled with values of the types described
in the table structure. If one of the fields of the table has the image
type, the “Choose the image” dialog window opens when the insert
operation starts, so that the user can choose the image to add.
At this moment the algorithms for pre-processing the image are called:
- Transformation from the RGB space to the HSV space
- Quantization to 166 colors
- Extraction of the 12-value texture vector using Gabor filters
The two characteristic vectors and the image are then stored in binary
form.
Figure 1: The window for establishing the relationship between tables.
The main window of the application contains two parts: on the left, the
tree containing the database structure, and on the right, the table
structure (Components tab) or the table data (Data tab). An element of
originality is that, when the records in the database are viewed, the
image is displayed directly (figure 2).
Figure 2: The data in the selected table shown by the Data tab.
Figure 3: The window that allows the building of the content-based image
query.
6 CONTENT-BASED VISUAL QUERY
The database management system kernel makes it easy to build a
content-based visual query at the image level, using the menus of the
window shown in figure 3. The elements of this window are:
- Similar With: opens the window for choosing the query image
- Select: chooses the field or fields that will be presented in the
  results of the query
- From: the table in the database that will be used for the query
- Where: the image type column used for the content-based image query
- Features: the characteristic used for the content-based visual query
  (color, texture or a combination of them)
- Maximum images: the maximum number of images returned by the query.
Building the query actually produces a modified SQL Select command,
adapted for content-based image query. For example:
Select patients.diagnosis, patients.img
From Patients where Patients.img
Similar with Query Image (method:
color, max.images 5)
This modified Select command specifies that the results are obtained from
the Patients table: the query returns the diagnosis field and the image
for the images most similar to the query image with respect to the color
characteristic, limited to 5 images. The result set also shows the
dissimilarity distance between the query image and each target image. The
modified command is easy for the users (medical personnel) to read.
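How such a command might be evaluated is sketched below in Python. The paper does not state which dissimilarity measures the DBMS uses, so histogram intersection for color and Euclidean distance for texture are assumptions of this sketch, as are the function names and the representation of rows as dictionaries. The equal weights for the combined method follow the 0.5 weights reported in the experiments.

```python
import math


def color_distance(h1, h2):
    """1 minus the normalized histogram intersection (assumed measure)."""
    smaller = min(sum(h1), sum(h2))
    if smaller == 0:
        return 0.0 if sum(h1) == sum(h2) else 1.0
    overlap = sum(min(a, b) for a, b in zip(h1, h2))
    return 1.0 - overlap / smaller


def texture_distance(t1, t2):
    """Euclidean distance between two texture vectors (assumed measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(t1, t2)))


def similar_with(rows, query, method="color", max_images=5):
    """Return up to max_images (distance, row) pairs, most similar first."""
    if method not in ("color", "texture", "both"):
        raise ValueError(f"unknown method: {method!r}")

    def distance(row):
        color = color_distance(row["histogram"], query["histogram"])
        texture = texture_distance(row["texture"], query["texture"])
        if method == "color":
            return color
        if method == "texture":
            return texture
        return 0.5 * color + 0.5 * texture   # equal weights

    ranked = sorted(rows, key=distance)
    return [(distance(row), row) for row in ranked[:max_images]]
```

With these measures the color distance lies between 0 and 1 while the texture distance is unbounded, so a real combination would probably normalize the texture term first; the sketch leaves that out for brevity.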
7 EXPERIMENTS
The database management system kernel was tested from two points of view,
the quality of the content-based visual query process and the execution
speed, in a private consulting room specialised in internal medicine. To
determine the quality of the query process, the experiments were
performed in the following conditions. A database was created with 2036
colour images from the digestive area. The images were taken from
patients with the following diagnoses: polyps, ulcer, esophagitis, colitis
and ulcerous tumour. For each patient there are several images of the
same affected area, taken from different angles.
For each query, the relevant images were established. Each of the
relevant images became in turn a query image, and the final results for a
query are an average of these individual results (table 1). These
experiments considered the colour and texture attributes of the medical
images, with equal weights (0.5). The motivation for this choice lies in
the nature of medical images from the digestive area: different diagnoses
produce changes in both the colour and the texture of the affected
tissue.
Table 1: The content-based visual query experimental results.
Query            Nr. of relevant images   Nr. of relevant images retrieved in the first 5
Polyps           1008                     4
Colitis          288                      3
Ulcer            540                      3
Ulcerous Tumor   80                       2
Esophagitis      120                      3
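The figures in table 1 can be read as precision over the first 5 retrieved images. The short Python sketch below derives that value per diagnosis and checks that the relevant-image counts add up to the 2036 images in the database; the precision values are computed here from the table and are not reported in the paper.

```python
TOP_K = 5

# (number of relevant images, relevant images retrieved in the first 5)
TABLE_1 = {
    "Polyps": (1008, 4),
    "Colitis": (288, 3),
    "Ulcer": (540, 3),
    "Ulcerous Tumor": (80, 2),
    "Esophagitis": (120, 3),
}


def precision_at_k(hits, k=TOP_K):
    """Fraction of the first k retrieved images that are relevant."""
    return hits / k


precisions = {query: precision_at_k(hits) for query, (_, hits) in TABLE_1.items()}
total_relevant = sum(count for count, _ in TABLE_1.values())
mean_precision = sum(precisions.values()) / len(precisions)
```

Polyps, the largest class, has the highest value (4 of 5) and ulcerous tumor, the smallest class, the lowest (2 of 5), which is the pattern one would expect when larger classes offer more candidate matches.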
Figure 4 shows two examples of content-based image query, for images
categorized as colitis (first column) and ulcer (second column). In both
examples there are 4 relevant images among the first 5 images retrieved
by the content-based visual query process.
Figure 4: Some results of the content-based visual query process (one
retrieved image in each column is marked as irrelevant).
8 CONCLUSIONS AND FUTURE WORK
The paper presents the organization and the functions of the implemented
multimedia, relational, single-user DBMS kernel. It was created for
managing and querying medium-sized personal digital collections that
contain both alphanumerical information and digital images (for example,
the ones used in private medical consulting rooms). The software tool
allows creating and deleting databases, creating and deleting tables in
databases, updating data in tables and querying. The user can work with
three types of data: int, char and image. The two constraints of the
relational model, primary key and referential integrity, are also
implemented.
The software tool can execute both simple text-based queries, using one
or several criteria connected with logical operators (and, or, not), and
content-based visual queries at image level, taking into consideration
the color and texture characteristics. These characteristics are
automatically extracted when the images are inserted in the database. The
visual manner of building this type of query, which is specific to
multimedia data, and the modified Select command that is sent to the DBMS
for execution give originality to the software product. All the functions
of the software tool can be easily used by persons who work in domains
not linked to computers (for example, the medical domain, which uses
visual information). The tool was implemented in Java, which makes it
platform independent.
REFERENCES
Chigrik, A., 2007. SQL Server 2000 vs Oracle 9i.
http://www.mssqlcity.com/Articles/Compare/sql_server_vs_oracle.htm
Del Bimbo, A., 2001. Visual Information Retrieval. Morgan Kaufmann Publishers, San Francisco, USA.
Faloutsos, C., 2005. Searching Multimedia Databases by Content. Springer.
Gevers, T., Smeulders, A.W.M., 1999. Color-based object recognition. Pattern Recognition. 32, 453-464.
Gevers, T., 2001. Color in Image Search Engines. Principles of Visual Information Retrieval. Springer-Verlag, London.
Gevers, T., 2004. Image Search Engines: An Overview. Emerging Topics in Computer Vision. Prentice Hall.
Kalipsiz, O., 2000. Multimedia databases. In: Proceedings of the IEEE International Conference on Information Visualization.
Khoshafian, S., Baker, A.B., 1996. Multimedia and Imaging Databases. Morgan Kaufmann Publishers, Inc., San Francisco, California.
Kratochvil, M., 2005. The Move to Store Images In the Database.
http://www.oracle.com/technology/products/intermedia/pdf/why_images_in_database.pdf
Muller, H., Michoux, N., Bandon, D., Geissbuhler, A., 2004. A Review of Content-based Image Retrieval Systems in Medical Application – Clinical Benefits and Future Directions. Int J Med Inform. 73.
Muller, H., Rosset, A., Garcia, A., Vallee, J.P., Geissbuhler, A., 2005. Benefits of Content-based Visual Data Access in Radiology. RadioGraphics. 25, 849-858.
Oracle, 2005. Oracle InterMedia: Managing Multimedia Content.
http://www.oracle.com/technology/products/intermedia/pdf/10gr2_collateral/imedia_twp_10gr2.pdf
Palm, C., Keysers, D., Lehmann, T., Spitzer, K., 2000. Gabor Filtering of Complex Hue/Saturation Images For Color Texture Classification. In: Proc. 5th Joint Conference on Information Science (JCIS2000) 2. Atlantic City, USA, 45-49.
Smith, J.R., 1997. Integrated Spatial and Feature Image Systems: Retrieval, Compression and Analysis. Ph.D. thesis, Graduate School of Arts and Sciences, Columbia University.
Stanescu, L., Burdescu, D.D., Ion, A., Brezovan, M., 2006. Content-Based Image Query on Color Feature in the Image Databases Obtained from DICOM Files. In: International Multi-Conference on Computing in the Global Information Technology. Bucharest, Romania.