XETA: EXTENSIBLE METADATA SYSTEM
For Extensibility, Accuracy, Suitability and Convenience
Yeojin Kim, SukBong Lee, SungJun Lee and SangGyoo Sim
Software Laboratories, Corporate Technology Operations, Samsung Electronics Co., LTD
416, Maetan-3Dong, Yeongtong-Gu, Suwon-City, 443-742, Republic of Korea
Keywords: Metadata, ontology, semantic web, adaptive user interface.
Abstract: This paper presents an eXtensible mETAdata system (XETA system) that lets end-users
organize and extend the structure of metadata. We discuss four requirements for a flexible metadata
system in the semantic web and a methodology for meeting them. Using the XETA system, the
end-user can flexibly extend metadata, enhance its semantic accuracy and selectively apply the metadata in
context. The main purpose of the XETA system is to give end-users a way to construct metadata bottom-up,
rather than forcing them to accept a fixed form and fixed meanings for metadata.
1 MOTIVATIONS
Metadata are data about data; they describe resources for
purposes such as identification, discovery, assessment
and management. The emergence of the semantic
web calls for more powerful metadata systems.
There is much related work on metadata
systems. Traditionally, fixed-structure metadata lets
the user annotate resources with plain text according to a
template structure, such as Dublin Core (DCMI,
2007). This approach has the advantage of providing
accurate and systematic information about resources, but
it does not allow the end-user to freely organize the
structure of metadata, and, even though it has some
extensibility, it cannot satisfy all areas and all users.
The tag-based annotation method lets the end-user
freely arrange keywords to describe
resources. This approach has unlimited extensibility
but no systematic structure. It is therefore limited in
recognizing the relationships among keywords and in
understanding a keyword semantically. To solve this
problem, a concept model was proposed to
implement the semantics of tags in a Tag Ontology (Yang
and Ishizuka, 2007). Ontology-based semantic
annotation is one of the major techniques for producing
machine-understandable data, that is,
semantically interlinked metadata, and the
requirements for a component-based,
ontology-driven annotation framework have been
discussed (Handschuh,
Staab and Maedche, 2001). These approaches
provide a way to enhance the semantic accuracy of
metadata but not a way for end-users to extend
metadata efficiently. However, they might be
used to interlink tags as a backend process of our
proposed extensible metadata model.
The Resource Description Framework (RDF) is a
language for representing information about
resources, particularly intended for representing
metadata about Web resources (Hayes, 2004). RDF
precisely identifies the relationships that exist
between linked items; however, it does not
give the end-user an intuitive way to organize and
extend metadata.
An ontology is a formal, explicit specification of a
shared conceptualization of a domain of interest
(Gruber, 1993). The concept and realization of
ontology have become more important in the semantic web.
The concept of ontology includes an upper-level
ontology and a number of domain ontologies. In
practice, it is very difficult to construct an upper-level
ontology across all domains. Even where it is
possible, construction is expensive, and it is doubtful
whether general end-users can conveniently use the many
concepts defined in the ontology. The SUMO
ontology (Niles and Pease, 2001) contains almost
1000 concepts and most of them are unintuitive,
which makes them unusable for browsing. The
fundamental problem is the top-down approach to ontology,
which forces end-users to accept a fixed system of
meanings. Terminology is like an
organism: terms newly come into being, change and
disappear. A fixed ontology is a closed space that cannot
evolve by itself, and it does not allow end-users to
participate in organizing the system of meanings based
on their consensus.
Kim Y., Lee S., Lee S. and Sim S. (2008).
XETA: EXTENSIBLE METADATA SYSTEM - For Extensibility, Accuracy, Suitability and Convenience.
In Proceedings of the Fourth International Conference on Web Information Systems and Technologies, pages 143-148
DOI: 10.5220/0001518101430148
Copyright © SciTePress
2 REQUIREMENTS
As generating and acquiring enormous amounts of
content has become easier, the end-user has had to manage
much more content. Recently end-users have
become “prosumers”: they have assumed the
responsibility to provide not only content but also
adequate metadata for it, because content
needs well-organized metadata to
attract attention among the large amount of content
on the web.
The proposed system focuses on letting end-users
organize a metadata system in a local domain. It
grows from an empty space of meanings into a
well-organized metadata space through consensus based on
collaboration.
Based on the above motivations, the requirements of
the proposed metadata system are as follows:
Extensibility: users can extend metadata at any point, individually or collaboratively.
Accuracy: metadata should be interpreted as users intend.
Suitability: users can decide whether or not to open a part of the metadata, according to their purposes.
Convenience: users can easily and reasonably organize and extend metadata.
2.1 Extensibility
A fixed metadata system lets the end-user
annotate with text according to a fixed template
structure, such as Dublin Core. This approach offers
some room to extend metadata, but the end-user
cannot freely organize his own metadata structure.
The problem is that many new content areas have
emerged and existing areas have been
subdivided in more detail. Even content service
providers have had difficulty designing metadata
systems suitable for all content areas. Acquiring
domain-specific knowledge and satisfying users’
diverse demands for metadata is costly.
Moreover, at present the end-user is more limited
in describing metadata for his content than
commercial content providers are.
No one knows in advance which form
metadata should take in evolving areas. The
fields of content have become so broad and variable
that a metadata system with a fixed form has become
inefficient. Therefore a metadata system for the
Web 2.0 generation should be extensible, so that
the structure of metadata can be designed according to the
end-user’s immediate needs in broad and variable
fields. Such extensibility should be available
not only to an individual but also to collaborative
intelligence.
2.2 Accuracy
In the semantic web, one important consideration in finding
content that matches the user’s intention is
that accurate metadata should be attached to content
before semantic search engines are developed. A method
is needed to give “meanings” to metadata, because
simply arranging keywords or tags lacks
information from a semantic point of view. Content that is
unclearly described often fails to attract attention on the
web. Some keywords may be polysemous, or a series
of keywords may have ambiguous relationships
among them. For example, “cats” may be a kind of
pet or the title of a musical. “Silver” may be a
colour or a metal. When the two words are
combined, “silver cat” may mean a living cat with silver
hair or a silver accessory in the shape of a cat.
2.3 Suitability
Prosumers have both the responsibility and the right to
open adequate metadata for their content. Even
the same content needs different metadata in different
contexts. For example, for a flower picture taken
with a digital camera, flower websites and digital
camera websites require different information.
People visiting flower websites may want
information about the species, habitat, colour or
blooming season of the flower. On the other hand,
people visiting digital camera websites may
want information about the body model, lens,
functions or configuration of the camera. This
means that the user would have to write different metadata
each time he uploads the content to a different website.
One solution to this problem is for the user
to configure metadata for each purpose. While the content
is shared, the metadata configuration is maintained
and the metadata is self-filtered in context. To
implement this function, each field of metadata
should carry an option for opening it to specific
domains, and the fields should be able to be handled
individually or combined freely.
WEBIST 2008 - International Conference on Web Information Systems and Technologies
2.4 Convenience
A well-designed metadata system should consider
the convenience of users. Although the proposed
metadata system makes it possible for an end-user to
extend metadata accurately and suitably, it
would be disregarded if common users found it
difficult to use.
Therefore, a user interface is needed that guides
users in the right direction and helps them
construct proper metadata more conveniently.
3 METADATA SYSTEM
This section describes the proposed metadata system
to meet the four requirements presented in section 2.
3.1 Architecture
Figure 1 illustrates the functional architecture of the
proposed extensible metadata system. A user device
includes the following components to support the
extensible metadata system:
Media Manager: manages metadata and negotiates content policies with the Policy Negotiator
Media Container: consists of content and its extensible metadata
Policy Negotiator: filters information according to users’ preferences
Policies Repository: stores policies on content, device, privacy and so on
Communication Module: communicates with external devices or networks
Figure 1: Functional Architecture of the Proposed
Metadata System.
The Media Manager controls the generation and updating of
metadata. The generated or updated metadata is
stored with the corresponding content in the Media
Container. When the user transfers media to
external devices or networks, the Media Manager calls
the Policy Negotiator to filter the metadata. The Policy
Negotiator produces a merged policy by
comparing the metadata configuration of the content
with the content policy from the Policies Repository.
According to the merged policy, the Media Manager
transfers the media with the appropriate metadata
through the Communication Module to the Contents Servers.
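The filtering flow described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the names FieldConfig, ContentPolicy, negotiate and filter_metadata, the rule that a field is sent only when both the metadata configuration and the content policy allow it, and the destinations flowers.example and cameras.example are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Set


@dataclass(frozen=True)
class FieldConfig:
    """Metadata configuration for one field, set by the content owner (hypothetical)."""
    value: str
    # Destinations this field is opened to; None means "open to any destination".
    open_to: Optional[FrozenSet[str]] = None


@dataclass(frozen=True)
class ContentPolicy:
    """A content policy as it might be kept in the Policies Repository (hypothetical)."""
    # Fields that must not leave the device when the destination is a public network.
    hidden_on_public: FrozenSet[str] = frozenset()


def negotiate(config: Dict[str, FieldConfig], policy: ContentPolicy,
              destination: str, is_public: bool) -> Set[str]:
    """Policy Negotiator step: merge both inputs into the set of sendable fields."""
    allowed = set()
    for name, field_config in config.items():
        if field_config.open_to is not None and destination not in field_config.open_to:
            continue  # the owner did not open this field to this destination
        if is_public and name in policy.hidden_on_public:
            continue  # the content policy hides this field on public networks
        allowed.add(name)
    return allowed


def filter_metadata(config: Dict[str, FieldConfig], allowed: Set[str]) -> Dict[str, str]:
    """Media Manager step: keep only the fields permitted by the merged policy."""
    return {name: c.value for name, c in config.items() if name in allowed}


# A flower picture taken with a digital camera (all values are made up).
picture = {
    "authorship": FieldConfig("J. Doe"),
    "species": FieldConfig("rose", frozenset({"flowers.example"})),
    "lens": FieldConfig("50mm", frozenset({"cameras.example"})),
}
policy = ContentPolicy(hidden_on_public=frozenset({"authorship"}))

merged = negotiate(picture, policy, "flowers.example", is_public=True)
print(filter_metadata(picture, merged))  # {'species': 'rose'}
```

In this sketch the merged policy is simply the intersection of what the owner opens and what the content policy permits; a real negotiation with a content server could be considerably richer.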
3.2 Tag Object
The proposed metadata system treats metadata
as a set of tags. The tags should support
extensibility, accuracy and suitability. To implement
such functional tags, the proposed system manages a tag
as an object, not a character string.
A Tag Object is an object that contains the attributes and
functions needed to implement the meanings of metadata. Tag
Objects are classified into standard tags and extended tags.
Standard tags form a basic metadata structure
defined by service providers or applications; they
are recommended to cover general and minimal
categories. A standard tag may also be a Tag Object determined
by the consensus of domain users. Extended tags
form a user-defined metadata structure, which
extends standard tags and makes them more detailed. An
extended tag is a Tag Object entered by the end-user to add
meanings to the domain metadata system. A Tag
Object has an attribute, either view or element. A view
tag represents a view, category, purpose or role in
order to clarify the concept and usage of a tag, whereas
an element tag represents information describing the
content (see Table 1). View tags and element tags
correspond to ontological concepts and instances, respectively.
We introduce a simple method for designing a Tag
Object but do not give full details in this paper.
The Single Tag Object Method shown in Table 2
implements a Tag Object containing a tag with a
single meaning. This method has the advantage that
multiple tags can be freely connected and reused. To
provide more powerful functions, two or more Single
Tag Objects could be combined. A Single Tag
Object contains the following fields: ID, attribute,
classification, function and reserved. The function
field can define diverse functions for managing
metadata. For example, a function that connects Tag
Objects based on the relationships among tags can
compute and store rates between the Tag Object and
other Tag Objects.
[Figure 1 (diagram) labels: User; Media Manager; Media Container (Contents, Extensible Metadata); Policy Negotiator; Policies Repository; Communication Module; Contents Servers (Site #1, Site #2, Site #N)]
Table 1: Classification and Attributes of Tags.
Classification       Attribute  Description
Standard Tag Object  View       View, category, purpose and role to expatiate element tag, provided by applications
Standard Tag Object  Element    Keywords to explain contents, provided by applications
Extended Tag Object  View       View, category, purpose and role to expatiate element tag, defined by users
Extended Tag Object  Element    Keywords to explain contents, defined by users
Table 2: Single Tag Object Method.
Component       Description
ID              character string
attribute       [ view | element ]
classification  [ standard | extension ]
function        function for metadata management
reserved        reserved for application
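As a rough illustration of Table 2, a Single Tag Object could be modelled as a small Python data class. The paper does not specify the types or the signature of the function field, so the callable signature, the use of the reserved field to hold rates, the helper name connect and the rate value 0.3 are all assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable, Dict, Optional


class Attribute(Enum):
    VIEW = "view"
    ELEMENT = "element"


class Classification(Enum):
    STANDARD = "standard"
    EXTENSION = "extension"


@dataclass
class SingleTagObject:
    """One tag with a single meaning, with the five fields listed in Table 2."""
    id: str
    attribute: Attribute
    classification: Classification
    # Optional callable for metadata management; its signature is an assumption.
    function: Optional[Callable[..., Any]] = None
    # Application-specific storage; using it to hold rates is an assumption.
    reserved: Dict[str, Any] = field(default_factory=dict)


def connect(tag: SingleTagObject, other: SingleTagObject, rate: float) -> None:
    """Example 'function': store the rate between `tag` and another Tag Object."""
    tag.reserved.setdefault("rates", {})[other.id] = rate


cats = SingleTagObject("cats", Attribute.ELEMENT, Classification.STANDARD, function=connect)
animal = SingleTagObject("animal", Attribute.ELEMENT, Classification.EXTENSION)

# The rate value 0.3 is illustrative only.
cats.function(cats, animal, 0.3)
print(cats.reserved)  # {'rates': {'animal': 0.3}}
```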
3.3 Collaborative Metadata
The proposed metadata system supports
extensibility of the metadata structure not only for an
individual user but also for a collaborative user group.
Content servers include the websites to which a user
device uploads content. A content server has
policies related to content and accepts only
appropriate content through negotiation with the
user device. The uploaded content may contain
extended metadata, called user-defined
metadata. The content server can extract a metadata
structure from the user-defined metadata and adopt
it as a candidate metadata structure for its specific
domain. The adopted metadata structure can be
reused by other users of the website. Another
user can also build a more elaborate metadata structure
based on the candidate. That is, the
metadata system can be efficiently enhanced through
collaboration in social networks. How to evaluate
and adopt user-defined metadata structures is outside
the scope of this paper.
3.4 Example
This section describes the functional flow of an
example of extending metadata. Assume that John
has a multimedia device supporting the proposed
metadata system. John makes a movie about
cats (the animals) and writes metadata for the movie. For
example, “authorship” and “content” fields could be
provided as the basic metadata structure. He enters
an element tag [John] in the “authorship” field and an
element tag [cats] in the “subject” field of the content
information. “Authorship,” “content” and “subject”
are classified as view tags, which describe the
categories to which [John] and [cats] belong.
John thinks that the standard element tag [cats]
does not sufficiently represent the subject of his
movie, because other users might
confuse his content with the musical Cats (see
Figure 2). To clarify the meaning of [cats], he
decides to extend the metadata for [cats]. He can extend
metadata using two kinds of tags, i.e. element tags
and view tags. He enters an extended element tag
[animal] as additional information about [cats]. Figure 3
shows an example of extending the metadata of his content
about cats. Other users now clearly see that his movie
is about the animals, not the musical Cats.
However, they still do not know which kind of
cats [cats] refers to.
Figure 2: Ambiguity of Tag Cats.
[Figure 2 (diagram) labels: “Cats”; ?; Musical “Cats”; Animal “Cats”]
Figure 3: Example of Extending Tag Objects about Cats.
[Figure 3 (diagram) labels: Standard Tags: Metadata, Authorship [John], Content, Subject [cats]. Extended Tags: [animal], species [Persian], colour [silver]; target: www.animal.net, www.cats.net]
John decides to provide “species” and “colour”
as more information about his cats. He enters an
extended element tag [Persian] for the extended
view tag “species” and an extended element tag
[silver] for the extended view tag “colour”. At this
point, other users can form a more concrete image
of the cats in John’s movie.
John thinks that detailed information about the cats,
such as species and colour, is meaningful only in
specific domains such as a cat community or an animal
website. He does not want to impose useless
information on other users who do not need these
details. So John configures the
metadata to open the detailed information only to two
target websites, “http://www.animal.net” and
“http://www.cats.net.” Users can get the species
and colour information of the cats only if they find
his movie through the two target websites, not
through other websites.
With regard to privacy, John wants to
remain anonymous as the author of content
uploaded to public networks, so he configures
his content policy to hide the authorship of his content
when he transfers it to public
networks. Even though his personal information is
written in the metadata of his content, it
is filtered out according to the
content policy when the movie is transferred to any
public network.
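John's extended metadata can be pictured as a small tree of tags in which the extended view tags carry their target websites. The sketch below is only one possible reading of the example: the class name Tag, the function visible, the rule that a restricted tag hides everything below it, and the non-target site www.example.org are assumptions, and the separate authorship policy for public networks is not modelled here.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional


@dataclass
class Tag:
    name: str
    attribute: str       # "view" or "element"
    classification: str  # "standard" or "extended"
    # Websites this tag (and everything below it) is opened to; None means all.
    targets: Optional[FrozenSet[str]] = None
    children: List["Tag"] = field(default_factory=list)

    def add(self, child: "Tag") -> "Tag":
        """Attach `child` below this tag and return it, so calls can be chained."""
        self.children.append(child)
        return child


def visible(tag: Tag, site: str, depth: int = 0) -> List[str]:
    """Return the indented tag lines that `site` is allowed to see."""
    if tag.targets is not None and site not in tag.targets:
        return []
    label = f"[{tag.name}]" if tag.attribute == "element" else tag.name
    lines = ["  " * depth + label]
    for child in tag.children:
        lines.extend(visible(child, site, depth + 1))
    return lines


# Standard tags provided as the basic metadata structure.
metadata = Tag("Metadata", "view", "standard")
metadata.add(Tag("Authorship", "view", "standard")).add(Tag("John", "element", "standard"))
subject = metadata.add(Tag("Content", "view", "standard")).add(Tag("Subject", "view", "standard"))
cats = subject.add(Tag("cats", "element", "standard"))

# Extended tags entered by John; the details are opened to two target websites only.
target_sites = frozenset({"www.animal.net", "www.cats.net"})
animal = cats.add(Tag("animal", "element", "extended"))
animal.add(Tag("species", "view", "extended", target_sites)).add(Tag("Persian", "element", "extended"))
animal.add(Tag("colour", "view", "extended", target_sites)).add(Tag("silver", "element", "extended"))

on_cat_site = visible(metadata, "www.cats.net")
elsewhere = visible(metadata, "www.example.org")
print("\n".join(on_cat_site))
print("\n".join(elsewhere))
```

On a target website the listing includes the species and colour tags; on any other site it stops at [animal].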
3.5 Relational Tag Bridge
The proposed model provides the functionality for
extensibility, accuracy and suitability, but it
entrusts the end-user with the delicate work of realizing
an ontology for metadata. It is difficult for the
end-user to construct a well-organized metadata
structure if he is not an expert in the domain. To
enhance the end-user’s convenience, the system should
provide a user interface that recommends, at the proper
time, appropriate tags used in the domain.
For this, we propose the concept of the Relational Tag
Bridge, which enables an end-user to construct and
extend his own metadata structure easily and
reasonably. A Relational Tag Bridge is a kind of graph
consisting of tags and the rates among them. It is both a
method for managing the metadata database in a domain
server and an adaptive user interface that gives
end-users access to proper tags.
The key to the Relational Tag Bridge is the rating
method used to evaluate the relevance of tags. Rates are
flexibly determined and updated by users. The rate
between tags increases whenever a user enters them
together in a series of tags for a piece of content. That is,
tags selected together by many users can be
considered to have a meaningful
relationship in the domain, so such tags
have high rates; a tag is recommended to
other users if its rate exceeds a critical value.
To prevent dominant tags from persisting forever,
the user may see a wider range of tags related to the
focused tag by lowering the critical value in his
local metadata system. Or he may check recent
candidates to apply more diverse tags related to the
focused tag.
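The rating idea can be sketched with simple co-occurrence counting. The paper leaves the rate update method out of scope, so the normalisation used here (the share of tag series containing the focused tag that also contain the other tag) is an assumption, as are the class and method names and the sample tag series.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, List, Tuple


class RelationalTagBridge:
    """Graph of tags whose edge rates come from co-occurrence in users' tag series."""

    def __init__(self) -> None:
        self.pair_counts: Dict[Tuple[str, str], int] = defaultdict(int)
        self.tag_counts: Dict[str, int] = defaultdict(int)

    def record(self, tags: Iterable[str]) -> None:
        """Register one series of tags entered together for a piece of content."""
        unique = sorted(set(tags))
        for tag in unique:
            self.tag_counts[tag] += 1
        for pair in combinations(unique, 2):  # pairs are stored in sorted order
            self.pair_counts[pair] += 1

    def rate(self, focused: str, other: str) -> float:
        """Assumed rate: fraction of series with `focused` that also contain `other`."""
        seen = self.tag_counts.get(focused, 0)
        if seen == 0:
            return 0.0
        pair = tuple(sorted((focused, other)))
        return self.pair_counts.get(pair, 0) / seen

    def recommend(self, focused: str, critical: float = 0.1) -> List[Tuple[str, float]]:
        """Tags related to `focused` whose rate exceeds the critical value."""
        related = set()
        for a, b in self.pair_counts:
            if a == focused:
                related.add(b)
            elif b == focused:
                related.add(a)
        rated = [(tag, self.rate(focused, tag)) for tag in related]
        kept = [(tag, r) for tag, r in rated if r > critical]
        return sorted(kept, key=lambda item: (-item[1], item[0]))


bridge = RelationalTagBridge()
bridge.record(["cats", "animal", "species", "Persian"])
bridge.record(["cats", "animal", "pet"])
bridge.record(["cats", "musical", "Broadway"])
bridge.record(["cats", "musical", "ticket"])

print(bridge.recommend("cats", critical=0.4))  # [('animal', 0.5), ('musical', 0.5)]
print(len(bridge.recommend("cats", critical=0.2)))  # 7
```

Lowering the critical value from 0.4 to 0.2 widens the recommendation from two dominant tags to all seven related tags, which mirrors the mechanism described for escaping dominant tags.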
In the example of Figure 2, suppose that
John does not know which tags are appropriate for
providing rich metadata for his content. John enters the
tag “cats”, but he does not know what kind of
ambiguity the tag “cats” introduces. At this point,
the Relational Tag Bridge can inform the user of
possible categories related to the entered tag “cats”:
“pet,” “animal,” “species,” “musical,” and “CF” (see
Figure 4). He selects “animal”, so “species” and
“Chordate” are recommended as the detailed views.
If he selects “species,” he can obtain more
information about the species of cat. Finally, he selects
“Persian” and finishes entering metadata for his
content. Based on the history of selected tags,
“Persian” means a Persian cat, not a Persian person;
“cats” has very low relevance to a Persian person at
this point. At any point, John can decide whether
or not to extend his metadata.
An advantage of the Relational Tag Bridge is that it
effectively guides the user to express what he wants
to explain. According to the user’s selection, the
focused tag and its related tags change dynamically
and constitute a focused view at each point
in time. The focused view effectively delivers the
necessary amount of information to the user.
The Relational Tag Bridge also affects the inflow
and spread of new knowledge in a domain. That is,
users can gain new knowledge about their interests by
encountering new terminology or categories through
the focused view. Another advantage of the Relational
Tag Bridge is that it builds the structure of a huge
amount of metadata through collaboration based on
folksonomy. Such an approach helps content or
service providers reduce the cost of constructing a
metadata structure that reflects all users’ requirements
in a specific domain.
Its limitation is that maintaining and managing rates
among all tags is very costly. One
way to mitigate this problem is to restrict the number
of related tags and to update rates periodically rather
than in real time. The method of updating rates is outside
the scope of this paper.
Figure 4: Example of Relational Tag Bridge.
4 DISCUSSIONS
The main purpose of this paper is to find a way to
organize a flexible and intuitive metadata system,
rather than to design an ontology system
containing fixed concepts for a domain. We
considered the four suggested requirements:
extensibility, accuracy, suitability and convenience.
As shown in the example above, (a) the user can freely
extend metadata using extended tags, (b) the
meaning of each tag is made clear using
view tags, which explain the element tag from a specific
view, (c) the user has the right to control metadata by
adopting or rejecting the metadata in context and (d)
the user can conveniently reuse common tags refined
by collective intelligence in the domain.
As the user organizes metadata, the organized
tags can be interlinked using the attributes and
functions of Tag Objects. This concept differs
from models that automatically extract and
interlink data from the resource, which generally yield
low quality. The proposed model helps the user to
interlink metadata semantically. The method for modelling
Tag Objects, i.e. the Single Tag Object Method, is a
kind of backend process for efficiently managing the
extended metadata. Such a backend process might be
replaced by several of the ontology models mentioned in
the introduction.
The links among Tag Objects are not static but dynamic
and statistical in their semantics; this is a bottom-up,
folksonomy approach driven by end-users. View tags
differ from existing ontological
categories in that view tags can be freely connected to or
disconnected from each other, and they are not required for
every element tag. Tag Objects are connected intuitively
according to the user’s needs rather than
systematically. A metadata system built by an individual
might be vulnerable, but a collaborative metadata
system could evolve into the form most needed by
the domain.
5 CONCLUSIONS
In this paper we presented a concept model of an extensible
metadata system, the XETA system.
The XETA system can implement an intuitive and
flexible metadata structure by creating and
connecting Tag Objects. It provides a method for
end-users to easily extend metadata using the Relational
Tag Bridge. We hope this offers a new approach to constructing
metadata systems bottom-up on the web.
Future work on the XETA system includes the
following:
How to design an effective Tag Object
How to evaluate the quality of metadata structures defined by users
How to manage candidate metadata structures on the content server side.
REFERENCES
DCMI (2007) Dublin Core Metadata Initiative. [Online], Available: http://dublincore.org/
Hayes, P. (2004) RDF Semantics. W3C Technical report, W3C, W3C Proposed Recommendation. [Online], Available: http://www.w3.org/TR/2004/REC-rdf-mt-20040210/
Yang, J. and Ishizuka, M. (2007) ‘Social Graphic Tagging for Semantic Metadata and a Case Study on Consensus Discovery’, IJCAI-07 [CD-ROM], Workshop on Semantic Web for Collaborative Knowledge Acquisition, Hyderabad, India, p. 7.
Handschuh, S., Staab, S. and Maedche, A. (2001) ‘CREAM – Creating relational metadata with a component-based, ontology-driven annotation framework’, In Conference Proceedings, International Conference on Knowledge Capture, Vancouver.
Gruber, T. R. (1993) ‘A Translation Approach to Portable Ontology Specifications’, Knowledge Acquisition, vol. 6 no. 2, June, pp. 199-221.
Niles, I. and Pease, A. (2001) ‘Towards a Standard Upper Ontology’, In Proceedings of the 2nd International Conference on Formal Ontology in Information Systems (FOIS-2001), Chris Welty and Barry Smith, eds, Ogunquit, Maine.
Naphade, M. R. and Huang, T. S. (2000) ‘Semantic video indexing using a probabilistic framework’, IAPR International Conference on Pattern Recognition, Barcelona, vol. 3, pp. 83-88.
[Figure 4 (graph) labels: nodes cats, pet, CF, movie, food, musical, Broadway, review, ticket, species, animal, Feline, Chordate, Mammals, Carnivore, Abyssinian, Persian, character, origin, Siamese; edges carry rates between 0.01 and 0.4; the legend separates rates below 0.1 from the others and marks the focused view]