Progressive Semiotic Enrichment
Designing Learning Content Metadata for Web 3.0
Lars Johnsen and Jens Jørgen Hansen
Department of Design and Communication, University of Southern Denmark, Engstien 1, Kolding, DK-6000, Denmark
Keywords: Metadata, Content Development, Web 3.0, Semiotics, Semantics, Design, Progressive Enhancement.
Abstract: Web 3.0 allows learning content to be semantically annotated thus facilitating improved information
retrieval, reuse and integration. This short paper presents a design pattern for progressively describing and
annotating learning content components in Web 3.0 based on key concepts adopted from social semiotics.
Furthermore, the paper exemplifies how this design pattern may be encoded using a structured data format
such as RDFa Lite and a general-purpose vocabulary like schema.org. Finally, some potential benefits of
this approach are briefly touched upon.
1 INTRODUCTION
As structured data formats such as Microdata and
RDFa (Lite) and vocabularies like schema.org,
ALOCOM, LRMI and SKOS are maturing and
becoming widely available, learning content on the
Web can now be described and annotated in greater
detail thus facilitating increasingly sophisticated
automatic information processing (retrieval,
rendering, reuse and integration). In theory,
embedded learning content objects like images,
diagrams and videos can be unbundled and reused
across disparate materials and contexts; content can
be shown in different modes and forms and data sets
about a specific concept, topic or event can be
aggregated from different sources.
This potential is closely related to the
introduction and application of HTML5, the latest
version of the popular format, or markup language to
be more precise, used in most existing web pages,
including web-based learning materials. Besides
enhanced functionality for handling multimedia,
interactivity and content organization, HTML5
affords a set of mechanisms for embedding
structured data to describe the semantic contents of
web documents: what they are about, what their
communicative purpose is, etc.
Mature and easy-to-use standards and
technologies, of course, are not enough. To create
reusable learning content and to build working e-
learning solutions in Web 3.0, sound and viable
design approaches need to be devised, implemented
and tested. For instance, content developers need
design patterns for structuring and semantically
annotating content in ways that are meaningful,
transparent, consistent and scalable.
In this short paper, a design pattern for
describing and annotating embedded learning
content components in Web 3.0 is proposed. A
design pattern may be defined as a general solution
to a recurring design problem.
The design pattern presented here may be said to
be theory-driven in the sense that it draws on key
concepts adopted from social semiotics, a theoretical
framework for analyzing meaning and meaning
making in multimodal materials. But it is also
practice-oriented insofar as it may be applied as an
integral part of a concrete web design and
development methodology such as progressive
enhancement (see below). The design pattern itself
allows content developers to enrich learning objects
with inline metadata detailing their semiotic
characteristics, notably their medium, representation,
genre and metafunctions. By employing such a
design pattern, content developers may explicitly,
and in a standardized manner, link documents to
domains, or more abstractly, resources to reality.
In addition, the paper exemplifies how this
design pattern may be encoded using a structured
data format such as RDFa Lite and a general-
purpose vocabulary like schema.org. Finally, some
potential benefits of this approach are briefly hinted
at.
172
Johnsen L. and Hansen J..
Progressive Semiotic Enrichment - Designing Learning Content Metadata for Web 3.0.
DOI: 10.5220/0004403801720176
In Proceedings of the 5th International Conference on Computer Supported Education (CSEDU-2013), pages 172-176
ISBN: 978-989-8565-53-2
Copyright
c
2013 SCITEPRESS (Science and Technology Publications, Lda.)
2 PROGRESSIVE
ENHANCEMENT
These days, much web design and development (for
learning) is based on the notion of progressive
enhancement. The idea is that basic content and
functionality should be available to all users
irrespective of which browser (version) they are
using, what hardware platform they employ or what
assistive technology they may need (e.g. screen
readers). In a word, basic content and functionality
should be accessible. Content and functionality can
then be progressively enhanced through presentation
(typography and layout) or interactivity so that those
having the newest browser (version) get a richer or
an aesthetically more pleasing web experience (see
for instance Allsopp, 2009 and Gustafson, 2011).
The ideal of progressive enhancement can be
built directly into a HTML5 web document as
shown in figure 1.
Figure 1: Progressive enhancement.
The core of the document is the content: text, images
and multimedia objects. This content is organized
into a transparent formal structure typically
consisting of sections, subsections, headings,
paragraphs and so on. These elements may then be
encoded to signal their semantics. Further
information can be added to specify what the
presentation of the document parts is going to look
like, while interaction elements finally provide a
way for users to control and navigate the document.
What is important here is the layered nature of the
model: outer layers have inner layers within their
scope. Interaction can be applied to presentation
(e.g. a button for changing color or font-size), and
semantics to structure (e.g. markup to signal that a
certain paragraph is a procedure or a summary).
The construction of structure, semantics,
presentation and interaction is normally done using
technologies like HTML tags (structure), structured
data formats like Microformats, Microdata or RDFa
(semantics), CSS (presentation) and JavaScript
(interaction) as shown in figure 2.
Figure 2: Technologies for progressive enhancement.
2.1 Semantic Enrichment
As for the semantic, or descriptive, layer, the key
question is not only what semantic characteristics to
capture and annotate, and at what level of detail, but
also what model to adopt.
We propose a content metadata design pattern,
which is loosely grounded in semiotic theory, more
specifically social semiotics (see for instance Kress
and Van Leeuwen, 2001; Van Leeuwen, 2005 and
Bezemer and Kress, 2008). In social semiotics, the
focus is on meaning and meaning making in
multimodal contexts. From a social semiotic
perspective, meaning is constructed through
semiotic resources, interacting complexes of signs,
in a variety of modes (writing, images, layout,
gesture, etc.) and distributed through different media
both electronic and physical. Signs are shaped
through genres and can perform different
communicative functions: they can construe reality
(ideational meaning), create communicative
coherence (textual meaning) or relate
speakers/writers to addressees (interpersonal
meaning).
The model proposed here presents a web-based
learning object as a semiotic resource having (at
least) four central facets: medium, representation,
genre and metafunctions. The model is depicted in
figure 3:
Medium may be perceived as the channel, or
frame, through which the content is communicated
or distributed. So, a medium may be a blog posting,
a wiki page or a social medium like FaceBook.
Representation refers to an objects multimodal
representation. Is it a written text, an image, a video
object or a combination or modes?
ProgressiveSemioticEnrichment-DesigningLearningContentMetadataforWeb3.0
173
Figure 3: Learning objects as semiotic resources.
A genre is a goal-oriented semantic configuration or
template “for getting things done” communicatively
in a certain culture (see Martin and Rose, 2008 and
Vorvilas et al., 2011). Examples of genres are
stories, procedures and reports (broadly descriptions
of entities). Such genres may combine to form
macrogenres, typical of larger units of educational
material (e.g. textbooks).
Metafunctions are the semantic elements that can
be identified in an object. Ideational meaning is
what the object is about (persons, places, events and
concepts), textual meaning is what makes the object
(more or less) cohesive and coherent, internally and
externally, and interpersonal meaning is meaning
associated with the relationship between the creator
or designer of the object and his or her audience.
The upper part of the learning object model in
figure 3 has mainly to do with the form of the
resource (signifier) while the lower part relates to its
content or meaning (signified). The arrows indicate
the "natural progression" in analysis from a concrete
communication channel to more or less elusive
meanings.
The four semiotic categories may themselves be
progressively refined. For instance, a learning
content component might not only be categorized as
an image but also as a diagram or even a concept
map. And a specific instance of a genre like
procedure could be subtyped as a food recipe or an
installation guide.
So, a learning content object might be described
along the following lines:
Medium:Blogposting>tweet
Representation:Image>diagram>conceptmap
Genre:Report>classifyingreport
Metafunction:Ideationalmeaning>concepts
(birds>birdsofprey>eagles,hawks,vultures)
That is to say, this bundle of metadata designates
a concept map published in a tweet on Twitter and
giving a classificational description of a set of
concepts, namely birds of prey.
Another object might have these characteristics:
Medium:Wikipage
Representation:Writing
Genre:History>historicalrecount
Metafunction:Ideationalmeaning>events(Battle
oftheLittleBighorn),persons(Custer,Crazy
Horse),places(LittleBighornRiver);textual
meaning>externallink>elaboration
Here a piece of writing in a wiki page gives a
historical recount of the Battle of the Little Bighorn
involving certain persons and places and containing
a link to an external resource providing additional
information on the event.
In other words, this semiotic descriptive model
may be conceived of as a kind of extensible facetted
classification in which any learning component may
belong to several types and subtypes or be viewed
from several perspectives.
Now, further information may be attached to
these four main facets. For instance, didactical
metadata may be added to the learning object as a
representation to indicate degree of interactivity,
learning style, etc. Ideational meanings may be
expanded: events may be located in place and time,
persons may be described or depicted and places
may be encoded with geospatial coordinates.
Also, the metadata may describe nested
structures, as it were. For instance, in a concept map
classifying birds of prey, there might be images
attached to the various types of bird. These images
may themselves be subjected to semantic description
and annotation.
Semiotic metadata should not be seen as an
alternative to more traditional metadata schemes
such as Dublin Core, LOM or SCORM (e.g.
Robertson, 2011) but rather as an extension. They
make it possible, it is believed, to tie a learning
resource more closely to its domain and to specify in
greater detail what communicative intentions it
seeks to realize. Moreover, as argued below, they
may also have a role to play as part of the actual
learning design of the resource in which they are
embedded.
2.2 Encoding
It is beyond the scope of this short paper to discuss
the many possibilities of representing semiotic
metadata as embedded semantic markup in HTML5
documents using structured data formats such as
Microdata and RDFa (Lite). The following example,
however, suggests one simple approach. It makes
CSEDU2013-5thInternationalConferenceonComputerSupportedEducation
174
use of RDFa Lite syntax and the schema.org
vocabulary:
<!DOCTYPE HTML>
<html>
<body vocab="http://schema.org">
<article typeof="Article
CreativeWork/Medium/Wikipage
CreativeWork/Representation/Writing
CreativeWork/Genre/Report/DescriptiveRe
port">
<div
property="about/metafunction/ideational
Meaning" typeof="Person/General">
<h1 property="name">General Custer</h1>
<p>George Armstrong Custer(December 5,
1839 – June 25, 1876) was a United
States Army officer and cavalry
commander in the <span
property="performerIn"
typeof="Event"><span
property="name">American Civil
War</span></span> and the Indian Wars.
Raised in Michigan and Ohio, Custer was
admitted to West Point in 1858, where
he graduated last in his class.
However, with the outbreak of the Civil
War, all potential officers were
needed, and Custer was called to serve
with the Union Army.</p>
<img property="image"
src="http://upload.wikimedia.org/wikipe
dia/commons/thumb/8/83/Custer_Portrait_
Restored.jpg/250px-
Custer_Portrait_Restored.jpg" />
</div>
</article>
</body>
</html>
This snippet of text copied from Wikipedia is
marked up with common HTML5 elements such as
<body>, <article>, <p>, <h1> and <img> to indicate
structural components like paragraph, heading and
image. These elements contain semantic markup
based on schema.org classes and properties. They
formally state that this is an article about a person
whose name is Custer who was a performer in an
event whose name was the American civil war and
that it is indeed his image that is included in the text.
The schema.org vocabulary allows content
developers to extend and customize its categories
and properties. For instance, in the representation
above it is specified that Custer was not only a
person but in fact also a general
(
typeof="Person/General").
And it is this extension mechanism we have
employed to include semiotic metadata. To indicate
the medium, representation and genre of the
resource we have extended, or specialized, the
schema.org category of Creative Work and to signal
ideational meanings in this text, we have extended
the about property. (Search engines like Google's do
not know about these extended notions of general,
medium and ideational meaning and so on, of
course, but they will still be able to act on the core
categories and properties like person and about).
In figure 4 a semantic/semiotic representation of
the resource is shown when fed to an RDFa
processor (http://rdfa.info/play).
Figure 4: Semiotic representation of sample text.
For the sake of simplicity, only one vocabulary is
instantiated in the Custer example. But structured
data formats like Microdata and RDFa (Lite)
actually allow several schemas to be mixed in an
HTML5 document. For instance, in a longer
document we might have included categories and
properties from the FOAF vocabulary (“Friend of a
friend”) to specify General Custer’s relations to
other historical persons or referred to the Dublin
Core vocabulary to add details about the document
as a digital resource. And if the text had contained
any hypertext links, we might have pointed to
classes in the Salt Rhetorical Ontology to specify
their communicative function.
3 BENEFITS AND WIDER
PERSPECTIVES
What, then, are the benefits of marking up semiotic
resources, complexes of meaningful signs, in
learning materials? It may be argued that gains may
be achieved in the following areas:
Firstly, the quality of information retrieval will
no doubt be enhanced. It will eventually be possible
ProgressiveSemioticEnrichment-DesigningLearningContentMetadataforWeb3.0
175
to formulate more precise queries in search engines.
Queries like "Find a concept map in English
classifiying birds of prey and containing pictures of
them" are no longer but a futuristic dream. (Google's
Knowledge Graph is evidence of this trend).
Secondly, linking content components across
web sites may be done in more explicit and
principled ways. A hyperlink may denote an
ideational relation between two domain entities
(person A is the brother of person B) or a textual, or
communicative, relationship (paragraph A is a
summary of section B). And if global identiers are
used, this is effectively tantamount to exposing, or
attaching, learning objects to the Web of Data using
linked data (see Heath and Bizer, 2011).
Thirdly, the reuse of embedded content
components is also likely to become easier as these
will be "unbundled" to a greater extent.
Fourthly, embedded semiotic annotation may
provide an additional affordance, which has to do
with the learning potential of learning objects, rather
than just their retrieval, linking or reuse. The reason
is that such metadata may be construed, and utilized,
as what we may call semiotic enzymes, hidden
elements enabling learning designs to be
(dynamically) altered in various ways to cater for
different user preferences, learning styles, rendering
devices, etc. (see Johnsen, 2012). As an example,
inline semiotic markup may be used to actively
support one or more of Mayer's principles of
multimedia learning (Mayer, 2009), in particular his
"principle of signalling", i.e. the guideline
advocating the use of conceptual structure markers
in learning materials. Embedded semiotic tags could
be used, say, as source data for dynamically creating
graphic organizers, spatial arrangements intended to
visually map the conceptual or narrative structure of
a piece of text and hence facilitate its comprehension
(see Stull and Mayer, 2007).
And since semiotic encoding can be done using
standards like Microdata and RDFa (Lite), reusable
style sheets, templates or widgets processing these
semiotic metadata can be developed and shared on a
global scale, especially for widely used categories
like events, persons and places. For example, a
college professor publishing a history textbook on
the web might link the document to an external
widget creating a visual timeline based on the events
mentioned in the text. Or a learner might download a
browser plug-in to flag all occurrences of concepts
of interest when surfing the web.
This affordance opens a whole set of
opportunities that could, for lack of a better term, be
called "Learning Content Design as a Service"
(LCDaaS). The idea itself is simple: content
providers like professors and teachers will only have
to concentrate on constructing structured materials
("basic content") but will be able to link these
materials to (dynamic) designs and in this way
create richer and more engaging learning resources.
And users will have a greater say in deciding what
design options they want for the materials they study
(visual support, interactivity, etc.).
REFERENCES
Allsopp, J., 2009. Developing with Web Standards. Pearson
Education.
Bezemer, J. & Kress, G., 2008. Writing in Multimodal
Texts: A Social Semiotic Account of Designs for
Learning. Written Communication 2008 25: 166
Gustafson, A., 2011. Adaptive Web Design: Crafting Rich
Experiences with Progressive Enhancement.
Chattanooga, Tennessee: Easy Readers
Heath, T. & Bizer, C., 2011. Linked Data: Evolving the
Web into a Global Data Space (1st edition). Synthesis
Lectures on the Semantic Web: Theory and Technology,
1:1, 1-136. Morgan & Claypool.
Johnsen, L., 2012. HTML5, Microdata and Schema.org:
Towards an Educational Social-Semantic Web for the
Rest of Us? In Markus, H. & Martins, M. J. & Cordeiro,
J. (eds.): Proceedings of the 4th International
Conference on Computer Supported Education.
SciTePress - Science and Technology Publications.
Kress, G. & Van Leeuven, T., 2001. Multimodal Discourse.
The modes and media of contemporary
Communication. London: Arnold.
Martin, J. R. & Rose, D., 2008. Genre Relations: Mapping
Culture. Equinox Publishing ltd.
Mayer, R. E., 2009. Multimedia Learning. Cambridge
University Press. 2nd edition.
Ronallo, J., 2012. HTML5 Microdata and Schema.org.
Code{4}Lib Journal. http://journal.code4lib.org/
articles/6400
Robertson, R. J., 2011. Technical standards in education,
Part 5: Take advantage of metadata. IBM
DeveloperWorks. http://www.ibm.com/developer
works/industry/library/ind-edustand5/index.html?
cmp=dw&cpb=dwind&ct=dwnew&cr=dwnen&ccy=zz
&csr=031011.
Stull, A. T. & Mayer, R. E., 2007. Learning by Doing
Versus Learning by Viewing: Three Experimental
Comparisons of Learner-Generated Versus Author-
Provided Graphic Organizers. Journal of Educational
Psychology, 2007, Vol. 99, No. 4, 808–820.
Van Leeuwen, T., 2005. Introducing Social Semiotics.
London: Routledge
Vorvilas, G. & Thanassis, K. & Ravanis, K., 2011. A
Genre-based Framework for Constructing Content for
Learning Objects. Asian Journal of Computer Science
and Information Technology 1:1.
CSEDU2013-5thInternationalConferenceonComputerSupportedEducation
176