Progressive Semiotic Enrichment

Designing Learning Content Metadata for Web 3.0

Lars Johnsen and Jens Jørgen Hansen

Department of Design and Communication, University of Southern Denmark, Engstien 1, Kolding, DK-6000, Denmark

Keywords: Metadata, Content Development, Web 3.0, Semiotics, Semantics, Design, Progressive Enhancement.

Abstract: Web 3.0 allows learning content to be semantically annotated thus facilitating improved information

retrieval, reuse and integration. This short paper presents a design pattern for progressively describing and

annotating learning content components in Web 3.0 based on key concepts adopted from social semiotics.

Furthermore, the paper exemplifies how this design pattern may be encoded using a structured data format

such as RDFa Lite and a general-purpose vocabulary like schema.org. Finally, some potential benefits of

this approach are briefly touched upon.

1 INTRODUCTION

As structured data formats such as Microdata and

RDFa (Lite) and vocabularies like schema.org,

ALOCOM, LRMI and SKOS are maturing and

becoming widely available, learning content on the

Web can now be described and annotated in greater

detail thus facilitating increasingly sophisticated

automatic information processing (retrieval,

rendering, reuse and integration). In theory,

embedded learning content objects like images,

diagrams and videos can be unbundled and reused

across disparate materials and contexts; content can

be shown in different modes and forms and data sets

about a specific concept, topic or event can be

aggregated from different sources.

This potential is closely related to the

introduction and application of HTML5, the latest

version of the popular format, or markup language to

be more precise, used in most existing web pages,

including web-based learning materials. Besides

enhanced functionality for handling multimedia,

interactivity and content organization, HTML5

affords a set of mechanisms for embedding

structured data to describe the semantic contents of

web documents: what they are about, what their

communicative purpose is, etc.

Mature and easy-to-use standards and

technologies, of course, are not enough. To create

reusable learning content and to build working e-

learning solutions in Web 3.0, sound and viable

design approaches need to be devised, implemented

and tested. For instance, content developers need

design patterns for structuring and semantically

annotating content in ways that are meaningful,

transparent, consistent and scalable.

In this short paper, a design pattern for

describing and annotating embedded learning

content components in Web 3.0 is proposed. A

design pattern may be defined as a general solution

to a recurring design problem.

The design pattern presented here may be said to

be theory-driven in the sense that it draws on key

concepts adopted from social semiotics, a theoretical

framework for analyzing meaning and meaning

making in multimodal materials. But it is also

practice-oriented insofar as it may be applied as an

integral part of a concrete web design and

development methodology such as progressive

enhancement (see below). The design pattern itself

allows content developers to enrich learning objects

with inline metadata detailing their semiotic

characteristics, notably their medium, representation,

genre and metafunctions. By employing such a

design pattern, content developers may explicitly,

and in a standardized manner, link documents to

domains, or more abstractly, resources to reality.

In addition, the paper exemplifies how this

design pattern may be encoded using a structured

data format such as RDFa Lite and a general-

purpose vocabulary like schema.org. Finally, some

potential benefits of this approach are briefly hinted

at.

172

Johnsen L. and Hansen J..

Progressive Semiotic Enrichment - Designing Learning Content Metadata for Web 3.0.

DOI: 10.5220/0004403801720176

In Proceedings of the 5th International Conference on Computer Supported Education (CSEDU-2013), pages 172-176

ISBN: 978-989-8565-53-2

 2013 SCITEPRESS (Science and Technology Publications, Lda.)

2 PROGRESSIVE

ENHANCEMENT

These days, much web design and development (for

learning) is based on the notion of progressive

enhancement. The idea is that basic content and

functionality should be available to all users

irrespective of which browser (version) they are

using, what hardware platform they employ or what

assistive technology they may need (e.g. screen

readers). In a word, basic content and functionality

should be accessible. Content and functionality can

then be progressively enhanced through presentation

(typography and layout) or interactivity so that those

having the newest browser (version) get a richer or

an aesthetically more pleasing web experience (see

for instance Allsopp, 2009 and Gustafson, 2011).

The ideal of progressive enhancement can be

built directly into a HTML5 web document as

shown in figure 1.

Figure 1: Progressive enhancement.

The core of the document is the content: text, images

and multimedia objects. This content is organized

into a transparent formal structure typically

consisting of sections, subsections, headings,

paragraphs and so on. These elements may then be

encoded to signal their semantics. Further

information can be added to specify what the

presentation of the document parts is going to look

like, while interaction elements finally provide a

way for users to control and navigate the document.

What is important here is the layered nature of the

model: outer layers have inner layers within their

scope. Interaction can be applied to presentation

(e.g. a button for changing color or font-size), and

semantics to structure (e.g. markup to signal that a

certain paragraph is a procedure or a summary).

The construction of structure, semantics,

presentation and interaction is normally done using

technologies like HTML tags (structure), structured

data formats like Microformats, Microdata or RDFa

(semantics), CSS (presentation) and JavaScript

(interaction) as shown in figure 2.

Figure 2: Technologies for progressive enhancement.

2.1 Semantic Enrichment

As for the semantic, or descriptive, layer, the key

question is not only what semantic characteristics to

capture and annotate, and at what level of detail, but

also what model to adopt.

We propose a content metadata design pattern,

which is loosely grounded in semiotic theory, more

specifically social semiotics (see for instance Kress

and Van Leeuwen, 2001; Van Leeuwen, 2005 and

Bezemer and Kress, 2008). In social semiotics, the

focus is on meaning and meaning making in

multimodal contexts. From a social semiotic

perspective, meaning is constructed through

semiotic resources, interacting complexes of signs,

in a variety of modes (writing, images, layout,

gesture, etc.) and distributed through different media

both electronic and physical. Signs are shaped

through genres and can perform different

communicative functions: they can construe reality

(ideational meaning), create communicative

coherence (textual meaning) or relate

speakers/writers to addressees (interpersonal

meaning).

The model proposed here presents a web-based

learning object as a semiotic resource having (at

least) four central facets: medium, representation,

genre and metafunctions. The model is depicted in

figure 3:

Medium may be perceived as the channel, or

frame, through which the content is communicated

or distributed. So, a medium may be a blog posting,

a wiki page or a social medium like FaceBook.

Representation refers to an object’s multimodal

representation. Is it a written text, an image, a video

object or a combination or modes?

ProgressiveSemioticEnrichment-DesigningLearningContentMetadataforWeb3.0

173

Figure 3: Learning objects as semiotic resources.

A genre is a goal-oriented semantic configuration or

template “for getting things done” communicatively

in a certain culture (see Martin and Rose, 2008 and

Vorvilas et al., 2011). Examples of genres are

stories, procedures and reports (broadly descriptions

of entities). Such genres may combine to form

macrogenres, typical of larger units of educational

material (e.g. textbooks).

Metafunctions are the semantic elements that can

be identified in an object. Ideational meaning is

what the object is about (persons, places, events and

concepts), textual meaning is what makes the object

(more or less) cohesive and coherent, internally and

externally, and interpersonal meaning is meaning

associated with the relationship between the creator

or designer of the object and his or her audience.

The upper part of the learning object model in

figure 3 has mainly to do with the form of the

resource (signifier) while the lower part relates to its

content or meaning (signified). The arrows indicate

the "natural progression" in analysis from a concrete

communication channel to more or less elusive

meanings.

The four semiotic categories may themselves be

progressively refined. For instance, a learning

content component might not only be categorized as

an image but also as a diagram or even a concept

map. And a specific instance of a genre like

procedure could be subtyped as a food recipe or an

installation guide.

So, a learning content object might be described

along the following lines:

 Medium:Blogposting>tweet

 Representation:Image>diagram>conceptmap

 Genre:Report>classifyingreport

 Metafunction:Ideationalmeaning>concepts

(birds>birdsofprey>eagles,hawks,vultures)

That is to say, this bundle of metadata designates

a concept map published in a tweet on Twitter and

giving a classificational description of a set of

concepts, namely birds of prey.

Another object might have these characteristics:

 Medium:Wikipage

 Representation:Writing

 Genre:History>historicalrecount

 Metafunction:Ideationalmeaning>events(Battle

oftheLittleBighorn),persons(Custer,Crazy

Horse),places(LittleBighornRiver);textual

meaning>externallink>elaboration

Here a piece of writing in a wiki page gives a

historical recount of the Battle of the Little Bighorn

involving certain persons and places and containing

a link to an external resource providing additional

information on the event.

In other words, this semiotic descriptive model

may be conceived of as a kind of extensible facetted

classification in which any learning component may

belong to several types and subtypes or be viewed

from several perspectives.

Now, further information may be attached to

these four main facets. For instance, didactical

metadata may be added to the learning object as a

representation to indicate degree of interactivity,

learning style, etc. Ideational meanings may be

expanded: events may be located in place and time,

persons may be described or depicted and places

may be encoded with geospatial coordinates.

Also, the metadata may describe nested

structures, as it were. For instance, in a concept map

classifying birds of prey, there might be images

attached to the various types of bird. These images

may themselves be subjected to semantic description

and annotation.

Semiotic metadata should not be seen as an

alternative to more traditional metadata schemes

such as Dublin Core, LOM or SCORM (e.g.

Robertson, 2011) but rather as an extension. They

make it possible, it is believed, to tie a learning

resource more closely to its domain and to specify in

greater detail what communicative intentions it

seeks to realize. Moreover, as argued below, they

may also have a role to play as part of the actual

learning design of the resource in which they are

embedded.

2.2 Encoding

It is beyond the scope of this short paper to discuss

the many possibilities of representing semiotic

metadata as embedded semantic markup in HTML5

documents using structured data formats such as

Microdata and RDFa (Lite). The following example,

however, suggests one simple approach. It makes

CSEDU2013-5thInternationalConferenceonComputerSupportedEducation

174

use of RDFa Lite syntax and the schema.org

vocabulary:

<!DOCTYPE HTML>

<html>

<article typeof="Article

CreativeWork/Medium/Wikipage

CreativeWork/Representation/Writing

CreativeWork/Genre/Report/DescriptiveRe

port">

<div

property="about/metafunction/ideational

Meaning" typeof="Person/General">

<h1 property="name">General Custer</h1>

<p>George Armstrong Custer(December 5,

1839 – June 25, 1876) was a United

States Army officer and cavalry

commander in the <span

property="performerIn"

typeof="Event"><span

property="name">American Civil

War</span></span> and the Indian Wars.

Raised in Michigan and Ohio, Custer was

admitted to West Point in 1858, where

he graduated last in his class.

However, with the outbreak of the Civil

War, all potential officers were

needed, and Custer was called to serve

with the Union Army.</p>

<img property="image"

src="http://upload.wikimedia.org/wikipe

dia/commons/thumb/8/83/Custer_Portrait_

Restored.jpg/250px-

Custer_Portrait_Restored.jpg" />

</div>

</article>

</body>

</html>

This snippet of text copied from Wikipedia is

marked up with common HTML5 elements such as

<body>, <article>, <p>, <h1> and <img> to indicate

structural components like paragraph, heading and

image. These elements contain semantic markup

based on schema.org classes and properties. They

formally state that this is an article about a person

whose name is Custer who was a performer in an

event whose name was the American civil war and

that it is indeed his image that is included in the text.

The schema.org vocabulary allows content

developers to extend and customize its categories

and properties. For instance, in the representation

above it is specified that Custer was not only a

person but in fact also a general

(

typeof="Person/General").

And it is this extension mechanism we have

employed to include semiotic metadata. To indicate

the medium, representation and genre of the

resource we have extended, or specialized, the

schema.org category of Creative Work and to signal

ideational meanings in this text, we have extended

the about property. (Search engines like Google's do

not know about these extended notions of general,

medium and ideational meaning and so on, of

course, but they will still be able to act on the core

categories and properties like person and about).

In figure 4 a semantic/semiotic representation of

the resource is shown when fed to an RDFa

processor (http://rdfa.info/play).

Figure 4: Semiotic representation of sample text.

For the sake of simplicity, only one vocabulary is

instantiated in the Custer example. But structured

data formats like Microdata and RDFa (Lite)

actually allow several schemas to be mixed in an

HTML5 document. For instance, in a longer

document we might have included categories and

properties from the FOAF vocabulary (“Friend of a

friend”) to specify General Custer’s relations to

other historical persons or referred to the Dublin

Core vocabulary to add details about the document

as a digital resource. And if the text had contained

any hypertext links, we might have pointed to

classes in the Salt Rhetorical Ontology to specify

their communicative function.

3 BENEFITS AND WIDER

PERSPECTIVES

What, then, are the benefits of marking up semiotic

resources, complexes of meaningful signs, in

learning materials? It may be argued that gains may

be achieved in the following areas:

Firstly, the quality of information retrieval will

no doubt be enhanced. It will eventually be possible

ProgressiveSemioticEnrichment-DesigningLearningContentMetadataforWeb3.0

175

to formulate more precise queries in search engines.

Queries like "Find a concept map in English

classifiying birds of prey and containing pictures of

them" are no longer but a futuristic dream. (Google's

Knowledge Graph is evidence of this trend).

Secondly, linking content components across

web sites may be done in more explicit and

principled ways. A hyperlink may denote an

ideational relation between two domain entities

(person A is the brother of person B) or a textual, or

communicative, relationship (paragraph A is a

summary of section B). And if global identiers are

used, this is effectively tantamount to exposing, or

attaching, learning objects to the Web of Data using

linked data (see Heath and Bizer, 2011).

Thirdly, the reuse of embedded content

components is also likely to become easier as these

will be "unbundled" to a greater extent.

Fourthly, embedded semiotic annotation may

provide an additional affordance, which has to do

with the learning potential of learning objects, rather

than just their retrieval, linking or reuse. The reason

is that such metadata may be construed, and utilized,

as what we may call semiotic enzymes, hidden

elements enabling learning designs to be

(dynamically) altered in various ways to cater for

different user preferences, learning styles, rendering

devices, etc. (see Johnsen, 2012). As an example,

inline semiotic markup may be used to actively

support one or more of Mayer's principles of

multimedia learning (Mayer, 2009), in particular his

"principle of signalling", i.e. the guideline

advocating the use of conceptual structure markers

in learning materials. Embedded semiotic tags could

be used, say, as source data for dynamically creating

graphic organizers, spatial arrangements intended to

visually map the conceptual or narrative structure of

a piece of text and hence facilitate its comprehension

(see Stull and Mayer, 2007).

And since semiotic encoding can be done using

standards like Microdata and RDFa (Lite), reusable

style sheets, templates or widgets processing these

semiotic metadata can be developed and shared on a

global scale, especially for widely used categories

like events, persons and places. For example, a

college professor publishing a history textbook on

the web might link the document to an external

widget creating a visual timeline based on the events

mentioned in the text. Or a learner might download a

browser plug-in to flag all occurrences of concepts

of interest when surfing the web.

This affordance opens a whole set of

opportunities that could, for lack of a better term, be

called "Learning Content Design as a Service"

(LCDaaS). The idea itself is simple: content

providers like professors and teachers will only have

to concentrate on constructing structured materials

("basic content") but will be able to link these

materials to (dynamic) designs and in this way

create richer and more engaging learning resources.

And users will have a greater say in deciding what

design options they want for the materials they study

(visual support, interactivity, etc.).

REFERENCES

Allsopp, J., 2009. Developing with Web Standards. Pearson

Education.

Bezemer, J. & Kress, G., 2008. Writing in Multimodal

Texts: A Social Semiotic Account of Designs for

Learning. Written Communication 2008 25: 166

Gustafson, A., 2011. Adaptive Web Design: Crafting Rich

Experiences with Progressive Enhancement.

Chattanooga, Tennessee: Easy Readers

Heath, T. & Bizer, C., 2011. Linked Data: Evolving the

Web into a Global Data Space (1st edition). Synthesis

Lectures on the Semantic Web: Theory and Technology,

1:1, 1-136. Morgan & Claypool.

Johnsen, L., 2012. HTML5, Microdata and Schema.org:

Towards an Educational Social-Semantic Web for the

Rest of Us? In Markus, H. & Martins, M. J. & Cordeiro,

J. (eds.): Proceedings of the 4th International

Conference on Computer Supported Education.

SciTePress - Science and Technology Publications.

Kress, G. & Van Leeuven, T., 2001. Multimodal Discourse.

The modes and media of contemporary

Communication. London: Arnold.

Martin, J. R. & Rose, D., 2008. Genre Relations: Mapping

Culture. Equinox Publishing ltd.

Mayer, R. E., 2009. Multimedia Learning. Cambridge

University Press. 2nd edition.

Ronallo, J., 2012. HTML5 Microdata and Schema.org.

Code{4}Lib Journal. http://journal.code4lib.org/

articles/6400

Robertson, R. J., 2011. Technical standards in education,

Part 5: Take advantage of metadata. IBM

DeveloperWorks. http://www.ibm.com/developer

works/industry/library/ind-edustand5/index.html?

cmp=dw&cpb=dwind&ct=dwnew&cr=dwnen&ccy=zz

&csr=031011.

Stull, A. T. & Mayer, R. E., 2007. Learning by Doing

Versus Learning by Viewing: Three Experimental

Comparisons of Learner-Generated Versus Author-

Provided Graphic Organizers. Journal of Educational

Psychology, 2007, Vol. 99, No. 4, 808–820.

Van Leeuwen, T., 2005. Introducing Social Semiotics.

London: Routledge

Vorvilas, G. & Thanassis, K. & Ravanis, K., 2011. A

Genre-based Framework for Constructing Content for

Learning Objects. Asian Journal of Computer Science

and Information Technology 1:1.

CSEDU2013-5thInternationalConferenceonComputerSupportedEducation

176