INCIDENT AND PROBLEM MANAGEMENT

USING A SEMANTIC WIKI-ENABLED ITSM PLATFORM

Frank Kleiner, Andreas Abecker and Marco Mauritczat

FZI Forschungszentrum Informatik, Haid-und-Neu-Str. 10-14, Karlsruhe, Germany

disy Informationssysteme GmbH, Erbprinzenstr. 4-12, Karlsruhe, Germany

Infomotion GmbH, Ludwigstr. 33-37, Frankfurt am Main, Germany

Keywords:

IT service management, Semantic Wiki, Ontology-based information systems, Collaborative ITSM.

Abstract:

IT Service Management (ITSM) is concerned with providing IT services to customers. In order to improve

the provision of services, ITSM frameworks (e.g., ITIL) mandate the storage of all IT-relevant information in

a central Conﬁguration Management System (CMS). This paper describes our Semantic Incident and Problem

Analyzer, which builds on a Semantic Wiki-based Conﬁguration Management System. The Semantic Incident

and Problem Analyzer assists IT-support personnel in tracking down the causes of incidents and problems in

complex IT landscapes. It covers two use cases: (1) by analyzing the similarities between two or more system

conﬁgurations with problems, it suggests possible locations of the problem; (2) by analyzing changes over

time of a component with a problem, possible conﬁguration changes are reported which might have led to the

problem.

1 INTRODUCTION

The increasing complexity of IT landscapes, paired

with an increasing dependency on IT services from

almost all functions within an organization, has led

to new paradigms for managing information technol-

ogy. While for a long time, the technical aspects

of IT were the center of attention, the IT Service

Management (ITSM) approach centers around ser-

vices, while putting technical aspects into the back-

ground. Focusing on the customer helps to align

services provided by the IT department with an or-

ganization’s business goals (Addy, 2007; Clacy and

Jennings, 2007). There exist a number of frame-

works which give guidelines for the implementation

of ITSM-relevant functions and processes within organizations. Currently, the IT Infrastructure Library (ITIL) (Cartlidge et al., 2008; Cannon and

Wheeldon, 2007) is the most widely used general-

purpose IT Service Management framework. ITIL

consists of ﬁve volumes, which describe the lifecy-

cle of IT services. The topics addressed in this pa-

per make use of and build on the following ITIL pro-

cesses:

• Service Asset and Conﬁguration Management

(SACM) deals with maintaining a system for

managing all entities used for providing IT services, including their relations to and dependencies on each other. Entities are referred to as

Conﬁguration Items (CIs) and are stored in the

Conﬁguration Management System (CMS). Infor-

mation about CIs is stored in one or more Conﬁg-

uration Management Database(s) (CMDB), which

are part of the Conﬁguration Management Sys-

tem (Lacy and Macfarlane, 2007). A simple ex-

ample for CMDB entries goes as follows: a ser-

vice which is responsible for providing an organi-

zation’s Web site is provided by a Web Content

Management System running on an instance of

the Apache Web server on a certain computer. It

uses a MySQL database instance, which runs on

another computer. In order to communicate, the

computers are networked together by using a net-

work switch. The network switch is connected to

a router, which provides Internet access. As can

be seen, in order for the Web site to be available

(customer perspective), a number of services and

systems have to be running and reachable via the

network (technical perspective).

• Change Management is concerned with the ap-

plication of changes to an IT infrastructure (e.g.,

hardware, software, services) in a controlled man-

ner. The goal is to assess the potentials for prob-

lems with planned changes and to mitigate the


Kleiner F., Abecker A. and Mauritczat M..

INCIDENT AND PROBLEM MANAGEMENT USING A SEMANTIC WIKI-ENABLED ITSM PLATFORM.

DOI: 10.5220/0003751303630372

In Proceedings of the 4th International Conference on Agents and Artiﬁcial Intelligence (ICAART-2012), pages 363-372

ISBN: 978-989-8425-95-9

© 2012 SCITEPRESS (Science and Technology Publications, Lda.)

associated risks. Change Management depends

heavily on Service Asset and Conﬁguration Man-

agement for understanding dependencies between

IT components when planning changes (Lacy and

Macfarlane, 2007). Continuing the example, this means that before upgrading the MySQL database to a new version, it has to be ensured that the upgrade does not break the database access of the Web Content Management System.

• Incident Management is the process within the

ITIL framework which is responsible for ﬁxing

the causes of service interruptions. An incident in

ITIL is deﬁned as an “unplanned interruption to

an IT service or reduction in the quality of an IT

service” (Cannon and Wheeldon, 2007). The fo-

cus of the Incident Management process is clearly

on restoring the service as soon as possible, with-

out necessarily being concerned with the under-

lying cause of the incident (Cannon and Wheel-

don, 2007). For example, if the Web site stops

working, a solution within Incident Management

would be to restart the Web server or to rebuild a

broken database.

• Problem Management is concerned with ﬁnding

the underlying causes of service failures. Even if a

service was successfully restored within the Inci-

dent Management process, it has to be made sure

that the error which caused the outage will not re-

occur in the future (Cannon and Wheeldon, 2007).

For example, the cause of the database crash could

be a relatively rare error in the underlying hard-

ware, which has to be detected and which leads to

replacing the problematic hardware component.

In (Kleiner and Abecker, 2009; Kleiner et al.,

2009a), we motivated and described a collaborative,

semantics-enabled ITSM software support based on

the Semantic MediaWiki. In the works published so

far, we focused on the Service Asset and Configuration Management process above. The added value
of that ITSM software infrastructure mainly comes

from the use of a Semantic Wiki as a “single point

of information” that collects and integrates manifold

kinds of information and knowledge about an orga-

nization’s hardware and software environment. In

this paper, we aim at drawing further beneﬁt from

the collected information in order to support addi-

tional ITSM processes, namely Incident Management

and Problem Management; by examining the stored

knowledge about conﬁguration items, their structure

and relationships, we want to analyse the causes for

occurring incidents or documented fault situations. If

such an analysis tool works well, the created new

knowledge about problem causes may also be ex-

ploited for Change Management. In this paper, we

present a prototypical tool with the desired function-

ality, called Semantic Incident and Problem Analyzer

(SemPA).

This paper is organized as follows: ﬁrst, a detailed

problem description is given in section 2, followed by

a description of our design in section 3. The imple-

mentation of the Semantic Incident and Problem An-

alyzer is described in section 4. In section 5, a con-

clusion is given, followed by a discussion of related

work and an outlook on future work.

2 PROBLEM DESCRIPTION

2.1 Application Environment

The solution presented in this paper addresses a num-

ber of real-life problems encountered in the daily op-

erations within the IT department of an SME orga-

nization with about 150 full-time employees as well

as 250 part-time employees. Because the organiza-

tion’s core business is mainly IT-centric, the knowl-

edge of the majority of its employees can be rated

high or very high. The IT department, which con-

sists of four full-time and eight part-time employees,

provides IT services mostly for in-house customers.

Services include the design and maintenance of the

network infrastructure (ethernet, wireless networks,

telephone, VoIP), email services, Web services, and

database services. Furthermore, it includes the cov-

erage of IT equipment through its life-cycle, e.g., the

acquisition, testing, commissioning, maintenance and

decommissioning of servers, desktop computers, and

notebooks. While key infrastructure components and

services are maintained by the IT department, em-

ployees are free to install software required for their

work on their workstations, as well as to set up ser-

vices for testing purposes within the internal network.

2.2 Incident Classes

After analyzing the incidents reported to our IT de-

partment’s help desk system and their underlying

causes, it was observed that there are three main

classes of incidents:

• Class 1: Multiple incidents with a common

cause. Two or more incidents occur, which are

related to each other by a common cause. An ex-

ample for this kind of incident is the failure of a

network switch, which leads to a number of users

having network problems.

ICAART 2012 - International Conference on Agents and Artificial Intelligence


• Class 2: Single incident evolved over time. An

incident occurs in which a single computer develops
a problem which can be traced back to a change

which was applied to the computer by the user or

the IT department. An example is an upgrade to a

Web browser, which crashes when visiting a cer-

tain Web page, which loaded perfectly ﬁne before

the upgrade.

• Class 3: Stand-alone incident. An incident oc-

curs on a single system without a previous change

to any component. Examples are mostly found

when looking at hardware failures (e.g., a failed

harddisk or main-board).

Because incidents generally have an underlying

problem, which has to be found and ﬁxed in order

to prevent an error from reoccurring, tracking down

the cause of an incident means ﬁnding the incident’s

underlying problem. Depending on the class of the

incident, different techniques have to be applied and

different knowledge of the support personnel is re-

quired. In order to ﬁnd the cause of a class 1 incident,

a detailed knowledge of similarities between IT com-

ponents has to be possessed by the person assigned

to ﬁnd the problem. If this is not the case, informa-

tion has to be gathered from the CMDB, which can be

cumbersome if done manually. For class 2 incidents,

in some cases the user of a system can give valuable

hints by narrowing down the time interval when the

undesired behavior occurred for the ﬁrst time. Class 3

incidents are usually relatively easy to detect, because

they are limited to a single system.

In addition to the different classes, incidents with

the same class of underlying problem can occur on

different systems independently from each other and

distributed over time. This means that in order to

speed up locating problems, it is necessary to docu-

ment ﬁxed problems and make them searchable for

further use.

2.3 Requirements Analysis

After studying the different classes of incidents, the

requirements for a tool which helps in tracking down

the underlying problems were formulated. Because

class 3 incidents are on the one hand restricted to a

single system, which makes locating them relatively

easy, and on the other hand not detectable via com-

paring CIs, class 3 incidents are not pursued further

and partly left to mechanisms used for detecting fail-

ing equipment.

The requirements for the Semantic Incident and

Problem Analyzer are as follows:

• Ability to ﬁnd the cause of class 1 incidents by

comparing a given list of IT components for sim-

ilarities. E.g., to detect a failing network switch

from incidents reported by independent users in-

dicating a problem.

• Ability to ﬁnd the cause of class 2 incidents by

comparing conﬁgurations in time. For example, to

detect the cause of an incident report which states

that a program was running ﬁne two days ago, was

not used yesterday and does not start today.

• Ability to ﬁnd problems which were ﬁxed on

the same or other computers in the past, e.g., a

browser update caused problems with a browser

plug-in, which happened again on another com-

puter, with another browser version and another

plug-in.

Before presenting the implementation of the Se-

mantic Incident and Problem Analyzer, an overview

of the Semantic Wiki-based IT Service Management

platform and of the ontology of the Semantic Incident

and Problem Analyzer domain is given.

3 DESIGN

3.1 Wikis and Semantic Wikis

Wikis are Web sites which enable visitors to con-

tribute to their content by editing the content from

within their Web browsers. The Wikipedia encyclopedia is a shining example for the possibilities

of Wikis. In the corporate context, Wikis are often

used for knowledge management for projects, as

portals, and as tools for project management. More

information about Wikis can be found in (Barrett,

2008; Ebersbach et al., 2007).

Semantic Wikis extend Wikis by adding fea-

tures to express semantic statements. These explicit

semantic statements allow a better processing of

Wiki articles and their relations by a computer. In

(Schaffert et al., 2008), an overview of characteristics

of Semantic Wikis as well as some examples for

implementations are given. The main characteristics

of a Semantic Wiki are (Schaffert et al., 2008):

• Semantic Wikis extend non-semantic Wikis by

adding means to express structured data, while

keeping the Wiki’s ﬂexibility and collaborative

working style. In order to achieve this goal,

“meta-data in the form of semantic annotations of

the Wiki pages themselves and of the link rela-

tions between Wiki pages” (Schaffert et al., 2008)

is supported.

Footnote (Wikipedia): http://www.wikipedia.org/


• An ontology (Staab and Studer, 2009) forms the

underlying knowledge model of the Wiki. Added

and changed annotations in the Semantic Wiki

are reflected in the ontology. Usually, ontologies

used by Semantic Wikis are represented in RDF

Schema or OWL (Allemang and Hendler, 2008),

which simpliﬁes the exchange of data with exter-

nal applications.

• Additional implicit information can be derived

from information present in the Wiki by using deductive reasoning (Krötzsch et al., 2007).

• Semantic Wikis provide simple means for anno-

tating links, articles and other content.

• Attributes and relations can be used in queries

which extend the possibilities of queries from

simple keyword searches to more complex ones.

There are a number of different implementations of Semantic Wikis. The work presented in this paper builds on top of the Semantic MediaWiki software (Krötzsch et al., 2007; Vrandečić and Krötzsch, 2009). Semantic MediaWiki is an extension for the

popular MediaWiki software which is the technical

basis for Wikipedia and numerous other Wiki sites.

MediaWiki and Semantic MediaWiki were selected

because of their reliability, extensibility, and the qual-

ity of their documentation.

3.2 Semantic Wiki-based IT Service

Management Platform

When looking at the IT landscape of organizations,

it can be seen that there exists a multitude of differ-

ent hardware components (e.g., desktop computers,

servers, notebooks, network switches, routers, print-

ers), software components (e.g., operating systems,

Web server software, database server software) and

services (e.g., an organization’s Web site, email ser-

vice). In addition, there exist dependencies between

these CIs, e.g., an instance of an operating system is

running on a certain physical server, which is con-

nected to a certain network switch and is providing

a number of speciﬁc services. As shown in (Kleiner

and Abecker, 2009; Kleiner et al., 2009b), a Seman-

tic Wiki can be used as a Conﬁguration Management

System for managing information about CIs as well as

the relations between the CIs. In our scenario, each CI

is described in a Wiki article which means that each

computer and other hardware component has an as-

sociated Wiki page which lists its properties. In the

case of a computer, properties range from its name to

its serial number. Relations to other CIs are expressed as relations between Wiki articles, e.g., the manufacturer of a computer has its own Wiki page, which is linked to by all relevant CIs and which includes information about the manufacturer relevant for delivering IT services. In order to simplify the editing of structured information, the Semantic Forms extension is used, which provides a forms-based interface that abstracts from the underlying semantic relations.

Footnote (implementations of Semantic Wikis): http://semanticweb.org/wiki/Semantic_wiki_projects

3.2.1 Architecture

Figure 1 gives an overview of the architecture of

the Semantic Wiki-based IT Service Management

platform. The core of the system is the Seman-

tic Wiki-based Conﬁguration Management System,

where all information relevant for providing IT ser-

vices is stored. This ranges from formal statements

about IT components in the form of attributes and re-

lations (e.g., a computer has a certain kind of graphics

adapter and is connected to a network switch) to free

text used to describe, for example, work processes

and best practices (Kleiner and Abecker, 2009). In

order to provide additional functionalities, extensions

can be added to the IT Service Management platform.

Figure 1 shows three extensions which are used to in-

teract with hardware and software components that

are part of our organization’s IT landscape. Further-

more, the Semantic Incident and Problem Analyzer is

shown, which is described in detail in this paper.

3.2.2 Conﬁguration Gathering

The ﬁrst extension is used for Conﬁguration Gath-

ering, i.e., the automatic acquisition of information

from computers and other IT components over the

network. At the current implementation stage, in-

formation can be read from Windows computers via

the Windows Management Instrumentation (WMI)

(Jones, 2007) infrastructure. This makes it possible to automatically read information about hardware components of a computer (e.g., its graphics adapter, CPU,

RAM, harddisk, network adapter), installed software,

and conﬁgurations (e.g., the computer’s network ad-

dresses). Another supported protocol is the Simple

Network Management Protocol (SNMP) (Case et al.,

1990) which is used to communicate with network

hardware (e.g., network switches) and other compo-

nents (e.g., printers) (Kleiner et al., 2009b).

Footnote (Semantic Forms): http://www.mediawiki.org/wiki/Extension:Semantic_Forms

Figure 1: Overview of the Semantic Wiki-based IT Service Management Platform.

3.2.3 Intrusion Detection

A part of ensuring the security of an organization's computer networks is the detection of suspicious activities by using a network intrusion detection system. These systems scan network traffic for patterns which indicate malicious activities (Northcutt

and Novak, 2002). The Intrusion Detection exten-

sion interacts with the Open Source network intrusion

detection tool Snort (Roesch, 1999) in order to inte-

grate intrusion detection events into the IT Service

Management Wiki. By using the knowledge stored

in the Semantic Wiki-based Conﬁguration Manage-

ment System, events from the intrusion detection sys-

tem can be better classiﬁed as relevant or not relevant

than without this knowledge. Rules are used to sort

out events which do not represent a threat to the attacked systems. At the moment, the intrusion detection extension is under development; an evaluation of the extension will be performed within the next six months.

3.2.4 Systems Monitoring

Another aspect of ensuring the delivery of services is

the monitoring of service availability. The Systems

Monitoring extension interacts with Nagios (Barth,

2005), an Open Source systems monitoring tool. Na-

gios tests services for availability by sending requests

and analyzing the answers. Our systems monitor-

ing extension simpliﬁes the administration of Nagios

by creating Nagios conﬁguration ﬁles from informa-

tion stored in the Semantic Wiki-based Conﬁgura-

tion Management System. Furthermore, information

about failed services is presented within the Wiki in-

terface (Kleiner et al., 2009a).
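The generation of Nagios configuration files from CI data can be illustrated with a minimal sketch. It is not the extension's actual code: the `ConfigurationItem` structure, the host name and address, and the convention of deriving a `check_<service>` command name are assumptions made for illustration, and the templates `generic-host` and `generic-service` are assumed to be defined elsewhere in the Nagios installation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ConfigurationItem:
    """Hypothetical, simplified view of a CI page exported from the Wiki."""
    name: str
    address: str
    services: list[str] = field(default_factory=list)


def render_nagios_objects(items: list[ConfigurationItem]) -> str:
    """Render one host definition per CI and one service definition per service."""
    blocks = []
    for ci in items:
        blocks.append(
            "define host {\n"
            "    use        generic-host\n"  # assumed template name
            f"    host_name  {ci.name}\n"
            f"    address    {ci.address}\n"
            "}"
        )
        for service in ci.services:
            blocks.append(
                "define service {\n"
                "    use                  generic-service\n"  # assumed template name
                f"    host_name            {ci.name}\n"
                f"    service_description  {service}\n"
                # Assumed naming convention for the check command.
                f"    check_command        check_{service.lower()}\n"
                "}"
            )
    return "\n\n".join(blocks) + "\n"


# Hypothetical example CI: a Web server providing an HTTP service.
config_text = render_nagios_objects(
    [ConfigurationItem("web01", "192.0.2.10", ["HTTP"])]
)
```

Keeping the CI data in the Wiki as the single source and regenerating the monitoring configuration from it avoids maintaining the same host list in two places.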

3.2.5 Semantic Incident and Problem Analyzer

The Semantic Incident and Problem Analyzer, which

is presented in detail in this paper, uses information

about CIs which is stored in the Semantic Wiki-based

Conﬁguration Management System. IT support per-

sonnel is able to query the Semantic Incident and

Problem Analyzer when suspecting class 1 or class 2

incidents. The Semantic Incident and Problem An-

alyzer is embedded into the Wiki interface, which

means that support personnel uses the same interface

as when documenting changes or looking up informa-

tion about CIs.

3.3 Ontology

This section describes the structure of the ontology, which was developed as the data model for the

Semantic Incident and Problem Analyzer. In the text,

class names are printed in bold, while relations are

printed in italic. One of the main classes of the ontol-

ogy is Computer System. It represents any computer

in operation in an organization, either physical or vir-

tual. Subclasses are Desktop Computer, Notebook

Computer, Server Computer, and Virtual Com-

puter. Computer systems are composed of hard-

ware components, which are modeled in the Hard-

ware Component class. The relation has Hardware

is used for stating which hardware components com-

prise which computer system. Examples for hardware

components are CPU, Graphics Card, Main Board,

RAM, Harddisk, and Network Adapter. The Net-


work Adapter class is part of the domain of the con-

nected to Network Component property, which has

the class Network Equipment as its range. Another

part of the domain of the property connected to Net-

work Component is the class Network Equipment

itself, which enables the modeling of network equip-

ment being connected to other network equipment,

e.g., network switches connected to each other or a

router. Subclasses of Network Equipment are Fire-

wall, Network Switch, Router, and Wireless Ac-

cess Point. Computer systems and network equip-

ment are located at a certain location, which is ex-

pressed by the located in relation. Locations are rep-

resented by the Location class, which has Building,

Room, and Server Rack as subclasses. Server racks

are located in rooms, which is expressed by located

in Room, while rooms are located in buildings, which

is expressed by the relation located in Building. Soft-

ware is modeled in the Software class, with Applica-

tion Software and Operating System as subclasses.

The relation has installed Software between the Computer System and the Application Software class indicates which software is installed on a computer.

The property has Operating System has Computer

System as domain and the class Operating System

as its range. The relation was introduced in addi-

tion to the has installed Software property to address

the special status of operating systems on computers.

Services are modeled in the Service class. Computer

systems provide services, which is expressed by the

use of the provides Service relation. Computer sys-

tems and services have one or more owners, which is

stated by the has Owner relation. The Standard class

is used to model all kinds of standards, e.g., RAM

standards, or network standards. The associated re-

lation is has Standard, with the classes Hardware

Component, Service, Network Equipment, Com-

puter System, and Software as domain. The man-

ufacturer of hardware and software is expressed in

the class Manufacturer and the relation has Manu-

facturer. Incidents and problems are modeled in the

classes Incident and Problem. Incidents can be re-

lated to other incidents, or to problems, which is ex-

pressed by the related to Incident relation. Accord-

ingly, problems can be related to other problems or

incidents. Problems and their solutions are modeled

with the has Solution relation, with the class Problem

as domain, and the class Solution as range.
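To make the domain and range statements above more concrete, the following sketch encodes a small part of the ontology as plain Python data and checks whether a statement is admissible. It is an illustration only: the dictionary layout and the function names are assumptions, treating the listed hardware components (e.g., Graphics Card) as subclasses of Hardware Component is an interpretation of the text, and the actual ontology is maintained inside Semantic MediaWiki rather than in code.

```python
# Subclass -> superclass, following the class hierarchy described in the text.
# Treating Graphics Card and Network Adapter as subclasses is an assumption.
SUBCLASS_OF = {
    "Desktop Computer": "Computer System",
    "Notebook Computer": "Computer System",
    "Server Computer": "Computer System",
    "Virtual Computer": "Computer System",
    "Graphics Card": "Hardware Component",
    "Network Adapter": "Hardware Component",
    "Network Switch": "Network Equipment",
    "Router": "Network Equipment",
    "Application Software": "Software",
    "Operating System": "Software",
}

# Relation -> (domain classes, range classes), for a subset of the relations.
RELATIONS = {
    "has Hardware": ({"Computer System"}, {"Hardware Component"}),
    "connected to Network Component": (
        {"Network Adapter", "Network Equipment"},
        {"Network Equipment"},
    ),
    "has installed Software": ({"Computer System"}, {"Application Software"}),
    "has Operating System": ({"Computer System"}, {"Operating System"}),
    "has Solution": ({"Problem"}, {"Solution"}),
}


def ancestors(cls):
    """Return the class itself followed by all of its superclasses."""
    result = [cls]
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result


def is_valid(subject_class, relation, object_class):
    """Check a statement against the domain and range of its relation."""
    domain, range_ = RELATIONS[relation]
    return (not domain.isdisjoint(ancestors(subject_class))
            and not range_.isdisjoint(ancestors(object_class)))
```

With these definitions, a server may have a graphics card as hardware, whereas an operating system is rejected as the object of has installed Software, which mirrors the separate has Operating System relation described above.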

4 IMPLEMENTATION

The implementation of the Semantic Incident and

Problem Analyzer is built on top of Semantic MediaWiki's extension architecture. A Special Page extension, realized in PHP, provides the foundation for the

process of ﬁnding similarities between conﬁguration

items. Two different operation modes have been im-

plemented to ﬁnd problems of conﬁguration items:

• The ﬁrst idea relies on the assumption that once

two or more different Conﬁguration Items show

an identical problem, the cause is possibly iden-

tical as well. A comparison of equalities of all

problem-affected Conﬁguration Items will lead to

a set of properties which all affected Conﬁgura-

tion Items have in common. While more than

one Conﬁguration Item is needed for this opera-

tion mode of the Semantic Incident and Problem

Analyzer to work, it is of special interest for Prob-

lem Management.

• The second approach to ﬁnd possible causes for a

specific problem can be used especially in cases

where only a single Conﬁguration Item shows a

certain problem. By comparing a single Con-

ﬁguration Item’s properly working conﬁguration

with the conﬁguration after a problem emerged,

changes which may have caused the malfunction

should become obvious. This procedure is of

particular interest for Incident Management, as a

single malfunctioning Conﬁguration Item is sufﬁ-

cient to use this structured approach for tracking

down the incident’s cause.

Independent of which approach is used, the set of

identiﬁed properties can be used as a structured entry

point for further manual troubleshooting. The imple-

mentation of these two ideas is mainly identical:

1. Semantic relations and individuals are recursively

retrieved from the Semantic MediaWiki in order

to build a tree-like data structure of every origi-

nating CI and all succeeding CIs. Whether a CI

succeeds another CI is deﬁned through semantic

properties inside the Semantic MediaWiki. The

nodes of such a tree structure hold the name of an

individual and the class it is an instance of. The

connecting edges between the nodes represent a

property between two individuals, thus forming a

set of semantic triples between a node and all suc-

ceeding nodes, as provided by the Semantic Me-

diaWiki’s annotated links. Care was taken to de-

tect cyclic relations between individuals, in order

to prevent the generation of inﬁnite trees.

2. Each tree structure is successively compared with

an initially empty compare tree in a depth-first

search manner. For each node it is checked

whether the triple it forms together with its suc-

ceeding node and the property-edge is already

present at the actual level in the compare tree. If


the triple is found, the compare tree increases a

counter for the object node of that triple, in or-

der to represent a match with a previously added

node. In case the triple is not yet present in the

compare tree, the algorithm clones the triple and

adds it to the compare tree. At this point it is inspected whether the preceding node already has a relation to an individual of the same class with the same property as the newly added triple. In Figure 2 this is the case when adding the node marked

with “B”, because node “A” was added previously.

This step is necessary to detect similarities be-

tween the trees at a deeper level. Even though

the recently added node (B) might not be equal to

any already present nodes at that level, it is pos-

sible that another individual of the same class has

a similar subtree (A). Therefore the subtree of the

recently added node (B) would be compared to all

similar nodes' subtrees (A) as well. An example

for similarities in subtrees are two graphics cards,

which have the same chip vendor, but use a dif-

ferent chip set and are manufactured by different

graphics card manufacturers. While in the pre-

sented example, the tree depth is two, trees with

a greater depth occur in more complex scenarios.

For example, when comparing hosts in a network,

including connections between network switches

and routers, tree depths of seven were found in

the test environment. In more complex environ-

ments, in some cases it might make sense to limit

the tree depth in order to limit the processing time

when using the analyzer.

3. The compare tree contains the union of all triples of all trees which have been compared to each other. Correspondences between triples can easily be identified through the value of the matches counter. During the third and final step, the compare tree is visualized using the MediaWiki extension GraphViz (http://www.mediawiki.org/wiki/Extension:GraphViz). DOT (http://www.graphviz.org/doc/info/lang.html) code is generated and passed to GraphViz, which returns a bitmap file containing the tree visualization. Which nodes are displayed can be parameterized according to their matches counter. After a comparison of n different configuration items, one would first of all be interested in nodes which have a matches counter of n-1, because these nodes were part of all n trees. Potentially uninteresting nodes can thus easily be hidden, which improves the readability of the graphical tree representation shown in Figure 2. The more matches a node has, the more strongly it is colored, depending on the selected color scheme. For improved usability, nodes can be clicked to open their Wiki page.
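The first two steps above can be sketched as follows. This is a simplified illustration in Python, not the PHP implementation of the extension: the `Node` structure, the function names, and the in-memory `triples` and `classes` dictionaries are assumptions, the class assigned to MDT 1024 is a guess, and the handling of same-class siblings is reduced to the minimum needed to reproduce the graphics card example.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    cls: str
    matches: int = 0
    # Children are keyed by (property, object name), i.e. one key per triple.
    children: dict = field(default_factory=dict)


def build_tree(name, triples, classes, path=()):
    """Step 1: recursively expand a CI into a tree and cut cyclic relations."""
    node = Node(name, classes[name])
    for prop, obj in triples.get(name, []):
        if obj == name or obj in path:
            continue  # already on the current path: do not expand again
        node.children[(prop, obj)] = build_tree(obj, triples, classes, path + (name,))
    return node


def mark_shared(a, b):
    """Count triples that two different individuals of one class share."""
    for key in a.children.keys() & b.children.keys():
        a.children[key].matches += 1
        b.children[key].matches += 1
        mark_shared(a.children[key], b.children[key])


def merge(compare, tree):
    """Step 2: merge one CI tree into the compare tree, depth first."""
    for key, child in tree.children.items():
        if key in compare.children:
            target = compare.children[key]
            target.matches += 1  # triple already present at this level
            merge(target, child)
        else:
            clone = copy.deepcopy(child)
            # Compare the new node with siblings of the same class and property.
            for (prop, _), sibling in compare.children.items():
                if prop == key[0] and sibling.cls == clone.cls:
                    mark_shared(sibling, clone)
            compare.children[key] = clone


# Example data modeled on Figure 2; PC1 and PC2 are hypothetical CI names.
triples = {
    "PC1": [("has Hardware", "Asus ATI Radeon HD 2400 XT"), ("has Hardware", "MDT 1024")],
    "PC2": [("has Hardware", "Asus Pro Gamer"), ("has Hardware", "MDT 1024")],
    "Asus ATI Radeon HD 2400 XT": [("has Manufacturer", "Asus")],
    "Asus Pro Gamer": [("has Manufacturer", "Asus")],
}
classes = {
    "PC1": "Desktop Computer",
    "PC2": "Desktop Computer",
    "Asus ATI Radeon HD 2400 XT": "Graphics Card",
    "Asus Pro Gamer": "Graphics Card",
    "MDT 1024": "RAM",  # assumed class
    "Asus": "Manufacturer",
}

compare = Node("compare tree", "root")
for ci in ("PC1", "PC2"):
    merge(compare, build_tree(ci, triples, classes))
```

In this example, the shared MDT 1024 node ends up with a matches counter of 1, the two graphics card nodes stay at 0, and both Asus nodes below them are raised to 1 by the sibling comparison.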

Figure 2 shows a screenshot of a comparison between two different configuration items. Through the matches counter of 1 inside the node MDT 1024 (and all its succeeding nodes), the visualized tree indicates that the two compared configuration items are both connected with this configuration item through the same property. The two nodes named Asus, each with a matches counter of 1, carry the information that both configuration items have the property has hardware connecting them with an instance of Graphics card. While the instances of Graphics card are different from each other, each instance in turn has a semantic relation to the same instance of the class Manufacturer, namely Asus. One could therefore read the matches counter of 1 in the Asus node as “The compared configuration items have different graphics cards from the same manufacturer.” In order not to lose the information about the actual instances of the class Graphics card, both unique instances Asus ATI Radeon HD 2400 XT and Asus Pro Gamer are present in the tree.
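The generation of DOT code for such a tree, including the hiding of nodes below a matches threshold, might look roughly like the sketch below. The flat node list, the node identifiers, and the choice of the Graphviz color scheme `blues9` are assumptions made for illustration. Ancestors of a displayed node are kept so that no edge points to a hidden parent, which is one possible design choice and not necessarily the behavior of the extension.

```python
def to_dot(nodes, min_matches=0):
    """Render a compare tree as DOT.

    nodes: list of dicts with the keys id, parent, prop, label, matches
    (a hypothetical flat export of the compare tree; the root has parent None).
    """
    by_id = {n["id"]: n for n in nodes}

    # A node is shown if it reaches the threshold or lies above such a node.
    visible = set()
    for n in nodes:
        if n["matches"] >= min_matches:
            current = n
            while current is not None and current["id"] not in visible:
                visible.add(current["id"])
                current = by_id.get(current["parent"])

    top = max((n["matches"] for n in nodes), default=0) or 1
    lines = ["digraph compare_tree {", "  node [style=filled, colorscheme=blues9];"]
    for n in nodes:
        if n["id"] not in visible:
            continue
        shade = 1 + round(6 * n["matches"] / top)  # more matches, darker fill
        lines.append(f'  {n["id"]} [label="{n["label"]} ({n["matches"]})", fillcolor={shade}];')
        if n["parent"] in visible:
            lines.append(f'  {n["parent"]} -> {n["id"]} [label="{n["prop"]}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical export of a small compare tree; "Some Disk" is an invented CI.
example_nodes = [
    {"id": "n0", "parent": None, "prop": None, "label": "compare tree", "matches": 0},
    {"id": "n1", "parent": "n0", "prop": "has Hardware", "label": "MDT 1024", "matches": 1},
    {"id": "n2", "parent": "n0", "prop": "has Hardware", "label": "Asus Pro Gamer", "matches": 0},
    {"id": "n3", "parent": "n2", "prop": "has Manufacturer", "label": "Asus", "matches": 1},
    {"id": "n4", "parent": "n0", "prop": "has Hardware", "label": "Some Disk", "matches": 0},
]
dot_source = to_dot(example_nodes, min_matches=1)
```

The resulting text can be handed to any Graphviz renderer. Unique node identifiers are used instead of names because the same label, such as Asus, can occur more than once in the tree.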

In case a single conﬁguration item was compared

with itself at a different point of time, the visualization

needs to emphasize those nodes with zero matches.

While the properties of the conﬁguration which did

not change will have exactly one match, zero matches

indicate that either before or after the problem oc-

curred this property of the conﬁguration was differ-

ent.
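The comparison of a single configuration item with itself at two points in time can be reduced to set operations on its properties, as the following sketch shows. How the two snapshots are obtained is not covered here. The function name, the representation of a snapshot as a set of (property, value) pairs, and the software names and versions are assumptions made for illustration.

```python
def diff_snapshots(before, after):
    """Compare two snapshots of one CI, each a set of (property, value) pairs."""
    return {
        "unchanged": before & after,  # present in both snapshots: one match
        "removed": before - after,    # only before the problem: zero matches
        "added": after - before,      # only after the problem: zero matches
    }


# Hypothetical snapshots of a workstation before and after a browser upgrade.
before = {
    ("has Operating System", "Windows 7"),
    ("has installed Software", "Firefox 3.6"),
}
after = {
    ("has Operating System", "Windows 7"),
    ("has installed Software", "Firefox 4.0"),
}
changes = diff_snapshots(before, after)
```

The removed and added entries correspond to the nodes with zero matches that the visualization needs to emphasize, and they form the starting point for manual troubleshooting.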

5 CONCLUDING REMARKS

5.1 Summary and Related Work

This paper presented our Semantic Incident and Prob-

lem Analyzer which assists IT-support personnel in

tracking down problems in complex IT landscapes.

Built on top of Semantic MediaWiki, it makes use

of the information about Conﬁguration Items stored

in the Wiki. The application scope of the Semantic

Incident and Problem Analyzer can be divided into

two main scenarios: first, problems which occur in

roughly the same time period on different hardware

components and can be narrowed down to a common

cause; second, problems which occur on the same

hardware component and can be tracked down by

comparing the conﬁgurations of the component at dif-

ferent times. We have designed an ontology for prob-

lem analysis, which will be extended towards a com-

prehensive ontology for IT landscapes and IT Service

Management support.


Figure 2: Screenshot of the Semantic Incident and Problem Analyzer.

With regard to using a Semantic Wiki in the IT

Service Management domain, the work presented in

(Alquier et al., 2009) describes a Semantic Wiki-

based Knowledge Management System, which is

used for Asset and Configuration Management, documentation, as a self-help system, and for system outage tracking. As far as documented in that paper, the authors follow the same line of thinking as we did in our prior work (Kleiner and Abecker, 2009), i.e., they support the SACM process of ITIL, but they do not offer any help with Incident or Problem Management, which is the focus of our SemPA extension.

In a recent communication, (Lane, 2010b) also sup-

ported the idea of an SMW-based ITSM infrastruc-

ture; in (Lane, 2010a), an SMW-integrated ticketing

system is discussed which could be a useful comple-

ment for our approach. However, for the time be-

ing, our organization’s requirements are satisﬁed by

the existing OTRS ticketing system. In the future, the

beneﬁts of a deeper semantic integration of the ticket-

ing system may be examined.

Regarding related work on the application task, we are, of course, well aware that diagnosing problems in complex technical systems has been thoroughly investigated in the area of knowledge-based systems (KBS) for many years (see, e.g., (Darlington, 1999)). However, widening the knowledge acquisition bottleneck of KBSs is a long-standing and not yet satisfactorily solved problem. For this reason, many technology providers and researchers in the KBS area have, for a number of years, also been producing intelligent advisory and assistant systems in order to find the sweet spot in the trade-off between power and maintainability of software systems. In this sense, the Semantic Incident and Problem Analyzer represents a light-weight approach which entirely avoids the problem of upfront knowledge engineering and simply exploits the already existing IT Service Management knowledge base to locate potential problem causes, thus helping the human problem-solver to analyze the


situation and navigate through the problem space. In this way, the system can be started up with almost no specific additional effort. Once the system has been in place for some time, its capabilities may be enriched gradually, for instance, by fault diagnosis heuristics derived from prior problem cases.

5.2 Implementation Status and Next Steps

The Semantic Incident and Problem Analyzer is currently in productive use and is being evaluated in a production environment with about 500 computers. An early qualitative evaluation, which analyzed a few problems previously encountered in productive use, has shown that the Semantic Incident and Problem Analyzer can reduce the time needed for tracking down problems. The amount of time saved depends on how well the person working on the problem knows the details of the IT landscape, with inexperienced personnel benefiting more from the Problem Analyzer than experienced employees.

However, a period of several months of operational use will probably be required before a reasonable critical mass of usage experience is available for a valid quantitative evaluation. At the moment, a handful of full-time and part-time members of our IT-support personnel are using the whole Semantic Wiki-based IT Service Management solution (sketched above in Section 3) for their daily work. This already leads to a more collaborative work style and allows for more agile and light-weight IT Service Management processes than are typically called for in IT Service Management endeavors. The next steps of the system roll-out will comprise opening the platform to technologically knowledgeable end users, thus enabling some IT Service Management self-service offerings. This must be accompanied by appropriate considerations about usage incentives, such that the self-service IT Service Management portal offers win-win situations for end users and IT support personnel alike. In the long term, we hope to study a kind of collaborative knowledge creation and exchange through the IT Service Management Wiki, especially regarding observed system failures and problematic configurations. The experience collected in this way about observed problems will be the basis for an increasing usefulness of the Semantic Incident and Problem Analyzer.

Seen from the Artificial Intelligence point of view, the presented solution in its current status is certainly a “lightweight semantics” solution which does not profit much from deep and sophisticated ontologies and automated inferences; instead, it is a first, pragmatic and practically useful solution which puts a Semantic Wiki into a daily-business context, addressing a widespread application problem. Before turning to more sophisticated techniques, we first implemented all necessary interfaces and connectors to integrate the system into a real-world environment in its full complexity.

In this application context, our first benefits are based on the simple and easy-to-use features for collaboratively creating and editing a knowledge base in a browser-based style; in the case of a Semantic Wiki, this knowledge base contains both structured (i.e., data) and unstructured (i.e., text and multimedia documents) information. The second benefit is realized through the “integrative power” of the Semantic Wiki, which is an extremely flexible and open tool with an expressive data model that allows practically all kinds of ITSM-relevant data, information, and knowledge to be integrated in the appropriate way. To this end, a first reusable result of our work is the ITSM ontology, which combines all ITSM-related aspects of hardware, software, infrastructure and organization and which can serve as a kind of reference ontology for semantic ITSM / CMDB applications.

In the presented application for finding the possible causes of system problems, simple tree-comparison algorithms have been applied to the formally represented CMDB data. In a traditional AI approach, one would probably have tackled such problems with a rule-based or a case-based expert system. While the former can only be built when a reasonable amount of expert knowledge is available and formalized (which is not always the case, and is always expensive), the latter is also applicable in cases where the problem-analysis knowledge grows only slowly with experience and may change often over time. This is exactly the case for our application scenario, which can very easily be sorted into the case-based reasoning paradigm, but with a very lightweight knowledge-representation approach that allows for simple and efficient analysis algorithms. In later extensions of our system, we may also investigate whether complex similarity measures help to assess the usefulness of slightly different stored information. But this only makes sense once longer user experience has shown which kinds of problem causes can be found by the system and which ones cannot.

One further system extension is probably closer at hand and more obviously useful: once longer use of the system has identified a number of problematic system configurations, these configurations (or abstractions of them) can be converted into forbidden configuration patterns which could proactively be tested for when new systems are configured or new software is installed. In this manner, known problems can be avoided before they are repeated. By analysing system change logs over time, we might even be able to identify forbidden configuration paths (sequences of actions) that could proactively be tested for by Complex-Event-Processing machinery.
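To make the idea of forbidden configuration patterns more concrete, the following sketch treats a pattern as a set of (property, value) pairs that must not occur together and checks a planned configuration against a list of such patterns. This is only one possible reading of the proposed extension. The pattern name, the property names, the values and the function name are hypothetical.

```python
# Hypothetical patterns: name -> set of (property, value) pairs that
# were observed together in a problematic configuration.
FORBIDDEN_PATTERNS = {
    "example-driver-conflict": {
        ("has graphics driver", "8.2"),
        ("has operating system", "OS 1.0"),
    },
}


def violated_patterns(config, patterns):
    """Return the names of all patterns fully contained in config.

    config is an iterable of (property, value) pairs describing a
    planned or existing configuration.
    """
    facts = set(config)
    return sorted(name for name, pairs in patterns.items() if pairs <= facts)


if __name__ == "__main__":
    planned = [
        ("has operating system", "OS 1.0"),
        ("has graphics driver", "8.2"),
        ("has hardware", "Asus Pro Gamer"),
    ]
    for name in violated_patterns(planned, FORBIDDEN_PATTERNS):
        print("warning: configuration matches forbidden pattern", name)
```

Such a check could run whenever a configuration item page is saved. Checking forbidden configuration paths would additionally require the order of changes and is not covered by this sketch.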

REFERENCES

Addy, R. (2007). Effective IT Service Management: To ITIL

and Beyond! Springer, Berlin, 1st edition.

Allemang, D. and Hendler, J. A. (2008). Semantic Web for

the Working Ontologist: Effective Modeling in RDFS

and OWL. Morgan Kaufmann, Burlington, 2nd edi-

tion.

Alquier, L., McCormick, K., and Jaeger, E. (2009). knowIT,

a Semantic Informatics Knowledge Management Sys-

tem. In Proceedings of the 5th International Sympo-

sium on Wikis and Open Collaboration, WikiSym ’09,

pages 20:1–20:5, New York, NY, USA. ACM.

Barrett, D. J. (2008). MediaWiki (Wikipedia and Beyond).

O’Reilly, Sebastopol, 1st edition.

Barth, W. (2005). Nagios: System and Network Monitoring.

No Starch, San Francisco, 1st edition.

Cartlidge, A., Hanna, A., Rudd, C., Macfarlane, I., Windebank, J., and Rance, S. (2008). An Introductory Overview of ITIL V3. http://www.itsmfi.org/content/introductory-overview-itil-v3-pdf.

Cannon, D. and Wheeldon, D. (2007). Service Operation

ITIL, Version 3 (ITIL). Stationery Ofﬁce Books, Nor-

wich.

Case, J., Fedor, M., Schoffstall, M., and Davin, J. (1990).

Simple Network Management Protocol (SNMP). RFC

1157 (Historic).

Clacy, B. and Jennings, B. (2007). Service Management:

Driving the Future of IT. Computer, 40(5):98–100.

Darlington, K. (1999). The Essence of Expert Systems (The

Essence of Computing Series). Prentice Hall, Upper

Saddle River.

Ebersbach, A., Glaser, M., Heigl, R., and Warta, A. (2007).

Wiki: Web Collaboration. Springer, Berlin, 2nd edi-

tion.

Jones, D. (2007). VBScript, WMI, and ADSI Unleashed:

Using VBScript, WMI, and ADSI to Automate Win-

dows Administration (Unleashed). Addison-Wesley

Longman, Amsterdam, 2nd edition.

Kleiner, F. and Abecker, A. (2009). Towards a Collaborative

Semantic Wiki-based Approach to IT Service Man-

agement. In Paschke, A., Weigand, H., Behrendt, W.,

Tochtermann, K., and Pellegrini, T., editors, Proceed-

ings of I-SEMANTICS ’09, 5th International Confer-

ence on Semantic Systems.

Kleiner, F., Abecker, A., and Brinkmann, S. F. (2009a).

WiSyMon – Managing Systems Monitoring Informa-

tion in Semantic Wikis. In Advances in Semantic

Processing, 2009. SEMAPRO ’09. Third International

Conference on, pages 77–85.

Kleiner, F., Abecker, A., and Liu, N. (2009b). Auto-

matic Population and Updating of a Semantic Wiki-

based Conﬁguration Management Database. In Fis-

cher, S., Maehle, E., and Reischuk, R., editors, In-

formatik 2009 – Im Focus das Leben, volume P-154.

Köllen, Bonn.

Krötzsch, M., Schaffert, S., and Vrandečić, D. (2007). Reasoning in Semantic Wikis. In Antoniou, G., Aßmann, U., Baroglio, C., Decker, S., Henze, N., Patranjan, P.-L., and Tolksdorf, R., editors, Reasoning Web, volume 4636 of Lecture Notes in Computer Science, pages 310–329. Springer.

Krötzsch, M., Vrandečić, D., Völkel, M., Haller, H., and Studer, R. (2007). Semantic Wikipedia. Web Semantics: Science, Services and Agents on the World Wide Web, 5(4):251–261. World Wide Web Conference 2006, Semantic Web Track.

Lacy, S. and Macfarlane, I. (2007). Service Transition, ITIL,

Version 3 (ITIL). Stationery Ofﬁce Books, Norwich.

Lane, R. (2010a). Creating a simple ticketing system with

Semantic MediaWiki. http://ryandlane.com/blog/

2010/04/01/creating-a-simple-ticketing-system-with-

semantic-mediawiki/.

Lane, R. (2010b). Helpdesk system and datacenter in-

ventory Semantic MediaWiki prototypes added to

my prototype wiki. http://ryandlane.com/blog/2010/

03/29/helpdesk-system-and-datacenter-inventory-sem

antic-mediawiki-prototypes-added-to-my-prototype-

wiki/.

Northcutt, S. and Novak, J. (2002). Network Intrusion De-

tection: An Analysts’ Handbook. New Riders, Berke-

ley, 3rd edition.

Roesch, M. (1999). Snort - Lightweight Intrusion Detection

for Networks. In LISA ’99: Proceedings of the 13th

USENIX Conference on System Administration, pages

229–238, Berkeley, CA, USA. USENIX Association.

Schaffert, S., Bry, F., Baumeister, J., and Kiesel, M. (2008).

Semantic Wikis. IEEE Software, 25(4):8–11.

Staab, S. and Studer, R. (2009). Handbook on Ontologies.

(International Handbooks on Information Systems).

Springer, Berlin, 2nd edition.

Vrandečić, D. and Krötzsch, M. (2009). Semantic MediaWiki. In Davies, J., Mladenic, D., and Grobelnik, M., editors, Semantic Knowledge Management, pages 171–179. Springer, Berlin.

ICAART 2012 - International Conference on Agents and Artificial Intelligence
