An Integrated Data Management for Enterprise Systems
Martin Boissier, Jens Krueger, Johannes Wust and Hasso Plattner
Hasso Plattner Institute for Software Engineering, University of Potsdam, Potsdam, Germany
Keywords:
Shearing Layers, Integrated Data Management, Enterprise Architectures.
Abstract:
Over the past decades, higher demands on performance for enterprise systems have led to increased architectural complexity. Demands such as real-time analytics or graph computation add further complexity to the technology stack by adding redundancy and distributing business data over multiple components. We argue that
enterprises need to simplify data management and reduce complexity as well as data redundancy. We propose
a structured approach using the shearing layer concept with a unified data management to improve adaptability
as well as maintainability.
1 INTRODUCTION
Enterprise systems are complex compositions of mul-
tiple subsystems, which provide crucial business ca-
pabilities such as financials, material management, busi-
ness intelligence, customer relationship management,
and much more. The architecture of enterprise sys-
tems has undergone many changes over the past
decades ranging from first systems that ran on a sin-
gle database to current setups with arrays of parallel
systems and data stores. Such an architecture, consisting of multiple components, each storing its own (redundant) data set and each with a very particular scope of duties, is what we call a Separated Enterprise Architecture (SEA). Many different demands led to such an
architecture, e.g., the demand for analytical capabili-
ties that introduced data warehouses into enterprise
systems or more recent demands such as graph computa-
tion capabilities or closed-loop concepts.
Meeting demands by adding separate components
facilitates high adaptability in contrast to modifying
or even re-engineering existing systems. This way,
new requirements are met in time while avoiding
modifications to business related and mission critical
systems.
In the long run, however, not only does the maintenance of redundant systems become increasingly expensive; adaptability is also impaired, as data exchange between components and keeping data consistent become increasingly complex in separated architectures. The diversity of data models and the inherent data redundancy between transactional and analytical systems therefore result in economic penalties through higher maintenance costs and reduced flexibility. In the end, this decelerates the progression of the system, because
every additional component and layer increases data
latency and data redundancy.
This paper discusses the problems of separated en-
terprise architectures and proposes a re-engineering
of enterprise systems with focus on adaptability and
long-term maintainability. Using the shearing layer
concept, components are identified by their rate of
change and integrated into the architecture accord-
ingly. Such architectures lay the foundation for fast
iterating adaptations, while keeping mission critical
systems stable and achieving a consistent view on
business relevant data. We call data management
based on such an architecture Integrated Data Man-
agement, in which a single database system is em-
ployed for all components of the enterprise system.
The remainder of the paper is structured as fol-
lows. The progression of enterprise systems is dis-
cussed in Section 2, showing how different demands
and requirements affected their architectures. Related
work proposing different approaches to the increasing
demands is presented in Section 3. Section 4 presents
the idea of an integrated data management using the
shearing layer concept. Presenting recent achieve-
ments in the field of database research, Section 5 dis-
cusses the feasibility of an integrated data manage-
ment for enterprise systems. Section 6 discusses how
the shearing layer concept can be applied to enterprise
systems using the presented technologies. In Section
7, this paper closes with a conclusion.
2 CURRENT ENTERPRISE
ARCHITECTURES
The design of enterprise systems changed steadily
over the last decades. The first enterprise systems de-
ployed a single database to run all business applica-
tions, cover business processes, and persist data. With
the increasing demand for long-running analytical queries, enterprise systems have been separated: to handle the different workloads, applications were split into transactional and analytical systems. The transactional system runs on write-optimized databases (usually traditional RDBMSs), while data required for analytical queries is transferred to additional systems, e.g., Enterprise
Data Warehouses (EDW). The increasing demand for
fast analytical reports and the ability to process long-
running analytical queries have made EDWs an in-
dispensable component in business environments and have vastly affected their design. The first generation of enterprise data warehouses was implemented using traditional row-oriented RDBMSs with aggregated data to process analytical queries, simply to offload work from the transactional system. Later generations persist data using adapted data schemas and storage engines optimized for analytical workloads (e.g., star schemas and column-oriented databases).
The data transfer between transactional and ana-
lytical systems is done via ETL-processes (Extract,
Transform and Load). The ETL-process usually pro-
cesses only a predefined subset of the transactional
data since many attributes of business entities are not
relevant for analytical workloads. This subset of data
is then batch-copied, filtered, and aggregated to opti-
mize the data for analytical queries. Consequently, the data timeliness in an EDW depends on the characteristics of the ETL-process, especially the intervals in which the process is run.
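To make the timeliness issue concrete, the following sketch outlines a minimal, periodic ETL step in Python. The table names, the batch interval, and the connection handling are hypothetical assumptions; the point is only that analytical answers in the EDW lag behind the transactional system by up to one batch interval.

    import sqlite3

    BATCH_INTERVAL_SECONDS = 3600  # the EDW lags by up to this interval

    def etl_step(oltp: sqlite3.Connection, edw: sqlite3.Connection, since: str) -> None:
        # Extract: only a predefined subset of transactional attributes is copied.
        rows = oltp.execute(
            "SELECT product_id, amount FROM sales WHERE created_at > ?",
            (since,),
        ).fetchall()
        # Transform: filter and aggregate into an analytics-friendly shape.
        totals = {}
        for product_id, amount in rows:
            totals[product_id] = totals.get(product_id, 0) + amount
        # Load: write the pre-aggregated result into the warehouse schema.
        edw.executemany(
            "INSERT INTO sales_aggregate (product_id, total_amount) VALUES (?, ?)",
            list(totals.items()),
        )
        edw.commit()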
Besides transactional and analytical systems, ad-
ditional intermediate systems have been integrated.
Increasing demands for new sources of information
(e.g., caused by new components such as online shops) and reporting on transactional data (e.g., the best-selling prod-
uct of the last three days) introduced Operational Data
Stores (ODS) (Inmon, 1999) as an intermediate sys-
tem between the transactional system and the EDW.
Operational Data Stores integrate different sources
using data cleansing, ETL, and integrity checking.
Furthermore, they provide the capability to run rather
simple analytics on transactional data in right- or even real-time, usually on much smaller data volumes than in EDWs. With these capabilities of data integration and short-term analytics, operational data stores became a
central component in business environments.
We refer to such architectures as Separated En-
terprise Architectures due to the separation of data
and business logic among different components (i.e.
transactional systems, operational data stores, data
warehouses, et cetera).
2.1 Advantages of Separated Enterprise
Architectures
The need for the component-based design of today's separated architectures arose from insufficient performance for analytical workloads and timely data integration. De-normalized data structures, pre-computed aggregations, and narrowed tables are the foundation of analytical systems, which, in contrast to transactional systems, are able to process complex analytical queries with sufficient performance.
Another concern has been the integration of het-
erogeneous data sources. Besides the traditional
transformation of transactional sources into analytics-
optimized data, new data sources such as point-of-sale data, online stores, or external vendor data needed
to be integrated. With analytical components running
at full load, a timely integration of data from several
different sources has negative effects on query perfor-
mance.
The most important advantage of a separated ar-
chitecture is the comparatively easy addition of func-
tionality without the need to modify mission critical
systems.
2.2 Drawbacks of Separated Enterprise
Architectures
Over the years the difference between transactional
and analytical systems has steadily grown. With
their analytics-optimized data characteristics EDWs
are not capable of processing the majority of oper-
ational tasks. The reason is that certain transactional data, which is required for operational tasks, is irrelevant for analytical tasks and is therefore neglected during ETL-processes. While this optimization improves analytical performance significantly, it hinders the data flow between components due to differing data models and granularities.
The demand for high-performance analytics on transactional data (a matter ODSs were initially designed for) is one of the main challenges of enterprise systems and a reason for the increasing complexity.
One example is credit card fraud detection that relies
on large data sets to accurately recognize fraudulent
patterns. Such approaches require fast access to vast
amounts of data while, in parallel, requiring access to
the most recent transactional data to recognize fraud
AnIntegratedDataManagementforEnterpriseSystems
411
as early as possible. These requirements affect the usage of ODSs since narrow time constraints prohibit any delay caused by intermediate systems.
Regarding Business Activity Monitoring (BAM, see Section 3), Golfarelli et al. wrote:
[...] a tool capable of right-time [...] data
streams. In practice, in most cases this re-
quires to abandon the ODS approach typically
pursued in DW systems and to adopt on-the-
fly techniques, which raises serious problems
in terms of data quality and integration.
(Golfarelli et al., 2004)
Closed-loop Concepts. A development that makes
operational data stores dispensable is the prevalence
of closed-loop concepts. With the horizontal as
well as vertical connection of business components
in modern systems, central messaging and data inte-
gration components present a potential bottleneck for
data timeliness. Hence, ODSs are counterproductive for loosely coupled and flexible architectures that leverage closed-loop approaches.
Furthermore, the evolution of separating business
components with separated data and responsibilities
contradicts the rather modern concept of closed-loop
functions, which cyclically integrate data between
different data sources (Mangisengi and Huynh, 2008)
(e.g., to enrich transactional systems with analytical
insights from business intelligence systems).
Data Redundancy. Another challenge is data re-
dundancy and complexity added by evading the draw-
backs of transactional systems. One reason is that traditional relational database management systems (RDBMSs) have significant shortcomings concerning the ability to process analytical tasks. Complex tasks such as dunning or available-to-promise (ATP) checks require fine-grained data (i.e., transactional data) and perform long-running OLAP-style queries (Krueger et al., 2010b). To provide sufficient performance for such long-running queries, additional data structures such as materialized views or indices are used to compensate for the shortcomings of the write-optimized databases used in transactional systems. But those data structures introduce new chal-
lenges to the system. Materialized aggregates lead to
a higher synchronization effort as multiple write oper-
ations are performed for each modification of a busi-
ness entity, similar to the drawbacks of indices.
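As a hedged illustration of this synchronization effort, the following Python sketch shows an application-maintained materialized aggregate that forces a second, coupled write for every modification of a business entity; the table and column names are hypothetical.

    import sqlite3

    def record_sale(conn: sqlite3.Connection, product_id: int, amount: int) -> None:
        # Write 1: the fine-grained business entity itself.
        conn.execute(
            "INSERT INTO sales (product_id, amount) VALUES (?, ?)",
            (product_id, amount),
        )
        # Write 2: keep the materialized aggregate in sync -- every entity
        # modification now costs an additional write, similar to index maintenance.
        conn.execute(
            "UPDATE sales_per_product SET total_amount = total_amount + ? "
            "WHERE product_id = ?",
            (amount, product_id),
        )
        conn.commit()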
Looking at the evolution of separated architectures, a problem of adding functionality by adding components becomes apparent. Additional components provide advantages with respect to data timeliness and performance only as long as the added systems, including the additional network latency, perform better than the originating source system. In the long run, the source systems will converge to the performance of the added components due to technological progress, while automatically having an advantage with respect to data timeliness. At this point, any additional layer and its inherent latency only further slow down the system. This has
been the case for operational data stores as well as in-
creasingly for data warehouses, where ETL-process
run times are often no longer acceptable when analyt-
ical data is required in (near) real-time.
3 RELATED WORK
Several publications discussed the drawbacks of cur-
rent enterprise architectures, each proposing differ-
ent approaches to provide closed-loop functions and
timely analytics.
Golfarelli et al. presented an approach to busi-
ness intelligence (BI) called Business Performance
Management (BPM) (Golfarelli et al., 2004). BPM
is a framework including standardized components such as data warehouses as well as reactive components.
Such reactive components monitor time-critical busi-
ness processes so that operational decision support
can be steadily tuned and improved according to the
company strategy. Thus, it can be seen as a combina-
tion of Business Activity Monitoring (BAM) (Dres-
ner, 2003) with a full closed-loop approach. The ar-
chitecture proposed by Golfarelli et al. is depicted in
Figure 1. Especially the BAM component reveals the
high complexity of real-time data flows in interleaved
separated systems where vertical as well as horizontal
data sources are integrated. Components such as the user interface have to handle two additional data streams
to enrich the data warehouse data with additional in-
formation.
Seufert et al. suggest an architecture with a cen-
tral Sense & Response (S&R) system (Seufert and
Schiefer, 2005). This S&R system communicates
with the internal and external business components
and improves business intelligence by linking busi-
ness processes with BI/DSS (Decision Support Sys-
tems) and reducing latencies. The approach of Seufert
et al. shows several problems concerning the com-
plexity of SEAs. Besides the fact that the EAI (Enter-
prise Application Integration) component is not well
integrated and lacks standardization for the analyt-
ical sources that are streaming data back to opera-
tional systems, the approach uses an operational data
store for “local closed-loops”.

Figure 1: Architecture for Business Performance Management. Adapted with permission from (Golfarelli et al., 2004).

By using two different closed-loop approaches, the proposal of Seufert et
al. introduces additional complexity and redundancy
to the architecture. Furthermore, they propose the use
of an operational data store, which is considered unsuitable for modern real-time systems (see Section 2.2).
Such complex systems have inherently sophisti-
cated communication designs. Another issue is the absence of common communication channels combined with a highly interleaved and sophisticated linkage between components. This results in an architecture in which communication channels potentially differ from component to component.
To sum up, it can be said that both discussed
approaches use additional components to evade
the drawbacks of currently deployed technologies, adding redundancy and complexity to the architec-
tures.
4 AN INTEGRATED
ENTERPRISE DATA
MANAGEMENT
The technological conditions for database designs have changed starkly, with respect to both software and hardware. Looking at the technical improvements in the field of databases (e.g., in-memory computing or highly scalable NoSQL databases), the question arises why business environments are to a large extent untouched by the latest achievements in hardware and software.
The separation of business concerns into transactional
and analytical tasks, using write- and read-optimized
databases respectively, is decades old and has never
changed significantly.
Using an integrated data management approach, only one database is deployed, which we assume to have the following capabilities:
- A single database is deployed to store the entire data of the enterprise.
- The database is able to handle transactional as well as analytical workloads.
- Queries are processed directly on transactional data without any pre-aggregated views or cubes, allowing analytical questions to be answered with sufficient performance (see the sketch below).
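As a minimal sketch of the third capability, the following Python example answers an OLAP-style question directly on the fine-grained transactional table, without a cube or materialized view; the in-memory SQLite engine and the schema are merely stand-ins for an analytics-capable operational database.

    import sqlite3

    conn = sqlite3.connect(":memory:")  # stand-in for the single enterprise database
    conn.execute("CREATE TABLE sales (product_id INTEGER, amount INTEGER, sold_at TEXT)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [(1, 5, "2014-01-01"), (1, 3, "2014-01-02"), (2, 7, "2014-01-02")],
    )

    # The analytical question runs on the transactional rows themselves and
    # therefore always reflects the most recent state.
    rows = conn.execute(
        "SELECT product_id, SUM(amount) AS total "
        "FROM sales GROUP BY product_id ORDER BY total DESC"
    ).fetchall()
    print(rows)  # [(1, 8), (2, 7)]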
Such a system does not persist any redundant data
and provides unified data models that cover the com-
plete enterprise system. Such unified data models for
all business entities covering all internal components
have the advantage of a notably simplified integration
of peripheral components. This advantage is of particular interest as efficient data integration is one of the major bottlenecks in current systems, in which
ETL-processes hinder analytical operations on cur-
rent data.
Eliminating redundancy provides additional ad-
vantages. Implementing functionality on an inte-
grated data management system means accessing a
single database using a single global set of data mod-
els. The data used across components is inherently
modeled consistently and can be directly joined. This
finally enables the goal of having a “single source of
truth” (Watson and Wixom, 2007). Furthermore, less
data redundancy means less application complexity
to keep different data sources consistent. Among other things, this results in lower costs for maintenance, upgrades, and updates, and in a faster return on investment when implementing new processes and func-
tions (Plattner, 2009).
AnIntegratedDataManagementforEnterpriseSystems
413
4.1 Adaptable Enterprise Architectures
However, combining all systems into an integrated
data management architecture is not sufficient. Up-
coming demands are still easier to meet by adding
components with copied data instead of modi-
fying the underlying system. Here, a structured ap-
proach is required that incorporates the need for frequent progression and adaptation. With the fast-growing companies of the new economy and quickly changing markets, enterprises are required to be adaptable and able to react to emerging trends or critical situations quickly. The past decade has shown that emerging startups can disrupt the businesses of multi-billion dollar companies, forcing those companies to adjust quickly to thwart customer loss. To handle these fast-changing surroundings and markets, an enterprise has to be adaptable with respect to its business processes and its technical environment.
For many companies the development focus is on
sustaining mission critical systems (“never touch a
running system”) and implementing new functional-
ity by adding new components. Truex et al., in contrast, argue that the main focus should not be on orga-
nizational stability and low maintenance, but rather on
“continuous analysis, negotiated requirements, and a
large portfolio of continuous maintenance activities”
(Truex et al., 1999). Here the system is expected to be
constantly evolving and improving and never reaches
a final state of functional completeness. This contradicts the complexity and redundancy seen in separated architectures, as the components of enterprise systems show very different rates of change. While the transactional
system with its business related importance changes
rather slowly, data warehouses evolved from batch-
updated systems to increasingly integrated systems
that are updated in real-time, sometimes even being
part of the transactional system itself. Here an ap-
proach is required that takes these different levels of
adaptation into account to create an architecture that is
prepared for upcoming changes.
A concept that describes the structure of evolving objects with respect to the different functions of each component is the shearing layer concept (Brand, 1994). The shearing layer concept originally described six different layers of a building, which evolve on different timescales. The fundamental layer, which is the site of the building, may not change for several generations. The next level, the building structure, may change at rates between 30 and 300 years. The last and most frequently changing layer, called stuff, comprises parts such as furniture, which are moved or exchanged daily to monthly.
Figure 2: Exemplary architecture adapting the pace-layered application strategy.

The concept has been adopted in many areas, e.g., computer science (Simmonds and Ing, 2000) or human-computer interaction (Papantoniou et al., 2003).
Simmonds and Ing discuss the shearing layer concept with respect to information systems. They propose
agent and conversation constructs as means to dis-
cuss the functional design of systems that employ the
shearing layer concept.
A concept similar to the shearing layer concept
was proposed by Gartner, called the Pace-Layered
Application Strategy. This strategy distinguishes three layers (Gartner, 2012): 1.) Systems of
Record - slowly changing layer that is highly stan-
dardized due to regulatory requirements and is usu-
ally mission critical, e.g. storing transactional data;
2.) Systems of Differentiation - for example business
functionality that is customized to the enterprise but
does not change steadily; and 3.) Systems of Innova-
tion - new applications and recent technologies that are used, e.g., to access new markets or meet new requirements.
We use the idea of the shearing layer concept to
formulate a structured approach to identify different
rates of change of the system’s components and how
these can be orchestrated to allow a high adaptabil-
ity while providing a consistent and reliable (slowly
changing) foundation.
An exemplary architecture using Gartner’s pace-
layered notion is depicted in Figure 2. Here data and
functionality are placed in the corresponding layers
based on their rate of change and their relevance for
the business. The base layer is the database storing
transactional data that is highly relevant for all busi-
ness related functions and can be seen as the slowly
adapting foundation for the whole system. The base
layer forms the core of the integrated data manage-
ment.
On top of that, core business functionality that cre-
ates and processes transactional data is placed. This
ICEIS2014-16thInternationalConferenceonEnterpriseInformationSystems
414
layer adapts faster than the actual data store. Furthermore, fast-changing or new components, such as engines for graph computation or data mining, are placed in this layer, as are additional data stores that do not hold business-relevant data (e.g., caching servers for websites or volatile data stores for session management).
Functionality that is still in the testing phase, de-
mos, or prototypes are placed in the system of inno-
vation layer.
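One hypothetical way to make such a layering explicit in an enterprise system's configuration is sketched below in Python; the component names mirror Figure 2 and the mapping itself is purely illustrative.

    from enum import Enum

    class PaceLayer(Enum):
        SYSTEM_OF_RECORD = 1           # slowly changing, mission critical
        SYSTEM_OF_DIFFERENTIATION = 2  # customized business functionality
        SYSTEM_OF_INNOVATION = 3       # prototypes and fast-moving experiments

    # Illustrative mapping of components to layers by their rate of change.
    COMPONENT_LAYERS = {
        "transactional_database": PaceLayer.SYSTEM_OF_RECORD,
        "core_business_functionality": PaceLayer.SYSTEM_OF_DIFFERENTIATION,
        "graph_engine": PaceLayer.SYSTEM_OF_DIFFERENTIATION,
        "session_management_store": PaceLayer.SYSTEM_OF_DIFFERENTIATION,
        "forecasting_prototype": PaceLayer.SYSTEM_OF_INNOVATION,
    }

    def components_in(layer: PaceLayer) -> list:
        # Return all components that belong to the given pace layer.
        return [name for name, l in COMPONENT_LAYERS.items() if l is layer]

    print(components_in(PaceLayer.SYSTEM_OF_INNOVATION))  # ['forecasting_prototype']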
4.2 Feasible Applications
Besides reduced complexity and data redundancy, an
integrated data management also enables new appli-
cations. The following paragraphs discuss approaches that are not possible in separated systems, or only possible with high effort.
Forecasting. Growing forecasting capabilities are an important improvement of current and upcoming enterprise systems. Forecasting is dominated by long-running, analytic-style workloads. It is mostly processed on data warehouses or analytical systems; therefore, calculations are limited to the data available at execution time. Outliers and abnormalities in the data can be an important factor for forecasts to be as precise as possible. This requirement is not fulfilled by traditional analytical systems, whose data is limited to the crucial subsets needed for analytical reports and aggregated by predefined data integration methods. Thus, forecasting is often limited to long-term views, because critical short-term forecasts are not possible due to the lack of (near) real-time transactional data.
Techniques such as data mining to find new (or not yet known nor precisely modeled) facts can be used in combination with forecasts. The main benefit here is that all these services are no longer bound to their own data sets, but can easily interact with each other since they all run on the same data.
An example of such an application is the available-to-promise (ATP) check as presented in (Tinnefeld et al., 2011). Here, the ATP check is not only
run when an order arrives or resources have to be
reassigned, but also to check whether reassignments
might cause problems at a later point in time (e.g., the
production of too many items depends on the reliabil-
ity of a single contractor).
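As a rough sketch of what such a check on fine-grained transactional data might look like, the following hedged Python example computes availability for a requested date directly from stock movements; the schema and the simple cumulative logic are illustrative assumptions and not the implementation of (Tinnefeld et al., 2011).

    import sqlite3

    def available_to_promise(conn: sqlite3.Connection, product_id: int,
                             quantity: int, requested_date: str) -> bool:
        # Aggregate all stock movements (receipts positive, confirmed orders
        # negative) up to the requested date, directly on transactional data.
        (available,) = conn.execute(
            "SELECT COALESCE(SUM(quantity_change), 0) FROM stock_movements "
            "WHERE product_id = ? AND movement_date <= ?",
            (product_id, requested_date),
        ).fetchone()
        return available >= quantity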
Event-driven Process Optimization. Besides proactive tasks such as forecasting, event-driven systems can also be enhanced by improved reactive components, especially with the possibility of analytical queries on transactional data. As Seufert et al. wrote: “Collecting and reconciling all operational data related to business processes, enables the measurement of process performance and helps to identify opportunities for process improvement” (Seufert and Schiefer, 2005). This advantage becomes possible because analytical insights, which are usually obtained from warehouses via complex OLAP queries, are available directly in the source system and can be propagated immediately to any component in the system.
5 MODERN DATABASE SYSTEMS
The field of database management systems has been
widely disrupted in recent years. While traditional
RDBMS vendors have dominated the database market
for decades, in-memory databases for analytical ap-
plications as well as highly scalable NoSQL databases
show the potential of recent database technologies.
Besides database software, several hardware advances have greatly affected the design of current database technologies. On the one hand, increasing main memory capacities allow in-memory databases to keep the whole database in memory. In many cases, in-memory databases show significantly improved performance over their disk-based counterparts. On the other hand, current server systems achieve high parallelism using multi-core CPUs or many-node setups, providing the means both to scale up and to scale out.
These hardware improvements not only influenced the development of new database designs, but also allowed improvements of existing read-optimized databases. While databases for transactional workloads stagnate in providing acceptable performance for analytical workloads, analytical
databases are increasingly able to handle transactional
workloads (MacNicol and French, 2004; Krueger
et al., 2010a).
5.1 A “One size fits all” Database
The question whether a single database can be ca-
pable of handling both transactional and analytical
workloads has been discussed in several publica-
tions, from a theoretical as well as from a practi-
cal point of view. Today, it is a common understanding amongst most researchers that no single database can cover all scenarios and requirements of all applications (Stonebraker and Çetintemel, 2005; French, 1995). Conn states that no system could ever be able to handle both workloads with the same data model (Conn, 2005). Howard (2002) and Barbusinski (2002) also conclude that a common database model cannot cover the constraints and characteristics of both transactional and analytical systems. However, Howard and Conn do not negate the possibility that a single database engine might be capable of such a mixed workload using different storage models.
Recent publications discuss the question of what type of database engine could close the gap between transactional and analytical workloads. This includes topics such as adopting columnar characteristics in row stores (Bruno, 2009) as well as improving the write capabilities of column stores (Krueger et al., 2010a). Several hybrid approaches, combining row- and column-oriented storage schemata, have also been proposed (Kemper and Neumann, 2011; Grund et al., 2010). Database technologies such as NoSQL databases appear to be a good fit for enterprise systems as they are highly flexible and scalable, but constraints on transactionality hamper their adoption in productive transactional systems. Whether non-ACID databases can be used efficiently in business systems is not the subject of this paper, but recent technologies such as the RDBMS F1 show that even NoSQL pioneers such as Google reintegrate SQL and ACID compliance into their mission critical systems (Shute et al., 2013).
Not a “one size fits all” solution, but a solution particularly optimized for enterprise systems, has been proposed by Plattner (Plattner,
2009). This so-called common database approach has
been adopted in the commercial database SAP HANA,
a columnar in-memory database optimized for mixed
workloads (Sikka et al., 2012). Another in-memory
database is the H-Store-based VoltDB (Kallman et al.,
2008; Stonebraker and Weisberg, 2013). VoltDB is
row-based and optimized for OLTP workloads using
sharding and lock-free transactions. All major enterprise database vendors, such as IBM, Microsoft, and Oracle,
are currently developing in-memory solutions on top
of their disk-based databases (Lindstroem et al., 2013;
Larson et al., 2013; Lahiri et al., 2013).
5.2 Databases for Integrated Data
Management Systems
The broad field of databases raises the question of which database system might be able to handle an integrated data management and provide the foundation for an architecture as proposed in Section
4.1. As shown in Figure 2, such an architecture re-
lies on a database that stores all business relevant
data. This does not exclude a combination of multiple
data storages (i.e., polyglot persistence (Sadalage and
Fowler, 2012, pp. 133–140)) as there are situations
in which the addition of specialized storage engines
can be advantageous, e.g., for session management
or website caching purposes. But it is important to
understand that any business related data should be
stored only once and as detailed as possible. If pre-
aggregated views or additional technologies (e.g., Oracle’s in-memory layer (Lahiri et al., 2013) for analytical workloads or Microsoft’s Hekaton (Larson et al., 2013) for transactional workloads) are used to provide sufficient analytical performance, the views should be consistent with the primary transactional data source (i.e., no interval-based updates) and handled by the database only, not by the application layer.
The data storage layer might consist of an array of
technologies and use redundancy to provide sufficient
performance as long as such performance additions are completely transparent to applications, to re-
duce maintenance efforts and complexity. According
to Plattner it is already possible to deploy a single
database system for that matter (Plattner, 2009).
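As a small, hedged counterpart to the application-maintained aggregate sketched in Section 2.2, the following Python example defines a non-materialized view that the database maintains itself; queries against it always reflect the primary transactional data and require no second write path in the application (the schema is again hypothetical).

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (product_id INTEGER, amount INTEGER)")

    # The view is defined once and evaluated by the database on demand;
    # there is no interval-based refresh and no application-side bookkeeping.
    conn.execute(
        "CREATE VIEW sales_per_product AS "
        "SELECT product_id, SUM(amount) AS total_amount "
        "FROM sales GROUP BY product_id"
    )

    conn.execute("INSERT INTO sales VALUES (1, 5)")
    print(conn.execute("SELECT * FROM sales_per_product").fetchall())  # [(1, 5)]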
From a conceptual point of view, technologies such as in-memory databases might be able to close the
gap between transactional and analytical systems.
Developments that put additional layers on top of
transactional data provide graph-computation capa-
bilities without added data redundancy (Rudolf et al.,
2013). The same has been shown for data mining
and machine learning (Grosse et al., 2013). Conse-
quently, we expect that relational databases using the latest achievements in the area of in-memory computing
will be able to handle transactional as well as analyti-
cal workloads with sufficient performance.
6 APPLICABILITY FOR
EXISTING ENTERPRISE
ENVIRONMENTS
Discussing integrated data management based on the
shearing layer concept raises the question of how to
map the different layers to the current software and
hardware. Nowadays, most ERP systems use three-
tier architectures consisting of database(s), applica-
tion servers to provide scalability, and the front-end.
Among others, this setup has several shortcomings with respect to performance and the maintenance effort of keeping the application servers consistent.
In contrast, architectures based on the shearing
layer concept can deploy a single DBMS to provide
scalability without complex application servers. We
propose to put both lower layers (i.e., the system of
record and the system of differentiation) on the lowest
ICEIS2014-16thInternationalConferenceonEnterpriseInformationSystems
416
hardware system, the database server. This way, the performance wisdom of bringing the code to the data can be followed. In contrast, most current three-tier architectures bring the data to the code to take computations off the database system. But as presented in the previous section, today's highly parallel DBMSs are capable of handling much larger workloads than those of the time when the three-tier architecture was initially developed.
The proposed architecture shown in Section 4 can
be achieved using different means. Most current
database vendors favor stored procedures, which are
written in SQL-like languages and are directly exe-
cuted at the database level. The disadvantage of us-
ing SQL-like languages is that they are neither opti-
mized for the expression of complex business logic
nor aware of business objects used on the application
side. Besides SQL, there is an array of languages to
program databases. Haas et al. proposed a system
called Starburst to enable object aware programming
near the database: “We chose to stay with the rela-
tional model because of its great power and simplic-
ity, [...] give users the support they need for their ob-
jects through extensions, while preserving the advan-
tages of a relational DBMS” (Haas et al., 1989). A
similar approach has recently been proposed by Färber et al. (Faerber et al., 2011). Here, the database is aware of application-specific business objects, called semantic models, to run business logic directly on the database. Their architecture incorporates sev-
eral aspects of the system of differentiation layer into
the DBMS, e.g., a text analysis engine, a graph en-
gine, and a calculation engine for computation inten-
sive processes such as data mining. Furthermore, a
specialized library provides common business func-
tionality, which can be accessed by external applica-
tions and is optimized to execute directly inside the
database core. Such functionalities include, e.g., cur-
rency conversion or disaggregation as used in many
planning operations (Jaecksch et al., 2010).
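To illustrate the general idea of executing simple business functionality next to the data instead of in an application server, the following hedged Python sketch registers a currency-conversion function with the database engine and applies it during query execution; the fixed rates and the schema are purely illustrative and unrelated to the cited vendor libraries.

    import sqlite3

    RATES_TO_EUR = {"USD": 0.73, "EUR": 1.0}  # illustrative, fixed conversion rates

    def to_eur(amount, currency):
        # Convert an amount into EUR; executed per row during query processing.
        return amount * RATES_TO_EUR[currency]

    conn = sqlite3.connect(":memory:")
    conn.create_function("to_eur", 2, to_eur)
    conn.execute("CREATE TABLE invoices (amount REAL, currency TEXT)")
    conn.executemany("INSERT INTO invoices VALUES (?, ?)",
                     [(100.0, "USD"), (50.0, "EUR")])

    # The conversion happens inside the query instead of transferring all rows
    # to an application server and converting them there.
    total = conn.execute("SELECT SUM(to_eur(amount, currency)) FROM invoices").fetchone()[0]
    print(round(total, 2))  # 123.0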
An example of the possible performance improve-
ments of re-engineered business processes has been
shown by Tinnefeld et al. (Tinnefeld et al., 2011)
with the available-to-promise (ATP) check. The ATP check has been implemented on an in-memory column store, pushing down the business logic to the database. Even though this process is usually considered transactional, it is comparably complex and includes long-running queries (Krueger et al., 2010b).
This prototype shows that transactional processes can
be sped up significantly by pushing down logic to the
database level.
However, adapting existing enterprise architec-
tures is challenging. While it is partially possible
to avoid rewriting code by using dynamic database
views (Plattner, 2013), the costs to adapt complete
systems are still rather high. Another challenge to solve is the question of how logic push-downs to the database layer can be done using non-proprietary standards, avoiding potentially expensive vendor lock-ins when adapting existing systems to vendor-specific languages and protocols.
7 CONCLUSION
The trend towards more complex systems has been
steady for years now, mainly caused by increasing re-
quirements for data timeliness. With a growing num-
ber of systems whose data ought to be integrated in
real-time, this trend appears likely to continue. The
complexity of current enterprise architectures and the
recurring pattern of introducing data-redundant com-
ponents to improve performance reveal the structural
problems of most enterprise architectures. In this pa-
per we proposed a two-fold approach: using the shearing layers concept to adapt the system architecture and an integrated data management to further simplify the system. The shearing layers concept allows the identification of components with different scales of change to build an architecture that is adaptable and maintainable in the long run. Together with the implications of an integrated data management, our approach not only enables new possibilities but also lays the foundation for upcoming generations of enterprise systems. Seeing this approach from a technical point of view, in the context of the latest research, leads us to the conclusion that an integrated data management for enterprise architectures running on a single database system is feasible, because technologies such as in-memory databases have already been shown to have a vast impact on performance without the shortcomings of separated enterprise systems.
REFERENCES
Barbusinski, L. (May 2002). Can you use one database
model for olap and oltp? DM Review.
Brand, S. (1994). How buildings learn. Viking, New York.
Bruno, N. (2009). Teaching an old elephant new tricks. In
CIDR. www.crdrdb.org.
Conn, S. (2005). Oltp and olap data integration: a review
of feasible implementation methods and architectures
for real time data analysis.
AnIntegratedDataManagementforEnterpriseSystems
417
Dresner, H. (2003). Business activity monitoring: Bam
architecture. Gartner Symposium ITXPO (Cannes,
France).
Faerber, F., Cha, S. K., Primsch, J., Bornhoevd, C., Sigg, S.,
and Lehner, W. (2011). Sap hana database: data man-
agement for modern business applications. SIGMOD
Record, 40(4):45–51.
French, C. D. (1995). ”one size fits all” database architec-
tures do not work for dss. SIGMOD Rec., 24:449–450.
Gartner (2012). Pace-layered application strategy.
http://www.gartner.com/it-glossary/pace-layered-
application-strategy/. Accessed 2013-09-27.
Golfarelli, M., Rizzi, S., and Cella, I. (2004). Beyond data
warehousing: what’s next in business intelligence? In
DOLAP ’04: Proceedings of the 7th ACM interna-
tional workshop on Data warehousing and OLAP.
Grosse, P., Lehner, W., and May, N. (2013). Advanced ana-
lytics with the sap hana database. In DATA 2013.
Grund, M., Krueger, J., Plattner, H., Zeier, A., Cudre-
Mauroux, P., and Madden, S. (2010). Hyrise: a main
memory hybrid storage engine. Proc. VLDB Endow.,
4:105–116.
Haas, L. M., Freytag, J. C., Lohman, G. M., and Pirahesh,
H. (1989). Extensible query processing in starburst.
In SIGMOD Conference, pages 377–388. ACM Press.
Howard, S. (May 2002). Can you use one database model
for olap and oltp? DM Review.
Inmon, W. H. (1999). Building the Operational Data Store.
John Wiley & Sons, Inc., New York, NY, USA.
Jaecksch, B., Lehner, W., and Faerber, F. (2010). A plan
for olap. In EDBT, volume 426 of ACM International
Conference Proceeding Series, pages 681–686. ACM.
Kallman, R., Kimura, H., Natkins, J., Pavlo, A., Rasin, A.,
Zdonik, S. B., Jones, E. P. C., Madden, S., Stone-
braker, M., Zhang, Y., Hugg, J., and Abadi, D. J.
(2008). H-store: a high-performance, distributed
main memory transaction processing system. PVLDB,
1(2):1496–1499.
Kemper, A. and Neumann, T. (2011). Hyper: A hybrid
oltp&olap main memory database system based on
virtual memory snapshots. In ICDE, pages 195–206.
IEEE Computer Society.
Krueger, J., Grund, M., Tinnefeld, C., Plattner, H., Zeier,
A., and Faerber, F. (2010a). Optimizing write perfor-
mance for read optimized databases. In DASFAA (2),
volume 5982, pages 291–305. Springer.
Krueger, J., Tinnefeld, C., Grund, M., Zeier, A., and Plat-
tner, H. (2010b). A case for online mixed workload
processing. DBTest ’10, pages 8:1–8:6, New York,
NY, USA. ACM.
Lahiri, T., Neimat, M.-A., and Folkman, S. (2013). Ora-
cle timesten: An in-memory database for enterprise
applications. IEEE Data Eng. Bull., 36(2):6–13.
Larson, P.-A., Zwilling, M., and Farlee, K. (2013). The
hekaton memory-optimized oltp engine. IEEE Data
Eng. Bull., 36(2):34–40.
Lindstroem, J., Raatikka, V., Ruuth, J., Soini, P., and
Vakkila, K. (2013). Ibm soliddb: In-memory database
optimized for extreme speed and availability. IEEE
Data Eng. Bull., 36(2):14–20.
MacNicol, R. and French, B. (2004). Sybase iq multiplex -
designed for analytics. In Proceedings of the Thirtieth
international conference on Very large data bases -
Volume 30, VLDB ’04.
Mangisengi, O. and Huynh, N. T. (2008). Towards a closed-
loop business intelligence framework. In ICEIS, pages
210–217.
Papantoniou, B., Nathanael, D., and Marmaras, N. (2003).
Moving target: designing for evolving practice. In
HCI International 2003.
Plattner, H. (2009). A common database approach for oltp
and olap using an in-memory column database. In
SIGMOD Conference, pages 1–2. ACM.
Plattner, H. (2013). A Course in In-Memory Data Manage-
ment: The Inner Mechanics of In-Memory Databases.
Rudolf, M., Paradies, M., Bornhövd, C., and Lehner, W. (2013). The graph story of the SAP HANA database. In BTW, pages 403–420.
Sadalage, P. and Fowler, M. (2012). NoSQL Distilled: A
Brief Guide to the Emerging World of Polyglot Persis-
tence. Addison Wesley Professional.
Seufert, A. and Schiefer, J. (2005). Enhanced business intel-
ligence - supporting business processes with real-time
business analytics. In DEXA Workshops, pages 919–
925. IEEE Computer Society.
Shute, J., Vingralek, R., Samwel, B., Handy, B., Whipkey,
C., Rollins, E., Oancea, M., Littlefield, K., Menest-
rina, D., Ellner, S., Cieslewicz, J., Rae, I., Stancescu,
T., and Apte, H. (2013). F1: A distributed sql database
that scales. PVLDB, 6(11):1068–1079.
Sikka, V., Färber, F., Lehner, W., Cha, S. K., Peh, T., and Bornhövd, C. (2012). Efficient transaction processing in sap hana database: the end of a column store myth. In SIGMOD Conference, pages 731–742. ACM.
Simmonds, I. and Ing, D. (2000). A shearing layers ap-
proach to information systems development. Techni-
cal Report RC 21694, IBM Research.
Stonebraker, M. and Çetintemel, U. (2005). ”one size fits
all”: An idea whose time has come and gone. In Pro-
ceedings of the 21st International Conference on Data
Engineering, pages 2–11.
Stonebraker, M. and Weisberg, A. (2013). The voltdb main
memory dbms. IEEE Data Eng. Bull., 36(2):21–27.
Tinnefeld, C., Mueller, S., Zeier, A., and Plattner, H. (2011).
Available-to-promise on an in-memory column store.
In Datenbanksysteme in Business, Technologie und
Web (BTW 2011), Proceedings.
Truex, D. P., Baskerville, R., and Klein, H. K. (1999).
Growing systems in emergent organizations. Com-
mun. ACM, 42(8):117–123.
Watson, H. J. and Wixom, B. H. (2007). The current state
of business intelligence. Computer, 40(9):96–99.
ICEIS2014-16thInternationalConferenceonEnterpriseInformationSystems
418