MODEL-DRIVEN VIRTUALIZATION
OF HEALTHCARE RECORDS
Daniel Alexander Ford
Department of Computer Science, IBM Almaden Research Center, 650 Harry Road, San Jose, CA, U.S.A.
Keywords: Model-Driven Development, Healthcare record management, Distributed Systems, UML, Eclipse Modeling
Framework, EMF, Connected Data Objects, CDO.
Abstract: This paper describes an approach to solving scalability, distribution and accessibility problems associated
with large collections of healthcare records. It shows how to exploit
Model-Driven implementations of the Health Level Seven (HL7) standards for Clinical
Documents (CDA) and Continuity of Care Documents (CCD), such as those developed by the Model-
Driven Healthcare Tools (MDHT) project, to quickly and easily enable highly scalable, distributed
healthcare applications. The MDHT project has developed models in the Unified Modeling Language
(UML) of the HL7 standard that it uses to automatically generate high quality software implementations of
the standard. The technique described in the paper shows how to retarget this code generation process to
automatically create an alternative implementation using a technology called Connected Data Objects
(CDO). This new implementation immediately supports scalable, distributed access to document
collections. The paper then goes on to describe example applications made possible by the new capabilities
provided by the alternative implementation.
1 INTRODUCTION
The Model-Driven Healthcare Tools (MDHT)
project (Carlson, 2010) has the goal of producing
open source software models and implementations
of Health Level Seven (HL7) (Benson, 2009)
standards for electronic healthcare documents, such
as the Clinical Document Architecture (CDA) and
Continuity of Care Documents (CCD). The software
modeling technologies used by the MDHT project
are those provided by the Eclipse Foundation as part
of its open source Eclipse Modeling Project.
Notably, they include the UML2 project and the
Eclipse Modeling Framework (EMF) (Steinberg,
2009).
The models created by the MDHT project are
represented in the Unified Modeling Language
(UML) (Booch, 2000) and can be automatically
processed by the code generation facilities of the
Eclipse Modeling Framework to generate software
implementations of the document standards in the
Java™ programming language. The generated code
reflects best practices for software implementation
and is of a uniform high quality that is difficult to
produce manually in any consistent manner. This
Model-Driven approach to software development
creates implementations with few, if any, errors, and
which can be modified extensively and quickly by
regenerating after making changes to the original
model.
These generated implementations make it much
simpler to write other healthcare applications that use
these types of documents. The MDHT project produces Java™
implementations of the various documents that can
be directly manipulated by application code. The
generated code also has extensive capabilities such
as automatic XML serialization and deserialization,
reflective object access and notification services,
among many others. All of these features are directly
available to application code.
A key point is that the MDHT project uses the
EMF code generation capabilities to the maximum
extent possible and consequently, the
implementation contains no manually written code.
This means that it can be easily reconfigured and
automatically regenerated without the need to
manually reproduce any portion of the
implementation.
Ford, D. A.
MODEL-DRIVEN VIRTUALIZATION OF HEALTHCARE RECORDS.
DOI: 10.5220/0003139002000204
In Proceedings of the International Conference on Health Informatics (HEALTHINF-2011), pages 200-204
ISBN: 978-989-8425-34-8
Copyright © 2011 SCITEPRESS (Science and Technology Publications, Lda.)
A general limitation of any Java™ application,
generated or not, is the finite amount of memory
available to the Java™ virtual machine in which it is
running. In the case of the MDHT project, this limits
the size and the number of instances of healthcare
documents that can be processed simultaneously in
one Java™ virtual machine. Depending on the
application, this may or may not be an issue, but it is
a general problem that limits the range and
scalability of potential applications.
This limitation also makes it more difficult for
networked applications to cooperate and share data,
as the object graph being manipulated on one Java™
virtual machine cannot extend to another Java™
virtual machine.
2 RETARGETING
FOR SCALABILITY
Solving the scalability problem typically involves
developing applications that access a centralized
repository over the network, usually a relational
database. Applications that take this approach tend
to include implementation details that query and
modify the database. These details take significant
implementation and maintenance effort by skilled
and experienced developers. Unfortunately, these
efforts are usually orthogonal to the goals of the
application itself.
The MDHT project is in the interesting position
of having an implementation that is completely
generated; this opens up the possibility of retargeting
the code generating process to produce an alternative
implementation that addresses the scalability problem. The
Eclipse Modeling Framework contains a sub-project
called Connected Data Objects (CDO) that enables
exactly that alternative.
2.1 Connected Data Objects (CDO)
The CDO project supports the concept of a virtual
object graph. This is a collection of Java™ objects
that are persisted in a network accessible backing
store, but which can be represented in multiple
Java™ virtual machines simultaneously. The graph
can contain an arbitrary number of objects of arbitrary size.
With CDO, the entire graph appears to exist
simultaneously in the address space of any and all
Java™ virtual machines that can make the
appropriate network connection to the backing store.
Objects in the graph can be manipulated with regular
application logic with little reference to the existence
of the backing store other than appropriate
initialization and configuration.
CDO achieves this “magic” by configuring the
EMF’s code generation process to create Java™
classes that delegate all of their accesses (getters and
setters) to reflective methods implemented by the
internals of CDO. So, instead of generating
“regular” Java™ classes that contain references to
each other that are only valid within the context of a
single Java™ virtual machine, alternative versions
of the classes are generated which use abstract
“CDO Object Ids” as references. Essentially, the
CDO Object Id serves as a virtual pointer that can
always be satisfied by the backing store. If a
referenced object is not in the address space of the
current Java™ virtual machine, CDO will retrieve it
from the backing store. Thus, because no
generated class instance directly
references any other generated class instance, they
all can become candidates for regular Java™
garbage collection without the danger that the virtual
graph maintained by the backing store will become
disconnected or corrupted. This allows for arbitrarily
large collections of objects.
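The delegation and virtual-pointer ideas above can be illustrated with a minimal sketch in plain Java. It does not use the CDO API; the names BackingStore, GraphView and Section are hypothetical stand-ins, and an in-memory map plays the role of the network-accessible backing store. The point is only that a generated-style class holds an id rather than Java references, so its instances can be garbage collected and rebuilt on demand.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the backing store: object id -> (feature name -> value). */
class BackingStore {
    private final Map<Long, Map<String, Object>> rows = new HashMap<>();
    private long nextId = 1;

    long create() {
        long id = nextId++;
        rows.put(id, new HashMap<>());
        return id;
    }

    Object get(long id, String feature) { return rows.get(id).get(feature); }

    void set(long id, String feature, Object value) { rows.get(id).put(feature, value); }
}

/** Hypothetical per-JVM view that materializes objects on demand. */
class GraphView {
    final BackingStore store;
    // Soft references let the garbage collector reclaim materialized objects.
    private final Map<Long, SoftReference<Section>> cache = new HashMap<>();

    GraphView(BackingStore store) { this.store = store; }

    Section resolve(long id) {
        SoftReference<Section> ref = cache.get(id);
        Section section = (ref == null) ? null : ref.get();
        if (section == null) { // never materialized here, or already collected
            section = new Section(this, id);
            cache.put(id, new SoftReference<>(section));
        }
        return section;
    }
}

/** Stand-in for a generated class: it has no feature fields, only an id. */
class Section {
    private final GraphView view;
    final long id;

    Section(GraphView view, long id) { this.view = view; this.id = id; }

    // Getters and setters delegate to the store instead of touching fields.
    String getTitle() { return (String) view.store.get(id, "title"); }

    void setTitle(String title) { view.store.set(id, "title", title); }

    // The reference to another object is stored as an id, not as a Java pointer.
    Section getNext() {
        Long nextId = (Long) view.store.get(id, "next");
        return nextId == null ? null : view.resolve(nextId);
    }

    void setNext(Section next) { view.store.set(id, "next", next == null ? null : next.id); }
}

class VirtualGraphDemo {
    public static void main(String[] args) {
        BackingStore store = new BackingStore();
        long allergies = store.create();
        long medications = store.create();

        GraphView writer = new GraphView(store);
        writer.resolve(allergies).setTitle("Allergies");
        writer.resolve(medications).setTitle("Medications");
        writer.resolve(allergies).setNext(writer.resolve(medications));

        // A second view (think: another JVM) follows the same virtual pointer.
        GraphView reader = new GraphView(store);
        System.out.println(reader.resolve(allergies).getNext().getTitle()); // prints "Medications"
    }
}
```

Because a Section keeps no direct reference to its neighbour, dropping every Section instance in one view loses nothing: the graph lives in the store and is rebuilt by id when next accessed.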
For the MDHT project, retargeting its generated
implementation to support a virtual object graph of
clinical documents is as simple as changing three
parameters controlling code generation. In fact, the
CDO tool set in the Eclipse IDE includes a simple
menu option to perform the necessary changes
almost instantly; code regeneration itself takes just a
few seconds more to complete.
2.2 The CDO Server
The counterpart to the retargeted MDHT/CDO code
running in an application is something called the
CDO Server. Its job is to manage the backing store
that persists the virtual object graph, and to provide
network accessibility to the graph. The CDO Server
can use a number of different databases to persist the
virtual object graph, including MySQL, H2, and
Objectivity.
A CDO Server can manage multiple virtual
object graphs. It maintains a set of addressable
“repositories,” each of which can contain one or
more EMF “Resources.” A Resource is conceptually
much like a file and is identified by a unique “path”
that resembles a path in a hierarchical file system. A
virtual object graph can be persisted across more
than one Resource. There is a direct correspondence
between a repository and a virtual object graph.
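The containment relationships just described (a server holds repositories, a repository holds path-addressed Resources) can be pictured with a small data-structure sketch. This is not the CDO Server's implementation or API; ServerSketch, RepositorySketch and ResourceSketch are hypothetical names used only to make the hierarchy concrete.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model of a Resource: a path plus the root objects stored under it. */
class ResourceSketch {
    final String path;
    final List<Object> contents = new ArrayList<>();

    ResourceSketch(String path) { this.path = path; }
}

/** Hypothetical model of a repository: one virtual object graph, spread over Resources. */
class RepositorySketch {
    final String name;
    private final Map<String, ResourceSketch> resources = new TreeMap<>();

    RepositorySketch(String name) { this.name = name; }

    /** Returns the Resource at the given path, creating it on first use. */
    ResourceSketch getOrCreateResource(String path) {
        return resources.computeIfAbsent(path, ResourceSketch::new);
    }

    /** Paths of all Resources, in sorted order. */
    List<String> paths() { return new ArrayList<>(resources.keySet()); }
}

/** Hypothetical model of a server hosting several addressable repositories. */
class ServerSketch {
    private final Map<String, RepositorySketch> repositories = new TreeMap<>();

    RepositorySketch repository(String name) {
        return repositories.computeIfAbsent(name, RepositorySketch::new);
    }
}

class RepositoryLayoutDemo {
    public static void main(String[] args) {
        ServerSketch server = new ServerSketch();
        RepositorySketch clinic = server.repository("clinic");
        clinic.getOrCreateResource("/documents/ccd").contents.add("CCD for patient 1");
        clinic.getOrCreateResource("/documents/cda").contents.add("CDA for patient 1");
        System.out.println(clinic.paths()); // prints "[/documents/cda, /documents/ccd]"
    }
}
```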
The CDO Server offers several advanced capabilities. It can
provide transactional semantics to maintain the
integrity of a virtual object graph, and can provide
historical “audit” views of the graph as it existed at
any point in the past. It can also implement
disconnected semantics in which two or more
instances of a CDO Server can redundantly maintain
a virtual object graph such that in the event of
disruptions in communications, each can provide
access to the graph. The use cases for this ability
include local disconnected operation such as on a
portable computer, or for geographically distributed
servers that may experience communication
problems. In either case, the CDO Server will
resolve differences between the two servers when
communications are restored.
2.3 CDO Semantics and Capabilities
To facilitate multiple accesses to a virtual object
graph, CDO provides transactional semantics and
notification facilities. These allow different
applications to modify documents in the graph and
transactionally commit them to the CDO Server.
Conflicting changes result in thrown exceptions for
all but the first application to commit a change.
Resolution of conflicting modifications is an
application specific detail. In addition to
transactional “write access,” CDO also offers two
other ways to access the contents of the virtual graph
in an application. The first is a “read-only”
representation of the graph that does not allow
commits to the CDO Server. The other is an “audit”
view that allows a read-only access to historic
versions of the graph as it existed at specified times
in the past.
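The "first committer wins" behaviour and the audit view can be sketched with a simple optimistic versioning scheme. This is an illustration in plain Java under stated assumptions, not CDO's actual mechanism: VersionedValue and CommitConflictException are hypothetical names, a single string stands in for a document, and commit timestamps are supplied by the caller and assumed to be strictly increasing.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical exception raised when a commit is based on a stale revision. */
class CommitConflictException extends RuntimeException {
    CommitConflictException(String message) { super(message); }
}

/** Hypothetical server-side record of one value, with its full history. */
class VersionedValue {
    // Commit time -> value, kept in time order so past states can be looked up.
    private final TreeMap<Long, String> history = new TreeMap<>();
    private int revision = 0;

    synchronized int revision() { return revision; }

    synchronized String current() {
        return history.isEmpty() ? null : history.lastEntry().getValue();
    }

    /** The commit succeeds only if the writer started from the current revision. */
    synchronized void commit(int baseRevision, String newValue, long timestamp) {
        if (baseRevision != revision) {
            throw new CommitConflictException(
                "based on revision " + baseRevision + " but current is " + revision);
        }
        history.put(timestamp, newValue);
        revision++;
    }

    /** Audit-style read: the value as of the given time, or null if none existed yet. */
    synchronized String valueAt(long timestamp) {
        Map.Entry<Long, String> entry = history.floorEntry(timestamp);
        return entry == null ? null : entry.getValue();
    }
}

class CommitDemo {
    public static void main(String[] args) {
        VersionedValue note = new VersionedValue();
        note.commit(0, "draft", 100L);

        int base = note.revision();               // both editors start from revision 1
        note.commit(base, "edited by A", 200L);   // first committer wins
        try {
            note.commit(base, "edited by B", 300L);
        } catch (CommitConflictException e) {
            System.out.println("conflict: " + e.getMessage());
        }
        System.out.println(note.valueAt(150L));   // prints "draft"
    }
}
```

How the losing application recovers (retry, merge, or discard) is left open here, matching the text's point that conflict resolution is application specific.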
The Eclipse Modeling Framework has extensive
“notification” capabilities that allow application
code to be notified of changes in instances of classes
generated by the EMF. As part of the EMF, CDO
also provides these capabilities, such that
applications can be notified of changes to the object
graph so that they can respond. For instance, an
application might wish to be informed of the
addition of new clinical documents to the graph so
that it can process them and generate a report.
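The reporting scenario in this paragraph follows the familiar observer pattern. The sketch below shows the idea in plain Java; it does not use the EMF notification classes, and ObservableDocuments and ReportGenerator are hypothetical names, with strings standing in for clinical documents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical observable collection standing in for part of the object graph. */
class ObservableDocuments {
    private final List<String> documents = new ArrayList<>();
    private final List<Consumer<String>> listeners = new ArrayList<>();

    /** Registers a callback invoked once for every document added afterwards. */
    void onAdded(Consumer<String> listener) { listeners.add(listener); }

    void add(String document) {
        documents.add(document);
        for (Consumer<String> listener : listeners) {
            listener.accept(document);
        }
    }

    int size() { return documents.size(); }
}

/** Hypothetical consumer that reacts to new documents by extending a report. */
class ReportGenerator {
    final List<String> reportLines = new ArrayList<>();

    void attachTo(ObservableDocuments documents) {
        documents.onAdded(document -> reportLines.add("processed: " + document));
    }
}

class NotificationDemo {
    public static void main(String[] args) {
        ObservableDocuments documents = new ObservableDocuments();
        ReportGenerator generator = new ReportGenerator();
        generator.attachTo(documents);

        documents.add("CCD-001");
        System.out.println(generator.reportLines); // prints "[processed: CCD-001]"
    }
}
```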
2.4 Advantages of Virtualization
There are a number of advantages of retargeting the
MDHT implementation to leverage Connected Data
Objects. The first is that few or no changes are
necessary to application code written for the
conventional or “legacy” (in “CDO-speak”)
implementation. The regenerated code retains the
same Java™ interfaces as before; only the
implementations of the interfaces are changed by
CDO. This means that applications developed for
the legacy version are often unaware that anything
has changed and do not need to include complex
code to manage a relationship with a relational
database; they become “scalable” almost for “free.”
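The reason application code can stay unchanged is ordinary programming to an interface. The following sketch shows the principle with hypothetical names (PatientRecord, InMemoryPatientRecord, StoreBackedPatientRecord, RecordFormatter); these are not MDHT types, and a plain map stands in for the external store.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-in for a generated model interface; application code depends only on this. */
interface PatientRecord {
    String getName();
    void setName(String name);
}

/** "Legacy"-style implementation: state lives in an ordinary field. */
class InMemoryPatientRecord implements PatientRecord {
    private String name;

    public String getName() { return name; }

    public void setName(String name) { this.name = name; }
}

/** Retargeted-style implementation: state lives in an external store (a map here). */
class StoreBackedPatientRecord implements PatientRecord {
    private final Map<String, Object> store;

    StoreBackedPatientRecord(Map<String, Object> store) { this.store = store; }

    public String getName() { return (String) store.get("name"); }

    public void setName(String name) { store.put("name", name); }
}

/** Application logic written once against the interface. */
class RecordFormatter {
    static String label(PatientRecord record) { return "Patient: " + record.getName(); }
}

class InterfaceDemo {
    public static void main(String[] args) {
        PatientRecord legacy = new InMemoryPatientRecord();
        PatientRecord virtual = new StoreBackedPatientRecord(new HashMap<>());
        legacy.setName("J. Doe");
        virtual.setName("J. Doe");

        // The same application code works with either implementation.
        System.out.println(RecordFormatter.label(legacy).equals(RecordFormatter.label(virtual))); // prints "true"
    }
}
```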
The virtualization of MDHT also enables an
entirely new set of distributed healthcare record
processing applications and architectures. With a
network accessible virtual graph of clinical
documents, one can begin to develop applications
that leverage this arrangement and use the graph as a
common data structure on which to cooperate and
coordinate their activities. For instance, some
applications might be producers of clinical
documents, while others might be consumers of
some sort; some might be both by transforming or
augmenting clinical documents for further
processing by others. Since CDO supports multiple
virtual graphs, it is easy to partition documents as
necessary. The decoupling provided by CDO allows
these types of applications to be developed
independently and operate autonomously.
2.5 Virtualization Trade-offs
As with all things, there are trade-offs to leveraging
a technology such as CDO to enable the
virtualization of healthcare records. The first trade-off
is performance. While CDO is not necessarily
slow and it does cache requested data, it still
needs to resolve some object accesses by
connecting through a network to a CDO Server. This
will always be slower than accessing clinical
documents in a non-virtualized object graph.
Further, there is extra complexity in reconfiguring
the legacy MDHT model code and retargeting code
generation for CDO. This process essentially creates
two versions of the implementation, one legacy and
one virtual, which creates a software maintenance chore to
keep them synchronized if both are desired. It is not
difficult to do this, but it is a task that did not exist
before.
The alternative approaches to creating a
framework for a distributed scalable healthcare
record infrastructure would seem to face similar
issues, without necessarily having the advantage of
automatic code generation. The transparency of the
generated MDHT/CDO implementation, and the small
amount of implementation work it requires of an
application developer, should dramatically
shorten development time when compared to
alternatives that would require a developer to create,
implement, debug, and maintain their own
distributed infrastructure. Trying to create an
alternative that provided the capabilities offered by
CDO would seem to require effort equivalent to a
reimplementation of CDO.
3 EXAMPLE APPLICATIONS
One of the first example implementations we
experimented with was the distributed editing of
clinical documents on multiple networked machines.
The configuration was fairly simple and consisted of
a CDO Server configured with an H2 database and a
simple Eclipse based client that used the reflective
EMF editor. This editor uses the metadata of the
MDHT models (i.e., the names of the classes, etc.)
which is embodied in special editor accessible
classes created during code generation, to allow
users to visually edit instances of a model, in our
case, instances of Clinical Documents (CDAs).
Our initial tests used another simple, plain Java™
“loader” application that used the MDHT’s
generated XML deserialization abilities to load
sample clinical documents stored on the local file
system. These were then added to a CDO Resource
obtained from the CDO Server, and then
committed.
With the Resource preloaded with example
clinical documents, we then ran the client
applications on three different networked computers,
on which we opened identical CDO “transactional
sessions” with the CDO Server. Within the context
of the session we then were able to open the
Resource in the editor and view the same collection
of clinical documents on each of the three machines.
Selecting a portion of a document in the editor
revealed the “properties” of that item in the Eclipse
Properties View, where it could be modified. Saving
the contents of the file triggered notifications that
resulted in the changes being committed to the CDO
Server. This caused the server to notify the other
open sessions of the changes, triggering them to
update their contents.
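The save, commit and notify cycle described above can be sketched as follows. This is a deliberately simplified model in plain Java, not the CDO session API: EditServer and EditSession are hypothetical names, properties are plain strings, and conflict detection is omitted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical server that stores committed properties and tracks open sessions. */
class EditServer {
    private final Map<String, String> committed = new HashMap<>();
    private final List<EditSession> sessions = new ArrayList<>();

    /** A new session starts with a copy of the committed state. */
    EditSession openSession() {
        EditSession session = new EditSession(this, new HashMap<>(committed));
        sessions.add(session);
        return session;
    }

    /** Stores the changes, then pushes them to every session except the originator. */
    void commit(EditSession origin, Map<String, String> changes) {
        committed.putAll(changes);
        for (EditSession session : sessions) {
            if (session != origin) {
                session.refresh(changes);
            }
        }
    }
}

/** Hypothetical client session with a local view and a set of unsaved edits. */
class EditSession {
    private final EditServer server;
    private final Map<String, String> view;
    private final Map<String, String> pending = new HashMap<>();

    EditSession(EditServer server, Map<String, String> initialView) {
        this.server = server;
        this.view = initialView;
    }

    /** Records an edit locally; nothing is sent to the server until save(). */
    void edit(String property, String value) { pending.put(property, value); }

    /** Applies pending edits to the local view and commits them to the server. */
    void save() {
        view.putAll(pending);
        server.commit(this, new HashMap<>(pending));
        pending.clear();
    }

    /** Called by the server when another session has committed changes. */
    void refresh(Map<String, String> changes) { view.putAll(changes); }

    /** Reads the saved state visible to this session (pending edits are not included). */
    String read(String property) { return view.get(property); }
}

class DistributedEditDemo {
    public static void main(String[] args) {
        EditServer server = new EditServer();
        EditSession first = server.openSession();
        EditSession second = server.openSession();

        first.edit("patient.name", "J. Doe");
        first.save();
        System.out.println(second.read("patient.name")); // prints "J. Doe"
    }
}
```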
This simple example provided real-time
distributed editing of clinical healthcare records with
transactional semantics. The effort required to create
the example was trivial in comparison to that
required to manually produce such an
implementation.
3.1 Distributed Healthcare
Applications
Because an application can use the MDHT/CDO code
base to access a virtual graph of interconnected
healthcare records without significant implementation
effort, a new set of application domains becomes
practical that were previously too difficult
or too expensive to address. This ability also significantly
alters the architectures used to develop such
applications as it becomes possible to independently
implement and distribute various aspects of
healthcare record processing by leveraging the loose
coupling provided by the virtual object graph of
healthcare documents.
One can envision network connected medical
devices that operate as “first class citizens” in the
healthcare record “universe.” These devices would
use MDHT/CDO to produce fully formed healthcare
documents which would then be added to the
appropriate CDO Server provided Resource which
would persist them and notify others of their
existence.
Extending this architectural concept further, we
can imagine a suite of applications that “tap into”
various virtual collections of healthcare records.
Some, like medical devices, would be a source of
such documents, while others such as nurses’
stations or other user oriented applications would
tend to be consumers. Other types of applications
would monitor and transform, for instance,
aggregating the outputs of all of the medical devices
and logging their contents, creating summary reports
for nurses’ stations, or performing sophisticated
semantic analysis to look for trends or anomalies.
The key point is that all of the devices and
applications would use the same uniform
representation of healthcare records and the same
transparent technique to access their collections.
Interchangeable interactions between existing and
future applications would involve little difficulty.
4 CONCLUSIONS
The appearance of Model-Driven implementations
of healthcare record standards, such as those
produced by the MDHT project, has created new
opportunities to easily solve scalability problems
and to create new application architectures. The
MDHT’s use of the Eclipse Modeling Framework
allows the generated implementation to be retargeted
to use another EMF technology called Connected
Data Objects. This technology allows the creation of
virtual object graphs that allow applications to
access and maintain collections of healthcare records
of arbitrary size. These collections can be accessed,
with transactional semantics, by multiple distributed
applications simultaneously.
The effort required to achieve this ability is a
trivial reconfiguration of the MDHT code generation
parameters. This produces an alternative
implementation of the HL7 standard documents that
can be used in virtualized collections with no
modifications.
The virtualization of healthcare records enables
new types of applications and application
architectures. These can be created with minimal
effort in comparison to alternative manual
implementations.
REFERENCES
Carlson, D., 2010. Model-Driven Healthcare Tools (MDHT).
https://mdht.projects.openhealthtools.org/
Benson, T., 2009. Principles of Health Interoperability HL7 and
SNOMED (Health Informatics). Springer.
Steinberg, D., Budinsky, F., Paternostro, M., Merks, E.,
2009. EMF: Eclipse Modeling Framework. 2nd Edition.
Addison-Wesley.
Booch, G., Jacobson, I., Rumbaugh, J., 2000. OMG Unified
Modeling Language Specification, Version 1.3, First Edition,
March 2000.