d2d — A ROBUST FRONT-END FOR PROTOTYPING, AUTHORING
AND MAINTAINING XML ENCODED DOCUMENTS
BY DOMAIN EXPERTS
Markus Lepper
1
and Baltasar Tranc
´
on y Widemann
1,2
1
<semantics/> GmbH, Berlin, Germany
2
Universit
¨
at Bayreuth, Bayreuth, Germany
Keywords:
Knowledge acquisition, Semi-formal documents, Domain-specific languages.
Abstract:
In many cases, domain experts are used to write down their knowledge in contiguous texts. A standard way
to facilitate the automated processing of such texts is to add mark-up, for which the family of XML-based
standards is current best practice. But the default textual appearance of XML mark-up is not suited to be
typed, read and edited by humans. The authors’ d2d notation provides an alternative which uses only one
single escape character. Its documents can be fluently typed, understood and edited by humans almost in the
same way as non-tagged text. In the last years, the d2d language underwent a development guided by practical
experiences. In practice, robustness turned out to be highly desirable: This lead to revised semantics and a new
algorithm which realizes a total translation function, This article gives the complete operational semantics of
this algorithm after a short sketch of its context.
1 THE d2d APPROACH
1.1 Design Principles
Modeling knowledge by XML-encoded documents
is a rapidly expanding practice. XML seems to be
esp. useful for semi-structured documents, but also
as a representation of formal structures like object-
oriented databases, technical configuration data, etc.
New document types, suited for special needs, can
easily be defined, importing and combining existing
standards. However, the standard appearance of XML
with its large number of reserved characters and com-
plicated lexical rules is not suited to be directly typed,
read and edited by domain experts, esp. when they are
used to or dependent upon legacy production environ-
ments.
The authors’ d2d XML notation provides an alter-
native front-end representation which uses only one
single escape character. Documents can fluently be
typed, read and edited by humans almost in the same
way as non-tagged text. At the same time they rep-
resent an exactly defined XML document model, pro-
cessable by computers. The first versions of d2d had
been developed in 2001, (Lepper et al., 2001) and it
has been successfully employed in widely varying ap-
plications. The complete documentation of the cur-
rent version is in (Tranc
´
on y Widemann and Lepper,
2010). In d2d-encoded text there is explicit tagging,
where tags are marked with a single user-definable
character, and closing tags are inferred wherever pos-
sible.
One of the practical projects was the technical
documentation for a mid-scale software project. Ta-
ble 1 shows the number of single unicode characters
required in .d2d and standard .xml encoding. While
the 29% of key presses could of course also be elim-
inated by a syntax-controlled editor, the redundancy
remains distracting when reading.
Table 1: Characters in xml and d2d encoded texts.
lines words characters
12477 61086 456711 total “*.d2d
16273 61183 589377 total “*.xml
1.30 1.00 1,29 factor
A document type definition is required to rule the
parsing process as well as the structure of the gener-
ated XML output. The current tool implementation
understands the W3C XML DTD (Bray et al., 2006),
and some dedicated data format languages. Addition-
ally, d2d has its own definition format .ddf”with full
449
Lepper M. and Trancón y Widemann B..
d2d — A ROBUST FRONT-END FOR PROTOTYPING, AUTHORING AND MAINTAINING XML ENCODED DOCUMENTS BY DOMAIN EXPERTS.
DOI: 10.5220/0003664904490456
In Proceedings of the International Conference on Knowledge Engineering and Ontology Development (KEOD-2011), pages 449-456
ISBN: 978-989-8425-80-5
Copyright
c
2011 SCITEPRESS (Science and Technology Publications, Lda.)
control over character-level parsing. The following
text works with an abstract representation of the doc-
ument type definition, independent of its origin. The
first lines in the fragments from Figure 6 can give an
impression of the d2d input and the resulting XML
document.
1.2 Incremental Specifications,
Incomplete Documents and
Robustness
In practice, two features turned out to be of high value
and have been considered in the latest revision of d2d:
Firstly, the syntax of d2d has been carefully re-
vised for a more convenient support of incremental
refinement of document type definitions. Introduc-
ing new child elements, changing an element’s con-
tent from empty to optional children, from optional to
required or vice versa; all these operations require a
certain robustness and redundancy, esp. in tokeniza-
tion.
Secondly, the treatment of incomplete documents
turned out to be highly desirable: The new seman-
tics and the new algorithm realize a total translation
function, which recovers from ilelgal input as early as
possible. This allows convenient diagnostics in case
of erroneously incomplete documents, as well as a
well-defined way of creating incomplete documents
intentionally as preliminary versions.
To his end, the user may simply leave out required
child elements of a content model, or mark elements
as semantically incomplete by a special “brute-force”
closing tag, even if they are syntactically complete.
Both cases result in special meta-tags which are in-
serted in the generated XML model and should affect
further processing. In case of error, input may be not
only be missing, but also the opposite case, superflu-
ous tags or character data which cannot be parsed ac-
cording to the current content model. This kind of
input is transferred verbatim to the generated output,
wrapped in a meta-element.
In all these cases, the parsing algorithm tries to
resume work as soon as possible, maximizing the di-
agnostic output in one single run. The input syntax
and the complete operational semantics of this new,
robust parsing algorithm are subject of this paper.
2 THE PARSING ALGORITHM
There are three distinct layers of parsing in the d2d
architecture: Tokenization, tag-based parsing, and di-
rect character-based parsing for user-defined embed-
ded syntax. This text focusses on the second layer.
Explicit tagging and the LL(1) criterion for gram-
mar determinism can easily be taught to domain
experts without training in formal language theory.
Since the design of the formal and semi-formal doc-
ument structures for a new project or a certain pro-
duction context will happen in mixed teams of com-
puter scientists and domain experts, this is the ade-
quate level for communication.
2.1 Tokenization
As preprocessing to tag-based parsing, tokenization
effectively is the identification of the different kinds
of tags and of character data. Tokenization is de-
scribed informally in Figure 2. Its result is an instance
of the data type D from Figure 1.
Basically, all tags start with the command lead-
in character. This can be re-defined by the user, and
defaults to #”. It is followed by an identifier. Like
in XML, a leading slash / indicates a closing tag,
a trailing slash an empty element. For situations in
which a closing tag cannot be inferred, d2d supports
a short-hand notation, similar to the \verb κ...κ
construct known from L
A
T
E
X, abbreviating the clos-
ing tag to the single character κ. Additionally there
are forms with triple slashes, which are “brute-force”
closing and empty tags. Using them indicates that the
contents of the element are intentionally left incom-
plete.
2.2 Content Model Declarations
For the purpose of this article we simply put all tag
identifiers into one global name space. Then every
element is typed by a simple identifier as its tag. Each
such identifier used as a tag is mapped to one content
model. A content is an extended regular expression T
from Figure 1.
The meaning is fairly standard: Any ident stands
for an element with that identifier as its tag. #empty
stands for empty content. #chars stands for charac-
ter data. In contrast to WC DTD and other formats,
we have full compositionality of all operators. The
three unary operators ?”, + and * stand for op-
tional, repeated and optional-repeated occurrence, as
usual. The three binary operators “,”, |”, “& mean
sequence, alternative and permutation. Note that, in
contrast to Relax-NG (Clark and Murata, 2008), the
operator & stands for permutation of its contiguous
sub-terms, not for interleaving.
As mentioned above, the following description
works on an instance of the abstract data type T
D
of
KEOD 2011 - International Conference on Knowledge Engineering and Ontology Development
450
Extended regular expressions for content model declaration:
T ::= ident | #empty | #chars | T ,T | T |T | T &T | T ? | T + | T *
Document type definition:
T
D
::= ident 9 T
Input data after tokenization:
Token ::= chars(α) | OPEN
ident
| CLOSE
ident
| CLOSE
F
ident
| EMPTY
ident
| EMPTY
F
ident
D ::= Token
a #eof
Tree of nodes, generated as output:
N ::= chars(α) | node(ident,N
) | perm(T × (T 9 N
) )
| missing(T ) | skipped(Token × α)
Figure 1: Basic Data Types.
Assume
the tag lead-in character remains set to “#
stands for any whitespace character (blank, tab, newline, etc)
π for an opening parenthesis from the list
(”, “<”, “[”, “!”, etc.,
and π
0
for the corresponding closing parenthesis from the list
)”, “>”, “]”, “!”, etc.,
and κ for any input character which neither # nor π
0
Then the following transformations describe the tokenization process in a semi-formal way, if successively
applied to the head of the character input stream, with decreasing priority:
#(# | ) #
#TAG OPEN
TAG
#TAGπ OPEN
TAG
Additionally, π
0
CLOSE
TAG
is pushed to the parentheses context.
π CLOSE
TAG
when π
0
CLOSE
TAG
is currently in the parentheses context.
Additionally, this assignment is popped off the context stack.
π chars(π
0
)
when no assignment π
0
is in the parentheses context
#TAG OPEN
TAG
#/ CLOSE
i
where OPEN
i
is the most recently recognized open tag.
#/TAG CLOSE
TAG
#///TAG CLOSE
F
TAG
#TAG/// EMPTY
F
TAG
#TAG/ EMPTY
TAG
κ
chars(κ
)
Figure 2: Tokenization and Tag Recognition.
document type definitions. In the concrete implemen-
tation, this can result directly from a module of d2ds
own type definition format .ddf. But when using a
W3C DTD, it is the result of a transformation: For in-
d2d — A ROBUST FRONT-END FOR PROTOTYPING, AUTHORING AND MAINTAINING XML ENCODED
DOCUMENTS BY DOMAIN EXPERTS
451
epsilon : T {false,true}
ident
C
= ident {#chars}
first : T ident
C
epsilon(i : ident) = false
epsilon(#chars) = epsilon(#empty)
= epsilon(T ?) = epsilon(T *) = true
epsilon(T +) = epsilon(T )
epsilon(T
1
,T
2
) = epsilon(T
1
& T
2
)
= epsilon(T
1
) epsilon(T
2
)
epsilon(T
1
| T
2
) = epsilon(T
1
) epsilon(T
2
)
first(x : ident
C
) = x
first(#empty) = {}
first(T ?) = first(T +) = first(T *) = first(T )
first(T
1
,T
2
) =
(
first(T
1
) first(T
2
) if epsilon(T
1
)
first(T
1
) otherwise
first(T
1
| T
2
) = first(T
1
& T
2
) = first(T
1
) first(T
2
)
Figure 3: Auxiliary Functions.
stance,
<!ELEMENT e (#PCDATA | c)*)>
<!ATTLIST e a1 NMTOKEN #REQUIRED
a2 NMTOKEN #IMPLIED >
will be translated to
e = (a1 & a2?), (#chars | c)*
2.3 Parsing Process
After tokenization, the input to the parsing process is
of type D from Figure 1. All character data is treated
as if tagged with a dedicated, reserved and invisible
tag #chars. ident
C
is the set containing this tag and
all explicit, visible tags.
The output of a parsing process is a finite tree
according to the type definition of N from Fig-
ure 1. A term of type chars(α) is a leaf node car-
rying a contiguous sequence of character data, a term
node(ident, N
) represents an element of the gener-
ated XML model with its name and its child nodes,
and the terms of type perm(t,{. ..}) are required be-
cause the child nodes corresponding to a permutation
expression t shall later be shipped out in the normal-
ized, sequential order of declaration.
A term of type missing is attributed with an ex-
pression from T . The node must be replaced with
some input corresponding to the annotation in order
complete the document. A node of type skipped con-
tains tags which could not be accepted, together with
the immediately following character data. If no er-
rors have been encountered, the result does not con-
tain nodes of these both kinds.
Content model declarations in d2d follow the
LL(1) discipline, in a more strict sense than standard
XML DTDs. Therefore the parsing process is eas-
ily directed by “first sets”, which are well-known in
parser construction (Aho et al., 1986) and calculated
as in Figure 3.
The parsing process is specified by the total func-
tion translate from Figure 5. It operates on the input
D and a stack of frames F. Every stack frame from
F represents a future choice point, and holds both
the regular expression it is parsing and the interme-
diate, accumulated parsing result. For readability we
assume that the document type definition dt : T
D
is
globally accessible. The top level conversion function
text2tree is invoked with the tokenized input stream
and the tag of the root element.
Due to the LL(1) discipline, a step of the parsing
algorithm is determined completely by the head of the
tokenized input stream and the state of the stack:
descend is called when an open tag is to be con-
sumed, and this tag is contained in the set first of
the currently parsed expression. The stack will
grow by descending into this expression.
ascend o is called for an open tag not contained in
first of the current expression. The stack is un-
wound up to the innermost frame where the tag
can be consumed instead.
ascend c is called for an explicit close tag. The stack
is unwound in a similar manner. In the latter two
functions, missing tree nodes are generated for
tags which should be present for a valid docu-
ment.
skip If, in all these cases, no continuation can be
found, then the current tag (together with imme-
diately following character data) is rejected and
the corresponding error elements inserted into the
generated output.
Since one of these transformations will be pos-
sible for any input situation, the top-level functions
text2tree and translate are always total functions. It
depends on the concrete tool implementation what
to do with the embedded, error-indicating meta-
elements. Esp. the presence of “brute-force” clos-
ing tags should affect diagnostics, forcing tools into
“incomplete” mode, even though they act as ordinary
closing tags from the parser’s perspective.
KEOD 2011 - International Conference on Knowledge Engineering and Ontology Development
452
F : T × N
ascend o,ascend c : ident
C
× F
× N
F
// calculate a new stack, truncated as far as required, for consuming an open/close tag
// ν : N
are the result nodes accumulated in previous translation steps.
// τ : N
are the result nodes collected and created during ascend.
// φ : F
is the upper part of the stack, i.e. all frames created earlier when descending.
ascend o (i,( j : ident,ν) a φ, τ) = ascend o (i,φ, hnode( j,ν a τ)i)
ascend o (i,(r = t
/+
,ν) a φ, τ) =
(
(r, ν a τ) a φ if i first(t)
ascend o (i,φ, ν a τ) otherwise
ascend o (i, ((t
1
,t
2
,. ..,t
n
),ν) a φ, τ)
=
((t
2
,. ..,t
n
),ν a τ) a φ if i first(t
1
)
ascend o (i, ((t
2
,. ..,t
n
),ν) a φ, τ
0
) if i 6∈ first(t
1
) n > 1
ascend o (i,φ, ν a τ
0
) if i 6∈ first(t
1
) n = 1
where τ
0
=
(
τ if epsilon(t
1
)
τ a missing(t
1
) otherwise
ascend o (i, (t
x
,perm(t = (t
1
& . ..&t
n
),M = {t
k
1
7→ ν
k
1
,. ..,t
k
m
7→ ν
k
m
})) a φ,τ)
=
(
(t
y
,perm(t,M
0
)) a φ if y i first(t
y
) t
y
{t
1
,. ..,t
n
} t
y
6∈ {t
k
1
,. ..,t
k
m
}
ascend o (i,φ, perm(t, M
00
)) otherwise
where M
0
= M {t
x
7→ τ} M
00
= M {t
x
7→ τ a µ}
µ = hmissing(t
z
) | t
z
{t
1
,. ..,t
n
} t
z
6∈ {t
k
1
,. ..,t
k
m
} ¬epsilon(t
z
)i
ascend c (i, ( j : ident,ν) a φ,τ) =
(
ascend c(i,φ, node( j,ν a τ)) if i 6= j
(t
x
,µ a node( j,ν a τ)) a φ otherwise
where φ = (t
x
,µ) a φ
0
ascend c (i,(r = t
/+
,ν) a φ, τ) = ascend c (i, φ,ν a τ)
ascend c (i,((t
1
,t
2
,. ..,t
n
),ν) a φ, τ)
= ascend c (i, φ,ν a τ a hmissing(t
z
) | t
z
{t
1
,. ..,t
n
} ¬epsilon(t
z
)i)
ascend c (i,(t
x
,perm(t = (t
1
& . ..&t
n
),M = {t
k
1
7→ ν
k
1
,. ..,t
k
m
7→ ν
k
m
}),ST
PERM
) a φ,τ)
= ascend c (i, φ,perm(t, M
00
))
where M
00
= M {t
x
7→ τ a µ}
where µ = hmissing(t
z
) | t
z
{t
1
,. ..,t
n
} t
z
6∈ {t
k
1
,. ..,t
k
m
} ¬epsilon(t
z
)i
ascend o ( , hi, ) = ascend c ( ,hi, ) = hi
descend : T × ident
C
F
// Precondition: descend(t, i, ) is only called when i first(t)
descend((t
1
,t
2
,. ..,t
n
),i) =
(
descend(t
1
,i) a ( (t
2
,. ..t
n
),hi) if i first(t
1
)
descend((t
2
,. ..t
n
),i) otherwise
descend((t
1
| ... | t
n
),i) = descend(t
k
,i)
where 1 k n i first(t
k
)
descend(t = (t
1
& . .. & t
n
),i) = descend(t
k
,i) a (t
k
,perm(t,{}),ST
PERM
)
where 1 k n i first(t
k
)
descend(t?,i) = descend(t,i)
descend(r = t
/+
) = descend(t, i) a h(r, hi)i
descend(i,i) = h(dt(i),hi)i if i 6= #chars
descend(#chars,#chars) = hi
Figure 4: Extending and Truncating the Stack during Tag Parsing.
d2d — A ROBUST FRONT-END FOR PROTOTYPING, AUTHORING AND MAINTAINING XML ENCODED
DOCUMENTS BY DOMAIN EXPERTS
453
skip,skip
C
: D × F
D × F
skip
C
(d a δ, ( ,τ) a φ) ==
(
skip
C
(δ,( ,τ a skipped(d)) a φ) if d = chars( )
(d a δ, ( ,τ) a φ) otherwise
skip(d a δ, ( ,τ, ) a φ) == skip
C
(δ,( ,τ a skipped(d), ) a φ)
// as an alternative call “id()” instead of skip
C
(), see section 3
translate : D × F
D × F
translate(CLOSE
j
a δ,φ) =
(
translate(δ,φ
0
) if φ
0
6= hi
translate(skip(CLOSE
j
a δ,φ)) otherwise
where φ
0
= ascend c ( j,φ,hi)
translate(OPEN
j
a δ,φ) =
translate(δ,descend(t, j) a φ)) if j first(t)
translate(δ,descend(t, j) a φ
0
) if j 6∈ first(t) φ
0
6= hi
translate(skip(OPEN
j
a δ,φ)) otherwise
where φ = (t, , ) a
where φ
0
= ascend o( j,φ,hi)
translate( = chars(α) a δ,Φ = (t, ν) a φ)
=
translate(δ,(t,ν a chars(α)) a φ) if #chars first(t)
translate(,φ
0
) if #chars 6∈ first(t) φ
0
6= hi
translate(skip
C
(,Φ)) otherwise
where φ
0
= ascend o(#chars, Φ,hi)
text2tree : D × ident N
text2tree(d,i) = node(i,ν)
where translate(h(dt(i),hi)i) = h#eofi,h( , ν)i
Figure 5: Algorithm for Tag Parsing.
2.4 Shipping out the Node Tree as XML
Structure
The ship-out of the internal model N to standard
XML textual representation requires further, but mi-
nor, transformations: All nodes represented as XML
attributes are treated separately, and all nodes which
have been matched by a content model of permuta-
tion type are written out in the sequential order of that
definition. All other nodes are converted to XML el-
ements or text nodes in the order they appear in the
d2d input document.
2.5 Error Messages after XHTML
Transformation
The result of translating an erroneous input is a tree
containing error nodes, i.e. missing and/or skipped
nodes. When shipping out, these are translated into
XML elements with dedicated tags from a reserved
name space.
The d2d library of pre-defined general-purpose
document types comes with a collection of translation
rules from XML into various back-ends (currently:
XHTML 1.0 and partly L
A
T
E
X), realized as XSLT 1.0
templates. In this context also the error meta-elements
are translated, i.e. “rendered”. An eye-catching style
has been chosen for them, similar to the “raspberry
red” for parenthesis mismatch used in XEmacs. Fig-
ure 6 demonstrates all these formats for a fragment
containing correct and erroneous input.
3 FUTURE WORK
The treatment of errors is, of course, heuristic. In our
practical experience, In 80 percent of all cases a sen-
sible continuation is found. But of course variants and
extensions are possible:
1. In the current implementation, after a spurious tag
has been skipped, all subsequent character data is
also discarded. The alternative is indicated in the
KEOD 2011 - International Conference on Knowledge Engineering and Ontology Development
454
Document Type Definition, in .ddf format:
module structure
...
tags h1 = title, label?, (p | px | P | DOMAIN_SPECIFIC_VERTICAL_ELEMENTS)*, h2*
tags h2 = title, label?, (p | px | P | DOMAIN_SPECIFIC_VERTICAL_ELEMENTS)*, h3*
tags title = (#chars | DOMAIN_SPECIFIC_HORIZONTAL_ELEMENTS)*
tags DOMAIN_SPECIFIC_VERTICAL_ELEMENTS,
DOMAIN_SPECIFIC_HORIZONTAL_ELEMENTS = #generic
...
end module
Erroneous Input:
#p In the last years, #mt have been successfully employed in very different
medium-scale industrial, private and administrative applications.
// =========================================
#h1 #titel Components of #mt
// =========================================
#p
The characters of the components of #mt range from small utility libraries,
which can be used ubiquituously, upto large source code generating systems.
#p
XML Parsing Result:
<p>In the last years, <metatools/> have been successfully employed in very different
medium-scale industrial, private and administrative applications.</p>
<h1>
<d2d:parsingError kind="open" location="file.d2d:46:0->56:11" tag="titel"/>
<d2d:skipped>Components of #</d2d:skipped>
</d2d:parsingError>
<d2d:parsingError d2d:kind=’open’ location=’..’ tag=’mt’>
<d2d:skipped>// =========================================#</d2d:skipped>
</d2d:parsingError>
<d2d:parsingError d2d:kind=’open’ location=’..’ tag=’p’>
<d2d:expected>(title)</d2d:expected>
</d2d:parsingError>
<p>The characters of the components of <mt /> range from small utility libraries,
which can be used ubiquituously, upto large source code generating systems.
</p>
XHTML Rendering:
Figure 6: Error Messaging in the XHTML Back-End.
comment to the skip function in Figure 5.
2. The current version of the algorithm only looks
“forward” and assumes that required elements
have been omitted as a whole. What the current
algorithm does not do is to look “into depth” and
test whether simply an opening tag has been for-
gotten. But this analysis is not trivial, because
the LL(1) discipline no longer applies, and more
look-ahead or back-tracking is necessary.
3. The point when the skip transformation occurs
is exactly the place where fuzzy matching of tag
identifiers could be useful for detecting small ty-
d2d — A ROBUST FRONT-END FOR PROTOTYPING, AUTHORING AND MAINTAINING XML ENCODED
DOCUMENTS BY DOMAIN EXPERTS
455
pos.
4. The source text position of an error is included in
the generated error message in a standard way. It
also could be included in the XHTML rendering,
for instance as a “tool-tip text”.
5. In certain production contexts it could be useful
to generate an extended version of the original
d2d source document which is enriched with eye-
catching comments for skipped input as well as
for missing elements. The latter could include the
synthesized regular expression which describes
the missing contents, or even hyperlinks to the on-
line documentation, as a diagnostic aid.
4 RELATED WORK
W.r.t. the declaration of content models, there are
strong similarities with relax-ng (Clark and Murata,
2008). But in detail there are significant differences:
their &”-operator means interleaving, ours permuta-
tion; the mechanism for parameterization is different,
and, last but not least, relax-ng is restricted to verifi-
cation of existing documents and does not deal with
human-friendly notation at all.
In the field of document construction tools we
have found hardly any similar approach to d2d. Some
similarities can be found to M4 (m4, 2000). But these
are restricted to the simplicity of the textual represen-
tation in general, and rather different to our syntax
in detail. M4 has, e.g., also a single-letter qualifier
for macro names: It uses the open parenthesis as suf-
fix, while we use a user-defined character as prefix.
It would be a very different, but also very interest-
ing project, trying to use M4 directly for authoring
XML document generation, but this has not been un-
dertaken as far as we know.
The principles of the “\verb k ... k” construct and
of the \begin{verbatim} construct are taken di-
rectly from L
A
T
E
X (Lamport, 1986). Since grammar
and semantics are totally different from our approach,
this is currently the only similarity. We are think-
ing about implementing a kind of “macro expansion”
mechanism, allowing the user to defined small ad-hoc
convenience abbreviations. In this context, T
E
X could
perhaps again serve as a model.
A similar degree of neighborhood can be seen to
Lout, (Kingston, 1992), (Kingston, 2000), but again
only in the syntax, employing a single active char-
acter, here the @”. Lout is intended as a replace-
ment for L
A
T
E
X, i.e. it ultimately acts as a type-setting
system for directly generating postscript documents.
A semantic text structure can be incorporated by the
macro definition mechanism, but is not intended to be
exported for knowledge representation.
REFERENCES
Aho, A., Sethi, R., and Ullman, J. (1986). Compilers: Prin-
ciples, Techniques, and Tools. Pearson Education.
Bray, T., Paoli, J., Sperberg-McQueen, C., Maler, E.,
Yergeau, F., and Cowan, J. (2006). Extensible
Markup Language (XML) 1.1 (Second Edition). W3C,
http://www.w3.org/TR/2006/REC-xml11-20060816/.
Clark and Murata (2008). Document Schema Def-
inition Language (DSDL) Part 2: Regular-
grammar-based validation RELAX NG. ISO/IEC,
http://standards.iso.org/ittf/PubliclyAvailableStandard
s/c052348 ISO IEC 19757 2 2008(E).zip.
Kingston, J. H. (1992). The design and implementation of
the lout document formatting language. Software—
Practice & Experience, 23 (9).
Kingston, J. H. (2000). The Lout Homepage.
url://savannah.nongnu.org/projects/lout.
Lamport, L. (1986). LaTeX User’s Guide and Document
Reference Manual. Addison-Wesley Publishing Com-
pany, Reading, Massachusetts.
Lepper, M., Tranc
´
on y Widemann, B., and Wieland, J.
(2001). Minimze mark-up ! Natural writing should
guide the design of textual modeling frontends. In
Conceptual Modeling ER2001, volume 2224 of
LNCS. Springer.
m4 (2000). m4 Manual. Free Software Foundation,
http://www.seindal.dk/rene/gnu/man/.
Tranc
´
on y Widemann, B. and Lepper, M. (2010).
The BandM Meta-Tools User Documentation.
http://bandm.eu/metatools/docs/usage/index.html.
KEOD 2011 - International Conference on Knowledge Engineering and Ontology Development
456