Maia: A Language for Mandatory Integrity Controls of Structured Data

Wassnaa Al-Mawee, Paul J. Bonamy, Steve Carr and Jean Mayo

Department of Computer Science, Western Michigan University, 1903 W. Michigan Ave.,

Kalamazoo, MI 49008-5466, U.S.A.

Department of Computer Science, Washington State University, 14204 NE Salmon Creek Ave.,

Vancouver, WA 98686, U.S.A.

Department of Computer Science, Michigan Technological University, 1400 Townsend Dr.,

Houghton, MI 49931-1292, U.S.A.

Keywords:

Security, Structured Data Integrity, Structural Operational Semantics.

Abstract:

The integrity of system files is necessary for the secure functioning of an operating system. Integrity is not

generally discussed in terms of complete computer systems. Instead, integrity issues tend to be either tightly

coupled to a particular domain (e.g. database constraints), or else so broad as to be useless except after the

fact (e.g. backups). Often, ﬁle integrity is determined by who modiﬁes the ﬁle or by a checksum. This paper

focuses on a general model of the internal integrity of a ﬁle. Even if a ﬁle is modiﬁed by a subject with trust

or has a valid checksum, it may not meet the speciﬁcation of a valid ﬁle. An example would be a password

ﬁle with no user assigned a user id of 0. In this paper, we describe a language called Maia that provides a

means to specify what the contents of a valid ﬁle should be. Maia can be used to specify the format and valid

properties of system conﬁguration ﬁles, PNG ﬁles and others. We give a structural operational semantics of

Maia and discuss an initial implementation within a mandatory integrity system.

1 INTRODUCTION

Integrity of data within computer systems, along with

the ongoing conﬁdentiality and availability of said

data, make up the three major components of com-

puter security. While both conﬁdentiality and avail-

ability are subjects of frequent and ongoing study, in-

tegrity is not generally discussed in terms of complete

computer systems. Instead, integrity issues tend to

be either tightly coupled to a particular domain (e.g.

database constraints), or else are so broad as to be use-

less except after the fact (e.g. backups). There are

few, if any, approaches to integrity which are capable

of actively protecting arbitrary structured data.

Our work seeks to provide robust tools to enable

general-purpose integrity protection. As part of this,

we present Maia, a language to describe integrity

constraints for arbitrary ﬁles (Bonamy et al., 2016).

In Maia, ﬁle veriﬁcation is accomplished over two

phases that correspond ﬁrst, to checking the ﬁle syn-

tax, and second, to checking its semantics. The user

provides an Extended Backus Naur Form (EBNF)

grammar to specify the ﬁle structure and extract its

syntactic elements into sets for processing. Then, the

sets are checked against integrity constraints in the

form of predicate logic.

In this paper, we give a Structural Operational Se-

mantics (SOS) for Maia (Plotkin, 1981) and report on

a preliminary implementation of a Maia compiler in

the context of a mandatory integrity system (Bonamy,

2016). The semantics give precise rules for giving

meaning to a Maia speciﬁcation. These rules show

there is no ambiguity in Maia, giving assurance that a

correct implementation of Maia ﬁle veriﬁers is possi-

ble. This allows one to implement a compiler or in-

terpreter and use Maia to specify integrity constraints

within an integrity system.

This paper is structured as follows. First, we give

an overview of related work on ﬁle integrity. Then, we

give an overview of Structural Operational Semantics.

Next, we deﬁne Maia and give its SOS. Finally, we

give a brief report on a preliminary implementation

and present our conclusions.

2 RELATED WORK

Efforts have been made in the past to implement in-

tegrity systems using existing access control mech-

anisms. This includes an approximation of Clark-

Wilson using Unix access controls (Polk, 1993), and

Al-Mawee, W., Bonamy, P., Carr, S. and Mayo, J.

Maia: A Language for Mandatory Integrity Controls of Structured Data.

DOI: 10.5220/0007344802570265

In Proceedings of the 5th International Conference on Information Systems Security and Privacy (ICISSP 2019), pages 257-265

ISBN: 978-989-758-359-9


approaches relying on the fine-grained customizability of DTE (Ji et al., 2006). Access control can readily

limit who may modify information, and may also be

able to enforce restrictions on which processes can

cause the changes. This provides excellent origin in-

tegrity, by restricting the source of changes. However,

pure access control systems cannot directly address

the problem of human error. Simply limiting who

may modify information does not prevent erroneous

edits. The only way to protect against such mod-

iﬁcations would be to rely on purpose-built editors

which will always make changes correctly. Thus, ac-

cess control is not sufﬁcient to address data integrity

without infallible users or special software.

Many tools exist to verify particular ﬁle formats.

XML, which is widely used online and for stor-

ing conﬁguration data, has several different veriﬁer

systems: DTD (W3C, 2008), XML Schema (W3C,

2012a) (W3C, 2012b), and RELAX NG (van der

Vlist, 2003). Tools also exist for verifying HTML,

and many reference implementations for image for-

mats include sanity checking of their input. While

these are powerful tools for protecting particular ﬁle

types, they cannot be generalized to protecting other

ﬁle formats. Instead, we need an approach that will

allow us to verify a variety of ﬁle types.

Parser generators are commonly used in devel-

oping new programming languages, and can be ap-

plied to the problem of creating veriﬁers. Lex (Lesk

and Schmidt, 1975) and Yacc (Johnson, 1975), and

their successors Flex and Bison generate robust, fast

parsers which can be embedded in C or C++ pro-

grams. ANTLR (Parr, 2015) serves a similar pur-

pose, with a focus on emitting Java rather than C code.

These tools are often sufﬁcient to produce syntax

checkers on their own, but creating semantic checks

requires detailed knowledge of the underlying parsing

technology. The ties to programming language creation that make these parser generators fast can also impact the set of languages they parse correctly. It

is possible to design a programming language around

the restrictions of one’s chosen parser generator, but

this is harder when the format to be parsed already

exists. PNG images (ISO, 2004), for example, make

use of chunk length speciﬁers that introduce context

sensitivities which are difﬁcult to handle in a normal

parser generator. Some tools, like YAKKER (Jim

et al., 2010), are able to cope with limited context

sensitivity, but still require programmer assistance to

perform semantic checks.

Data description languages (Fisher et al.,

2006) (Fisher et al., 2010) are designed to provide

automated parsing for ad hoc data formats. Tools like

PADS (Fisher and Walker, 2011) give programmers

the ability to describe semi-structured ﬁle formats

so that their programs can more readily access the

contents of the ﬁle. While this approach signiﬁcantly

simpliﬁes handling formats which were not designed

with parsing in mind, it still requires the intervention

of a programmer to describe the format in question

and then perform validity checks.

Maia improves on these tools in two important

ways: Maia can be used to describe any ﬁle with a

context free structure, and can handle certain types of

context sensitivity. Additionally, Maia speciﬁcations

describe valid ﬁles, not how to validate ﬁles, meaning

that no programming is required to generate a veri-

ﬁer. As we will demonstrate, tools can convert Maia

speciﬁcations into fully functional veriﬁer programs.

3 BACKGROUND

Structural Operational Semantics (SOS), introduced by G. D. Plotkin (Plotkin, 1981), provides a framework for describing the operational behavior of programming languages. The basic idea behind SOS is to define the behavior of a program or a system in mathematical terms, in a form that supports understanding and reasoning about the program under consideration. SOS has been successfully applied as a formal tool to give usable semantic descriptions for real-life programming languages, including Java. SOS is a direct approach that provides comprehensive definitions using very simple formal mathematics. Moreover, SOS is preferred over methods based upon denotational semantics in the static analysis of programs and in proving compiler correctness.

As described by Prasad and Arun-Kumar (Prasad and Arun-Kumar, 2003), SOS defines the semantics of a programming language from the syntax by applying the correct sequence of inference rules. Each rule has the form

\[
\frac{P_1, \ldots, P_n}{C}\ \ \text{side conditions}, \tag{1}
\]

where $P_1, \ldots, P_n$ represent judgments (premises or assumptions), $C$ is a single judgment or conclusion, and the side conditions express the constraints of the rule. The inference rule states that if all of the premises are true, then the conclusion is true.

We present the SOS of Maia using big-step struc-

tural semantics that justiﬁes a complete execution se-

quence using a tree-structured proof. Any semantics


of a programming language involves auxiliary entities or bindings such as environments, stores, etc. We present the SOS of Maia with respect to a finite domain function called an environment, $\gamma$, that maps a set of variables, $X$, to their computed values, $V$. The big-step transition relation $\Rightarrow$ is defined inductively as the smallest relation closed under the inference rules given a Maia rules specification. The SOS of a Maia rules specification has the form $\gamma \vdash R \Rightarrow v$, which is read “given an environment $\gamma$, the syntax rule $R$ evaluates to a value $v$”. This relation is understood as a transition that leaves $\gamma$ unchanged. The rules can be expressed as a proof tree of why $R$ can evaluate to a value, where the goal judgment $R \Rightarrow v$ is at the root, the internal nodes represent the rule instances with a branch for each antecedent, and the leaves are axiom instances.

4 MAIA

In order to ensure that data remains valid we must

ﬁrst validate the original ﬁle. There are a variety of

special-purpose veriﬁers tied to particular use cases

and ﬁle formats, but no general-purpose systems for

verifying arbitrary ﬁle types. We solve this prob-

lem with Maia, a single-assignment speciﬁcation lan-

guage which can describe the structure of any context

free language as well as semantic rules for the con-

tents of a ﬁle.

4.1 Design Objectives

We have two primary objectives for the design of

Maia. First, it must be able to protect a wide vari-

ety of existing ﬁles, which means we cannot restrict

ourselves to only supporting certain ﬁle structures.

Second, Maia speciﬁcations should be descriptions of

valid ﬁles, rather than procedures for verifying ﬁles.

There are a huge number of conﬁguration ﬁles in

the average Linux system, to say nothing of all of the

user-side ﬁle formats that may be on a system. While

there are some repeated structures within these ﬁles,

any system which focused on only one ﬁle structure

would necessarily be unable to verify the remaining

formats. With that in mind, we have designed Maia

to be ﬂexible with regard to the type of ﬁles it can

describe. We also provide a mechanism to explicitly

make use of external veriﬁers if necessary.

We have also designed Maia to provide

implementation-independent descriptions of valid

ﬁles, rather than procedures for verifying ﬁles. Any

programming language could be used to write a

veriﬁer, but determining what constitutes a valid ﬁle

would require not only reasoning about the rules

themselves but also how they are implemented. By

using a description system we can separate the mean-

ing of rules from their implementation, which makes

it easier to reason about them. It also becomes much

easier to port speciﬁcations to different platforms, as

all that is required is to recreate the veriﬁer generator,

not the speciﬁcations themselves.

4.2 Model Overview

Within Maia, we model the ﬁle veriﬁcation process as

two phases, corresponding to checking the ﬁle’s syn-

tax, followed by verifying the semantics. During the

ﬁrst phase, the ﬁle is parsed to check its structure and

extract syntactic elements for processing. The sec-

ond phase can then check the data in the syntactic ele-

ments without being unduly concerned about the ﬁle’s

structure. Using this (logically) two-phase system al-

lows us to both mirror the way a traditional veriﬁer

would work and employ familiar constructs within the

language itself.

The syntax checking component of Maia is de-

signed to be familiar for anyone who has written a

parser or perused the speciﬁcation for a ﬁle or data

format. The user provides an Extended Backus Naur

Form (Backus, 1959)(Wirth, 1977) grammar which

can then be used to break the ﬁle into pieces, veri-

fying its structure and extracting meaningful compo-

nents. We also provide some limited context sensi-

tivity to allow syntax speciﬁcations to deal with ﬁles

which contain length speciﬁcations.
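To illustrate why length specifiers call for some context sensitivity, the following Python sketch walks a sequence of length-prefixed chunks laid out like PNG chunks (a 4-byte big-endian length, a 4-byte type, the data, then a 4-byte CRC). This is not Maia's mechanism, only an illustration of the problem it addresses; `split_chunks` and `make_chunk` are hypothetical helpers and CRC verification is deliberately omitted.

```python
import struct


def split_chunks(data: bytes):
    """Split a byte string into (type, payload) pairs.

    Each chunk is: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC.
    The payload size is known only after the length field has been read,
    which is what a purely context-free grammar cannot express.
    """
    chunks = []
    pos = 0
    while pos < len(data):
        if pos + 8 > len(data):
            raise ValueError("truncated chunk header")
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        ctype = data[pos + 4:pos + 8]
        end = pos + 8 + length + 4  # header + payload + CRC
        if end > len(data):
            raise ValueError("chunk length exceeds input")
        chunks.append((ctype, data[pos + 8:pos + 8 + length]))
        pos = end
    return chunks


def make_chunk(ctype: bytes, payload: bytes) -> bytes:
    # The CRC field is zero-filled; this sketch does not verify it.
    return struct.pack(">I", len(payload)) + ctype + payload + b"\x00" * 4


sample = make_chunk(b"tEXt", b"hello") + make_chunk(b"IEND", b"")
parsed = split_chunks(sample)
```

A declared length that runs past the end of the input is rejected, which is the kind of structural check a length-aware syntax rule would need to perform.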

The semantic portion of Maia makes use of set

theory and predicate calculus to express constraints.

The sets used in this phase are automatically con-

structed during syntax checking by grouping all oc-

currences of the same nonterminal (e.g. user names

in the passwd ﬁle) into a set. It is then possible to ex-

press constraints like “user names must be unique” or

“there must be a user named root” without needing to

explicitly iterate over the data. This approach bears

some resemblance to SETL (Dewar, 1979), though

that family of languages is procedural rather than de-

scriptive. In addition to normal set operations, we also

provide a notion of ordering within sets to make it

possible to express rules “root must be the ﬁrst entry”

or “users should be ordered by UID”.
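The constraints quoted above can be read as ordinary predicates over ordered collections. The Python sketch below is only an analogy, not Maia syntax or its implementation: the lists `names` and `uids` stand in for the sets that the syntax phase would build, and all identifiers are assumptions chosen for illustration.

```python
# Ordered lists standing in for the sets built during syntax checking.
names = ["root", "alice", "bob"]
uids = [0, 1000, 1001]


def is_unique(items):
    """True when no value occurs more than once."""
    return len(set(items)) == len(items)


# "user names must be unique"
names_unique = is_unique(names)

# "there must be a user named root"
has_root = any(n == "root" for n in names)

# "root must be the first entry" (relies on file order being preserved)
root_first = bool(names) and names[0] == "root"

# "users should be ordered by UID"
ordered_by_uid = all(a <= b for a, b in zip(uids, uids[1:]))
```

None of these predicates iterates explicitly over file structure; they only consult the extracted collections, which mirrors the separation between the two phases.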

The next sections comprise a formal speciﬁcation

of the semantics of Maia. We have intentionally de-

signed the syntax and semantic speciﬁcation compo-

nents of the language to be different from one another.

This is reﬂective of the different underlying models

for syntax and semantics, and has the advantage of

making the type of a rule (syntax or semantic) obvious


with cursory inspection. The speciﬁcation systems do

occasionally share constructs or features, and we note

those speciﬁcally. All other features are speciﬁc to ei-

ther syntax or semantic speciﬁcation and are not valid

in the other context.

4.3 SOS for Maia

A Maia specification has the following basic structure:

\[
M \to I^{*}\ (X \mid C \mid S \mid T)^{*} \tag{2}
\]

where $I$ represents a file inclusion directive, $X$ is an EBNF specification of the input file syntax, $C$ represents a set construction operation, $S$ represents a semantic rule and $T$ represents a template. In the rest of this section, we focus on the semantics for $X$, $C$ and $S$, since they are the critical elements of Maia. Maia specifications also involve rules, such as file inclusion and template definition, for which there is no semantics to present. For file inclusion, we define its functionality. For templates, we refer the reader to (Bonamy, 2016).

4.3.1 File Inclusion

Inclusion brings an existing speciﬁcation into the cur-

rent speciﬁcation via the using keyword. It provides

both reusable deﬁnitions and the reﬁnement of the ex-

isting speciﬁcations. Therefore, when a path is spec-

iﬁed, and a ﬁle is included, all its syntactic speciﬁca-

tions will be available, and all of its semantic rules are

enforced. In Maia, inclusion has the form:

\[
\begin{array}{rcl}
I & \to & \texttt{using "sysPath" ;} \\
  & \mid & \texttt{using "sysPath" on "filePath" ;}
\end{array} \tag{3}
\]

where sysPath is the path to the specification to be imported and filePath is the path to the file that the imported specification is applied to. The path can be relative or absolute. Normal Maia specifications produce verifiers which process whatever input they are given, but this is insufficient in the event that multiple files must be parsed together. Maia supports multi-file verifiers by adding extensions via the on keyword as follows:

using "sysPath" on "filePath" ;

In the example below, the Maia speciﬁcation

groupfile.maia is linked to the ﬁle /etc/group:

using "groupfile.maia" on "/etc/group" ;

Thus, as part of the current veriﬁcation process,

the ﬁle /etc/group must also be veriﬁed using the

groupfile.maia speciﬁcation.

4.3.2 Syntax Rules

Maia syntax rules are an EBNF speciﬁcation of input

ﬁle syntax. A Maia translator can emit a speciﬁca-

tion in any parser generator system to read a ﬁle. The

names that appear on the left hand side of a syntax

rule represent sets that contain the strings that match

that rule in the input ﬁle. Thus, syntax rules deﬁne

variables that are used later in constructing sets and

in verifying the properties of the constructed sets. In

Maia, items in a set are considered to be ordered based

on their original order in the ﬁle being veriﬁed.

Let $G = (V, \Sigma, P, S)$ be a context-free grammar where $V$ is a set of variables or non-terminals, $\Sigma$ is the alphabet or set of terminals, $P$ is a set of rules and $S$ is a distinguished element of $V$ called the start symbol (Sudkamp, 2006). Let $n$ be a node in the derivation tree, $T$, for the derivation $S \stackrel{*}{\Rightarrow} w$, where $w \in \Sigma^{*}$, and let $n_1, \ldots, n_k$ be the children of $n$. We denote the string derived from $n$ as $\delta(n)$. $\delta(n)$ is defined recursively as

1. If $n$ is a leaf node, then $\delta(n) = \mathit{label}(n)$

2. If $n$ is an interior node, then $\delta(n) = \delta(n_1) \cdot \ldots \cdot \delta(n_k)$

Let $A, B \in V$. We denote the set of strings derived from $A$ as $\Delta(A)$, where $\Delta(A) = \{\delta(n) \mid n \in T \wedge \mathit{label}(n) = A\}$. In addition, we define $\Delta(A.B)$ as

\[
\Delta(A.B) = \{\, \delta(n) \mid n \in T \wedge \mathit{label}(n) = B \wedge \mathit{label}(\mathit{parent}(n)) = A \,\} \tag{4}
\]

This second form is used when referring to strings derived in the context of a specific rule.
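The definitions of the derived string of a node, the set of strings derived from a nonterminal, and its context-qualified form can be sketched directly over a small derivation tree. The Python below is a hedged illustration rather than the authors' implementation: the `Node` class, the function names `delta`, `Delta` and `Delta_in`, and the toy `record` grammar are all assumptions, and results are kept as lists so that file order is preserved. It also assumes terminal text never coincides with a nonterminal name.

```python
class Node:
    """A derivation-tree node; leaves carry terminal text in `label`."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def delta(n):
    """String derived from node n: its label for a leaf, else the
    concatenation of the strings derived from its children."""
    if not n.children:
        return n.label
    return "".join(delta(c) for c in n.children)


def nodes(root):
    """Yield every node of the tree in left-to-right (file) order."""
    yield root
    for c in root.children:
        yield from nodes(c)


def Delta(root, a):
    """Strings derived from every node labelled a."""
    return [delta(n) for n in nodes(root) if n.label == a]


def Delta_in(root, a, b):
    """Strings derived from nodes labelled b whose parent is labelled a."""
    return [delta(n) for n in nodes(root)
            if n.label == b and n.parent is not None and n.parent.label == a]


def record(name, uid):
    # Toy rule: record = name ":" uid
    return Node("record", [Node("name", [Node(name)]),
                           Node(":"),
                           Node("uid", [Node(uid)])])


tree = Node("file", [record("alice", "1000"), Node("\n"), record("bob", "200")])
```

Here `Delta(tree, "name")` collects every name in the file, while `Delta_in(tree, "record", "uid")` collects only the UIDs that occur directly inside a record.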

A syntax rule, $X$, has the form $N = xEy$, where $N, E \in V$ and $x, y \in \Sigma^{*} \cup V$. The SOS of a syntax rule expressed in the context of a semantic rule, $S$, in Maia is

\[
\frac{\gamma \vdash N = xEy \Rightarrow (\Delta(N), \Delta(N.E)) \qquad \gamma[N \mapsto \Delta(N),\ N.E \mapsto \Delta(N.E)] \vdash S \Rightarrow v}{\gamma \vdash N = xEy\ \ S \Rightarrow v} \tag{5}
\]

where $v \in \mathbb{B}$ and $\mathbb{B}$ is the set of Boolean values, indicating that a file is valid (true) or invalid (false). Essentially, syntax rules create a new mapping from the name appearing on the left hand side of an EBNF rule to the set of strings that are matched in an input file.

For example, we can state the rule passwdRecord

to specify a record in /etc/passwd as:

passwdRecord = name ":" password ":" uid ":" gid ;


If this rule is applied to the input:

alice: 19fd01b2307d497fb174decd8bc9c121:1000:1

bob: 0f68eb4c87c99c563e168cdc2cd92336:200:2

the constructed sets from this input are:

passwdRecord = {{alice,19fd..., 1000, 1 },

{bob,0f68..., 200, 2 }}

passwdRecord.name = {alice, bob}

passwdRecord.password = {19fd..., 0f68...}

passwdRecord.uid = {1000,200}

passwdRecord.gid= {1,2}
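As a rough illustration of how such sets could be populated, the Python sketch below splits each input line on colons and appends every field to a list named after the corresponding nonterminal. It is an assumption-laden stand-in for a generated parser: `build_sets` is a hypothetical helper, whitespace around fields is stripped, and lists are used so that the original file order is kept.

```python
def build_sets(text):
    """Group the fields of four-field records into per-nonterminal lists."""
    fields = ("name", "password", "uid", "gid")
    sets = {"passwdRecord": []}
    for f in fields:
        sets["passwdRecord." + f] = []
    for line in text.splitlines():
        if not line:
            continue
        parts = [p.strip() for p in line.split(":")]
        if len(parts) != len(fields):
            raise ValueError(f"malformed record: {line!r}")
        # The compound set holds whole records; simple sets hold one field each.
        sets["passwdRecord"].append(tuple(parts))
        for f, value in zip(fields, parts):
            sets["passwdRecord." + f].append(value)
    return sets


sample = (
    "alice:19fd01b2307d497fb174decd8bc9c121:1000:1\n"
    "bob:0f68eb4c87c99c563e168cdc2cd92336:200:2\n"
)
sets = build_sets(sample)
```

The values remain strings here; any numeric interpretation of `uid` or `gid` would be a later step.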

Sets in Maia syntax rules are constructed automatically. Each set is converted into a simple or a compound set containing the input chunks that matched the parser rule. The Maia syntax phase constructs simple sets by grouping all the occurrences of the same nonterminal together. The scope of the definition of $S$ is limited to the occurrences of the same variable in the expression, as follows. Suppose that $S$ occurs $n$ times, as $\{S_1, \ldots, S_n\}$. Then we can define the simple set $S$ as follows:

\[
\mathbf{let}\ S =_{def} \{S_1, S_2, \ldots, S_n\} \tag{6}
\]

The SOS of a simple set definition $S$ is:

\[
\frac{(\gamma \vdash S_1,\ \gamma \vdash S_2,\ \ldots,\ \gamma \vdash S_n) \Rightarrow \{v_1, v_2, \ldots, v_n\} = v}{\gamma \vdash \mathbf{let}\ S =_{def} \{S_1, S_2, \ldots, S_n\} \Rightarrow \gamma[S \mapsto v]} \tag{7}
\]

The scoping definition of simple set $S$, via $\Rightarrow$, returns a new environment with the additional mapping.

Alternatively, Maia constructs compound sets when nonterminals contain at least two other nonterminals. The scope of the definition of $S$ is limited to the occurrences of the different variables in the expression, as follows. Suppose that $S$ is a compound set that has $n$ simple sets $\{S_1, \ldots, S_n\}$. Each simple set occurs $n$ times, giving $\{S_{1,1}, \ldots, S_{1,n}, S_{2,1}, \ldots, S_{2,n}, \ldots, S_{n,1}, \ldots, S_{n,n}\}$. Then we can define the compound set $S$ as follows:

\[
\mathbf{let}\ S =_{def} \{\{S_{1,1}, S_{2,1}, \ldots, S_{n,1}\}, \{S_{1,2}, S_{2,2}, \ldots, S_{n,2}\}, \ldots, \{S_{1,n}, S_{2,n}, \ldots, S_{n,n}\}\} \tag{8}
\]

The SOS of compound set $S$ is:

\[
\frac{
\left\{\begin{array}{l}
\{\gamma \vdash S_{1,1},\ \gamma \vdash S_{2,1},\ \ldots,\ \gamma \vdash S_{n,1}\}, \\
\{\gamma \vdash S_{1,2},\ \gamma \vdash S_{2,2},\ \ldots,\ \gamma \vdash S_{n,2}\},\ \ldots, \\
\{\gamma \vdash S_{1,n},\ \gamma \vdash S_{2,n},\ \ldots,\ \gamma \vdash S_{n,n}\}
\end{array}\right\}
\Rightarrow
\left\{\begin{array}{l}
\{v_{1,1}, v_{2,1}, \ldots, v_{n,1}\}, \\
\{v_{1,2}, v_{2,2}, \ldots, v_{n,2}\},\ \ldots, \\
\{v_{1,n}, v_{2,n}, \ldots, v_{n,n}\}
\end{array}\right\} = v
}{
\gamma \vdash \mathbf{let}\ S =_{def}
\left\{\begin{array}{l}
\{S_{1,1}, S_{2,1}, \ldots, S_{n,1}\}, \\
\{S_{1,2}, S_{2,2}, \ldots, S_{n,2}\},\ \ldots, \\
\{S_{1,n}, S_{2,n}, \ldots, S_{n,n}\}
\end{array}\right\}
\Rightarrow \gamma[S \mapsto v]
} \tag{9}
\]

4.3.3 Set Construction

Set construction in Maia may be done explicitly. El-

ements are speciﬁed as a comma-separated list of ei-

ther strings or numbers. Constructed sets are available

to semantics rules. As in the case of syntax rules, set

construction rules create a mapping from the set name

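to the elements of the set. In Maia syntax, explicitly constructed sets have the form:

\[
\begin{array}{rcl}
C & \to & \mathit{Var} = \texttt{<}\ \mathit{Str}_1, \ldots, \mathit{Str}_n\ \texttt{>}\ \texttt{;} \\
  & \mid & \mathit{Var} = \texttt{<}\ \mathit{Nval}_1, \ldots, \mathit{Nval}_n\ \texttt{>}\ \texttt{;}
\end{array} \tag{10}
\]

where $\mathit{Var}$ is a variable name, $\mathit{Str}_i$ is a string literal and $\mathit{Nval}_i$ is a numeric value. An example of explicit set construction is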

classification = < "TS", "S", "C", "UC" > ;

version = < 1, 2, 3.0, 3.1, 3.2> ;

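The SOS of the explicit construction of a set of strings is:

\[
\frac{
\gamma \vdash \mathit{Var} = \texttt{<}\ \mathit{Str}_1, \ldots, \mathit{Str}_n\ \texttt{>} \Rightarrow \{\mathit{Str}_1, \ldots, \mathit{Str}_n\}
\qquad
\gamma[\mathit{Var} \mapsto \{\mathit{Str}_1, \ldots, \mathit{Str}_n\}] \vdash S \Rightarrow v
}{
\gamma \vdash \mathit{Var} = \texttt{<}\ \mathit{Str}_1, \ldots, \mathit{Str}_n\ \texttt{>}\ \ S \Rightarrow v
} \tag{11}
\]

This rule indicates that the set name $\mathit{Var}$ maps to the literal set elements specified when evaluating a set of semantic rules $S$. The SOS for sets of numeric values is similar.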

Maia also provides a facility to create a new set

by performing a per-element join operation on two

or more existing sets. The sets to be joined are re-

quired to contain the same number of elements of the

same type (string or numeric value). Attempting to

join mismatched sets is considered an error.

To join sets, we reuse the angle brackets to indi-

cate set construction, though in this case we specify

how to construct an element rather than all elements

in the set. For string-based ﬁelds, the connector is a

period to indicate concatenation, in the style of Perl’s

dot operator. For example,

user = <'a', 'b', 'c'>

domain = <'D1', 'D2', 'D3'>

userDomain = < user . domain >

results in the set userDomain mapping to the value {'aD1', 'bD2', 'cD3'}.

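The SOS of set join for strings is

\[
\frac{
\begin{array}{c}
\gamma \vdash \mathit{Var} = \texttt{<}\ A_1 \,.\, A_2 \,.\, \ldots \,.\, A_n\ \texttt{>} \Rightarrow \{a_{1,1} \cdot \ldots \cdot a_{n,1},\ \ldots,\ a_{1,m} \cdot \ldots \cdot a_{n,m}\} \\
\gamma[\mathit{Var} \mapsto \{a_{1,1} \cdot \ldots \cdot a_{n,1},\ \ldots,\ a_{1,m} \cdot \ldots \cdot a_{n,m}\}] \vdash S \Rightarrow v
\end{array}
}{
\gamma \vdash \mathit{Var} = \texttt{<}\ A_1 \,.\, A_2 \,.\, \ldots \,.\, A_n\ \texttt{>}\ \ S \Rightarrow v
} \tag{12}
\]

where $A_i = \{a_{i,1}, a_{i,2}, \ldots, a_{i,m}\}$. The SOS of set join for numeric sets is similar.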
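The per-element join for strings can be sketched in a few lines of Python. This is an illustration under assumptions, not Maia's implementation: `join_strings` is a hypothetical name, ordered lists stand in for Maia sets, and mismatched inputs raise exceptions to mirror the statement that joining mismatched sets is an error.

```python
def join_strings(*sets):
    """Per-element concatenation of equally sized ordered sets of strings.

    Element j of the result is the concatenation of element j of every input.
    """
    if len(sets) < 2:
        raise ValueError("a join needs at least two sets")
    if len({len(s) for s in sets}) != 1:
        raise ValueError("sets must contain the same number of elements")
    if not all(isinstance(x, str) for s in sets for x in s):
        raise TypeError("all elements must be strings")
    return ["".join(column) for column in zip(*sets)]


user = ["a", "b", "c"]
domain = ["D1", "D2", "D3"]
user_domain = join_strings(user, domain)
```

With these inputs the result pairs the first user with the first domain, the second with the second, and so on, rather than forming a cross product.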

A Maia speciﬁcation is executed in two passes.

The ﬁrst pass consists of the rules in Sections 4.3.2

and 4.3.3 to create the environment in which the se-

mantic rules given in the next section are evaluated.

The second pass veriﬁes the constraints placed on the

ﬁle contents expressed by semantic rules.

4.3.4 Semantic Rules

Structurally, Maia semantic rules are a straightfor-

ward adaptation of predicate calculus. For example,

consider the rule “there must be at least one user with

a UID of 0” that may be placed on /etc/passwd.

Given the set uid, which contains the UIDs of all users

in /etc/passwd, we may express this formally as:

∃u ∈ uid : u == 0. The equivalent Maia is quite sim-

ilar:

exists u in uid : u == 0;

A Maia semantic rule has the following syntax:

\[
\begin{array}{rcl}
S & \to & E?\ \texttt{forevery}?\ \mathit{Var}_1\ \texttt{in}\ \mathit{Var}_2 : C\ ; \\
  & \mid & E?\ \texttt{exists}\ \mathit{Var}_1\ \texttt{in}\ \mathit{Var}_2 : C\ ; \\
E & \to & \texttt{(require)} \mid \texttt{(warn)} \mid \texttt{(info)}
\end{array} \tag{13}
\]

where $E$ is an enforcement level, $\mathit{Var}_1$ is the variable bound to each element, $\mathit{Var}_2$ is a set name and $C$ is a constraint on the set (see Equation 18). The possible enforcement levels are (require), which means the constraint is always checked and the input file is invalid if the constraint does not hold; (warn), which means the constraint is always checked and a warning is issued if the constraint does not hold; and (info), which means the constraint is only checked if requested and a warning message is given if the constraint does not hold. Below, we give the SOS for the (require) enforcement level, without loss of generality.

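\[
\frac{\gamma \vdash \mathit{Var}_2 \Rightarrow A \qquad \forall a \in A.\ \ \gamma[\mathit{Var}_1 \mapsto a] \vdash C \Rightarrow \mathit{true}}{\gamma \vdash \texttt{forevery}?\ \mathit{Var}_1\ \texttt{in}\ \mathit{Var}_2 : C\ ; \Rightarrow \mathit{true}} \tag{14}
\]

where $A$ is a set defined by a syntax rule or via set construction and the result of a rule is a value $v \in \mathbb{B}$, and

\[
\frac{\gamma \vdash \mathit{Var}_2 \Rightarrow A \qquad \exists a \in A.\ \ \gamma[\mathit{Var}_1 \mapsto a] \vdash C \Rightarrow \mathit{false}}{\gamma \vdash \texttt{forevery}?\ \mathit{Var}_1\ \texttt{in}\ \mathit{Var}_2 : C\ ; \Rightarrow \mathit{false}} \tag{15}
\]

These rules indicate that the constraint must hold on every element of the set $A$ in order for the file to be valid. Similarly, the SOS for an exists rule is

\[
\frac{\gamma \vdash \mathit{Var}_2 \Rightarrow A \qquad \exists a \in A.\ \ \gamma[\mathit{Var}_1 \mapsto a] \vdash C \Rightarrow \mathit{true}}{\gamma \vdash \texttt{exists}\ \mathit{Var}_1\ \texttt{in}\ \mathit{Var}_2 : C\ ; \Rightarrow \mathit{true}} \tag{16}
\]

and

\[
\frac{\gamma \vdash \mathit{Var}_2 \Rightarrow A \qquad \forall a \in A.\ \ \gamma[\mathit{Var}_1 \mapsto a] \vdash C \Rightarrow \mathit{false}}{\gamma \vdash \texttt{exists}\ \mathit{Var}_1\ \texttt{in}\ \mathit{Var}_2 : C\ ; \Rightarrow \mathit{false}} \tag{17}
\]

These rules indicate that the constraint must hold for at least one member of the set $A$ in order for the file to be valid.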
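The quantified rules can be mimicked by a small evaluator that extends an environment with one binding per element and evaluates the constraint under it. The Python sketch below is a hedged illustration only: the environment is a plain dictionary, constraints are Python callables taking that dictionary, and the names `forevery` and `exists` are reused purely for readability.

```python
def forevery(env, var, set_name, constraint):
    """True when the constraint holds with var bound to each element in turn."""
    return all(constraint({**env, var: a}) for a in env[set_name])


def exists(env, var, set_name, constraint):
    """True when the constraint holds for at least one binding of var."""
    return any(constraint({**env, var: a}) for a in env[set_name])


# Environment as produced by the first pass: set names mapped to elements.
gamma = {"uid": [0, 1000, 200]}

in_range = forevery(gamma, "u", "uid", lambda e: 0 <= e["u"] <= 32767)
has_zero = exists(gamma, "u", "uid", lambda e: e["u"] == 0)
```

Each call builds an extended copy of the environment, so `gamma` itself is left unchanged, in line with the reading of the transition relation as leaving the environment untouched. For an empty set, the universal rule holds vacuously and the existential rule fails.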

A constraint, $C$, in Maia may be a comparison ($\mathit{Cmpr}$), an expression in predicate logic, a logic constraint, a set membership test ($\mathit{Inc}$) or a black-box verifier ($\mathit{Blb}$). Syntactically, constraints are of the form

\[
C \to C\ \mathit{logic}\ C \mid \texttt{not}\ C \mid \texttt{(}\ C\ \texttt{)} \mid \mathit{Inc} \mid \mathit{Blb} \mid \mathit{Cmpr} \tag{18}
\]

$C$ provides one or more Boolean constraints that will be evaluated for each element in the specified set until the rule is satisfied. For example, consider a rule that applies to all elements of a set: UIDs must be in the range 0 to 32767. The corresponding Maia semantic rule is:

forEvery u in uid : u >=0 and u <= 32767 ;

Maia includes the standard logical operators and, or, and xor. It also provides implies and iff. Logical operators allow rules like:

forEvery p in passwdRecord:

p.name == "root" implies p.uid == 0 ;

This rule is applied to the set passwdRecord to ex-

press the constraint that the root user must have uid

equal to 0. The syntax p.name refers to the name ﬁeld

in every member of the set passwdRecord.
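For readers less used to implies and iff, the Python sketch below spells out the three less common operators and applies the rule above to sample records. The helper names and the sample data are assumptions made for illustration; Maia's own evaluation strategy is not shown.

```python
def implies(p, q):
    """Material implication: false only when p holds and q does not."""
    return (not p) or q


def iff(p, q):
    """Logical equivalence: both true or both false."""
    return p == q


def xor(p, q):
    """Exclusive or: exactly one of the two is true."""
    return p != q


records = [
    {"name": "root", "uid": 0},
    {"name": "alice", "uid": 1000},
]

# forEvery p in passwdRecord: p.name == "root" implies p.uid == 0 ;
root_has_uid_zero = all(
    implies(p["name"] == "root", p["uid"] == 0) for p in records
)
```

Records whose name is not root satisfy the implication regardless of their UID, which is why only the root entry is effectively constrained.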

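Set membership in Maia is specified with an in constraint. This constraint is true if and only if the element being tested is a member of the given set. The syntax of set membership semantic rules is:

\[
\begin{array}{rcl}
\mathit{Inc} & \to & \mathit{indexedName}\ \texttt{in}\ \mathit{setName} \\
 & \mid & \mathit{indexedName}\ \texttt{in}\ \texttt{<}\ \mathit{string}\ (\texttt{,}\ \mathit{string})^{*}\ \texttt{>} \\
 & \mid & \mathit{indexedName}\ \texttt{in}\ \texttt{<}\ \mathit{nVal}\ (\texttt{,}\ \mathit{nVal})^{*}\ \texttt{>}
\end{array} \tag{19}
\]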

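where $\mathit{nVal}$ in Maia defines numeric values and has the form:

\[
\mathit{nVal} \to \mathit{iVal} \mid \mathit{fVal} \tag{20}
\]

where $\mathit{iVal}$ is a decimal or hexadecimal integer value, and $\mathit{fVal}$ is a floating point value. The SOS of $\mathit{nVal}$ is:

\[
\frac{\gamma \vdash \mathit{iVal} \Rightarrow \mathit{iVal} \qquad \gamma \vdash \mathit{fVal} \Rightarrow \mathit{fVal}}{\gamma \vdash \mathit{nVal} \Rightarrow v} \tag{21}
\]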


where $v \in \{\mathit{iVal}, \mathit{fVal}\}$.

In indexedName, if an element has a numeric type, it can be compared to a numeric literal, used with a numeric operator, or concatenated to a numeric value. In Maia, indexedName has the form:

\[
\begin{array}{rcl}
\mathit{indexedName} & \to & \mathit{setName}\ (\texttt{[}\ \mathit{exp}\ \texttt{]})?\ (\texttt{.}\ \mathit{setName})? \\
 & \mid & \mathit{setName}
\end{array} \tag{22}
\]

For example, indexedName can access an element of a set by index, as in userDomain[i]. The SOS of indexedName is defined as

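\[
\gamma \vdash \mathit{setName} \Rightarrow \gamma(\mathit{setName}) \tag{23}
\]

\[
\gamma \vdash \mathit{setName}.\mathit{setName} \Rightarrow \gamma(\mathit{setName}.\mathit{setName}) \tag{24}
\]

\[
\frac{\gamma \vdash \mathit{exp} \Rightarrow v}{\gamma \vdash \mathit{setName}[\mathit{exp}] \Rightarrow \gamma(\mathit{setName})[v]} \tag{25}
\]

where $v \in \mathit{nVal}$, and $\mathit{setName}, \mathit{setName}.\mathit{setName} \in \mathit{dom}(\gamma)$.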

As an example of set membership in Maia, consider the set disallowedCyphers, which contains cyphers that are not permitted under local policy. The rule that no cypher in use may be disallowed can be stated as follows:

forEvery c in cypher:

not (c in disallowedCyphers) ;

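The SOS of set membership is:

\[
\frac{
\gamma \vdash
\left\{\begin{array}{l}
e_j \in (\mathit{indexedName})\ \texttt{in}\ (\mathit{setName}) \\
\mid\ e_j \in (\mathit{indexedName})\ \texttt{in}\ \texttt{<}\,\mathit{string}\,\texttt{>} \\
\mid\ e_j \in (\mathit{indexedName})\ \texttt{in}\ \texttt{<}\,\mathit{nVal}\,\texttt{>}
\end{array}\right\}
\Rightarrow v
}{
\gamma \vdash \mathit{if}\ []_{i=1}^{n}
\left\{\begin{array}{l}
e_i \in (\mathit{indexedName})\ \texttt{in}\ (\mathit{setName}) \\
\mid\ e_i \in (\mathit{indexedName})\ \texttt{in}\ \texttt{<}\,\mathit{string}\,\texttt{>} \\
\mid\ e_i \in (\mathit{indexedName})\ \texttt{in}\ \texttt{<}\,\mathit{nVal}\,\texttt{>}
\end{array}\right\}
\Rightarrow v
} \tag{26}
\]

where $i, j \in \{1, \ldots, n\}$, and $v \in \{\mathit{true}, \mathit{false}\}$.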
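A membership constraint amounts to testing each element against a named or literal set. The Python sketch below shows the disallowed-cypher rule in that style; it is an analogy rather than Maia code, and the cypher names and the helper `violations` are sample values and a hypothetical function chosen for illustration.

```python
# Sample data standing in for sets built from a configuration file.
cypher = ["aes256-ctr", "aes128-ctr"]
disallowedCyphers = ["3des-cbc", "arcfour"]

# forEvery c in cypher: not (c in disallowedCyphers) ;
policy_ok = all(c not in disallowedCyphers for c in cypher)


def violations(used, disallowed):
    """Return the elements of `used` that appear in `disallowed`, in order."""
    return [c for c in used if c in disallowed]


# Membership in a literal set, as in: c in < "3des-cbc", "arcfour" >
flagged = violations(["aes256-ctr", "arcfour"], ["3des-cbc", "arcfour"])
```

Reporting the offending elements, rather than a bare Boolean, is a convenience of this sketch and not something the semantics above requires.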

Black-box veriﬁers Blb are external procedures or

processes that can perform tasks not expressible in

Maia.

Blb(verifier, setName, ...)

In this rule, verifier is the name of a black-box ver-

iﬁer known to the system, and the setName is one or

more elements in the current context to pass to veri-

ﬁer. A black-box veriﬁer receives one or more values

and returns a single value.
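One way to picture black-box verifiers is as a registry of named external checks that the verifier dispatches to. The Python sketch below is an assumption-heavy illustration: the registry `BLACK_BOXES`, the dispatcher `blb`, and the two registered checks are hypothetical, and the result is coerced to a Boolean even though the text only says a single value is returned.

```python
import posixpath

# Hypothetical registry of black-box verifiers known to the system.
BLACK_BOXES = {
    "isAbsPath": posixpath.isabs,
    "sameLength": lambda a, b: len(a) == len(b),
}


def blb(verifier, *values):
    """Call the named black-box verifier on one or more values."""
    if verifier not in BLACK_BOXES:
        raise KeyError(f"unknown black-box verifier: {verifier}")
    if not values:
        raise ValueError("a black-box verifier needs at least one value")
    return bool(BLACK_BOXES[verifier](*values))


abs_ok = blb("isAbsPath", "/home/alice")
rel_ok = blb("isAbsPath", "home/alice")
```

Keeping the dispatch table separate from the rule evaluator means new external checks can be added without touching the evaluation logic.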

A constraint C

may also include comparisons and

arithmetic operators. These operations have straight-

forward semantics which we omit for brevity. For

a more thorough discussion of black boxes, compar-

isons and arithmetic see (Bonamy, 2016).

5 EXAMPLE MAIA SPECIFICATION

Below is an example Maia speciﬁcation for protecting

the integrity of /etc/passwd.

PasswdFile = (passwdRecord Newline)+ ;
passwdRecord = name ":" password ":" uid ":" gid ":"
               gecos ":" directory ":" shell ;
name = [a-zA-Z_][-a-zA-Z0-9_]{0,31} ;
password = "*" | "x" | CryptPassword ;
uid = StringPosDec+ ;
gid = StringPosDec+ ;
gecos = [^:\n]* ;
directory = [^:\n]+ ;
shell = [^:\n]* ;
Newline = "\n" ;

name isUnique() ;
exists name : name == "root" ;
(warn) name : name ~ /[A-Z]/ ;
uid: uid <= 65535;
gid: gid <= 65535;
directory : directory isAbsPath() and
            directory isDirectory() ;
passwdRecord : name == "root" implies uid == 0;
passwdRecord : directory isAccessibleTo(name);
passwdRecord : shell != "" implies
               ( shell isAbsPath() and
                 shell isExecutableBy(user) ) ;
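To give a feel for what a verifier generated from such a specification has to check, the following Python sketch hand-codes a subset of the rules: the seven-field record structure, the name pattern, the UID and GID bounds, the root rule, absolute home directories, name uniqueness and the existence of root. It is a hand-written approximation under assumptions, not output of any Maia tool; `check_passwd` is a hypothetical function and the file-system checks (isDirectory, isAccessibleTo, isExecutableBy) are left out.

```python
import re

NAME_RE = re.compile(r"[a-zA-Z_][-a-zA-Z0-9_]{0,31}")
NUM_RE = re.compile(r"[0-9]+")


def check_passwd(text):
    """Return a list of rule violations for passwd-style text (empty if valid)."""
    errors = []
    names = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        parts = line.split(":")
        if len(parts) != 7:
            errors.append(f"line {lineno}: expected 7 fields")
            continue
        name, _password, uid, gid, _gecos, directory, _shell = parts
        names.append(name)
        if not NAME_RE.fullmatch(name):
            errors.append(f"line {lineno}: bad name")
        for label, value in (("uid", uid), ("gid", gid)):
            if not NUM_RE.fullmatch(value) or int(value) > 65535:
                errors.append(f"line {lineno}: bad {label}")
        # name == "root" implies uid == 0
        if name == "root" and NUM_RE.fullmatch(uid) and int(uid) != 0:
            errors.append(f"line {lineno}: root must have uid 0")
        if not directory.startswith("/"):
            errors.append(f"line {lineno}: directory is not absolute")
    # File-wide rules are checked once every record has been seen.
    if len(set(names)) != len(names):
        errors.append("names are not unique")
    if "root" not in names:
        errors.append("no user named root")
    return errors


good = ("root:x:0:0:root:/root:/bin/sh\n"
        "alice:x:1000:1000:Alice:/home/alice:/bin/sh\n")
bad = "alice:x:70000:1000:Alice:home/alice:/bin/sh\n"
good_errors = check_passwd(good)
bad_errors = check_passwd(bad)
```

Even this small subset takes dozens of lines of procedural code and mixes parsing with rule checking, which is the kind of entanglement a declarative specification is meant to avoid.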

6 MAIA IMPLEMENTATION

Our early approach to creating verifiers from Maia specifications involved converting the specification into input for Flex and Bison, which then generated a parser based on the specification. Flex and Bison are widely used, but require a clear separation between scanning and parsing phases (with each program handling one phase), while Maia does not make a distinction between these phases. To overcome this obstacle, the converter recognized string literals, character classes, and regular expressions in Maia syntax rules and created corresponding tokens for Flex. This approach occasionally required hand-adjustment to the Flex input, or small changes to the syntax rules themselves, but was generally successful.

While Maia’s syntax rules are no more expres-

sive than Bison’s, they include a number of conve-

nience features which Bison lacks, such as short-

hand for repetition. As a result, the converter some-

times had to transform Maia syntax into Bison syn-

tax. This was accomplished by creating Bison non-

terminals which encapsulated the Maia behavior, like

replacing an explicit repetition (A = B{3} ;) with a

nonterminal containing the desired number of entries

(A : C ; C : B B B ;). Bison’s action system was


used to check semantic rules. Universal rules were

checked in the action for the appropriate nonterminal,

with failures causing the parser to immediately exit.

Existential rules were also checked in actions, but set

a ﬂag if a rule passed. The ﬂags can then be checked

at the end of parsing to ensure compliance.
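The checking strategy described here, failing fast on universal rules and setting flags for existential ones, can be sketched independently of Bison. The Python below is an illustration under assumptions rather than the generated parser code: records arrive one at a time as dictionaries, `RuleFailure` and `verify_stream` are hypothetical names, and only one universal and one existential rule are shown.

```python
class RuleFailure(Exception):
    """Raised as soon as a universal rule is violated."""


def verify_stream(records):
    """Check rules while records are produced, as a parser action would."""
    saw_root = False  # flag for the existential rule "a user named root exists"
    for rec in records:
        # Universal rule: checked per record, failure stops verification at once.
        if rec["uid"] > 65535:
            raise RuleFailure(f"uid out of range for {rec['name']}")
        # Existential rule: a matching record only sets the flag.
        if rec["name"] == "root":
            saw_root = True
    # Flags are inspected once parsing has finished.
    return saw_root


stream = iter([{"name": "root", "uid": 0}, {"name": "alice", "uid": 1000}])
valid = verify_stream(stream)
```

Because the universal check raises immediately, records after the first violation are never examined, whereas the existential flag can only be judged after the whole input has been consumed.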

By converting Maia specifications into Flex and Bison parsers in this way, we were able to create verifier programs for the password, shadow, and group files, which are part of the Linux login system. With light modification, we were also able to produce a single verifier that checked rules which apply across all three files to enforce constraints like “users with an entry in the password field must appear in the shadow file”. The password file verifier was also used to test integrity protection with the Linux kernel module (Bonamy, 2016).
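To make the cross-file constraint and the flag-based checking concrete, the following minimal sketch checks two rules over the text of a password file and a shadow file. It is not the generated Flex and Bison verifier: it assumes colon-separated fields, interprets "an entry in the password field" as a non-empty second field, and uses the existential rule "some user has user id 0" from the abstract. The function name `verify` is hypothetical.

```python
def verify(passwd_text, shadow_text):
    """Return True if both illustrative rules hold, False otherwise."""
    # Collect the user names that appear in the shadow file.
    shadow_users = {
        line.split(":")[0] for line in shadow_text.splitlines() if line
    }

    found_uid_zero = False  # flag for the existential rule

    for line in passwd_text.splitlines():
        if not line:
            continue
        fields = line.split(":")
        if len(fields) < 3:
            return False  # malformed entry: a real parser would reject it
        name, password, uid = fields[0], fields[1], fields[2]

        # Universal rule: checked per entry, fails immediately.
        if password and name not in shadow_users:
            return False

        # Existential rule: only record that some entry satisfied it.
        if uid == "0":
            found_uid_zero = True

    # Existential flags are inspected once all entries have been seen.
    return found_uid_zero
```

The universal rule returns as soon as one entry violates it, while the existential rule can only be decided after the last entry, which is why it is recorded in a flag and inspected at the end.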

More examples, experimental results, and a comparative evaluation are available in (Bonamy et al., 2016) and (Bonamy, 2016). Specifically, we have developed Maia specifications for valid hashes produced by crypt(), Linux password, shadow, and group files, PNG images, and SSH configurations.

7 CONCLUSIONS

Most integrity models deal with the trustworthiness of who accesses the data, or provide general protection for a specific data format. We know of no general-purpose integrity systems capable of protecting the integrity of the data itself. Research on protecting arbitrary data integrity is limited. In this paper, we present Maia, a language for general-purpose integrity protection. We give a formal description of the structural operational semantics for Maia using rules with simple mathematical foundations. The semantics leads to a natural interpretation of the meaning of a Maia specification.

We are currently implementing the full Maia interpreter. Our preliminary implementation has shown that a Maia specification can be used to protect the integrity of Linux system configuration files with minimal overhead. In the future, we will build a full implementation of Maia that requires no hand modification.

REFERENCES

Backus, J. W. (1959). The Syntax and Semantics of the Proposed International Algebraic Language of the Zurich ACM-GAMM Conference. Proceedings of the International Conference on Information Processing, 1959, pages 125–132.

Bonamy, P., Carr, S., and Mayo, J. (2016). Toward a mandatory integrity protection system. In Proceedings of the Thirty-first International Conference on Computers and Their Applications.

Bonamy, P. J. (2016). Maia and Mandos: Tools for Integrity Protection on Arbitrary Files. PhD thesis, Michigan Technological University.

Dewar, R. B. K. (1979). The SETL Programming Language. Courant Institute of Mathematical Sciences, New York University.

Fisher, K., Mandelbaum, Y., and Walker, D. (2010). The next 700 data description languages. Journal of the ACM, 57(2):1–51.

Fisher, K., Mandelbaum, Y., and Walker, D. (2006). The next 700 data description languages, volume 41. ACM.

Fisher, K. and Walker, D. (2011). The PADS project. In the 14th International Conference, page 11, New York, New York, USA. ACM Press.

ISO (2004). Information technology - Computer graphics and image processing - Portable Network Graphics (PNG): Functional specification. Technical Report ISO/IEC 15948:2003 (E), Geneva, Switzerland.

Ji, Q., Qing, S., and He, Y. (2006). A formal model for integrity protection based on DTE technique. Science in China Series F: Information Sciences, (5):545–565.

Jim, T., Mandelbaum, Y., and Walker, D. (2010). Semantics and algorithms for data-dependent grammars. Proceedings of the 37th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages, 45(1):417–430.

Johnson, S. C. (1975). Yacc: Yet Another Compiler-Compiler. Technical Report Computing Science Technical Report No. 32, Murray Hill, New Jersey.

Lesk, M. E. and Schmidt, E. (1975). Lex - A Lexical Analyzer Generator. Technical Report Computer Science Technical Report No. 39, Murray Hill, New Jersey.

Parr, T. (2015). The Definitive ANTLR 4 Reference. Pragmatic Bookshelf.

Plotkin, G. D. (1981). A structural approach to operational semantics.

Polk, W. T. (1993). Approximating Clark-Wilson “Access Triples” with Basic UNIX Controls. In Proceedings of the UNIX Security Symposium IV, pages 145–154.

Prasad, S. and Arun-Kumar, S. (2003). An introduction to operational semantics. http://www.cse.iitd.ernet.in/ sanjiva/opsem.ps.

Sudkamp, T. A. (2006). Languages and Machines: An Introduction to the Theory of Computer Science. Pearson Education.

van der Vlist, E. (2003). RELAX NG. O’Reilly Media.

W3C (2008). Extensible Markup Language (XML) 1.0 (Fifth Edition). Technical report.

W3C (2012a). W3C XML Schema Definition Language (XSD) 1.1 Part 1: Structures. Technical report.

W3C (2012b). W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes. Technical report.



Wirth, N. (1977). What can we do about the unnecessary diversity of notation for syntactic definitions? Communications of the ACM, 20(11):822–823.
