Minimal Recursion Semantics and the Language of Acyclic Recursion
Roussanka Loukanova
Uppsala, Sweden
Abstract. Moschovakis (2003-2006) developed a logical calculus of the formal language L^λ_ar of acyclic recursion, a type-theoretical work with many potential applications. On the implementation side, large-scale grammars for human languages, e.g., versions of HPSG, have been using semantic representations cast in the feature-value language Minimal Recursion Semantics (MRS). While lacking strict formalization, MRS successfully represents ambiguous quantifier scoping. In this paper, we introduce the basic definitions of MRS, reflecting on possibilities for formalization of MRS with a version of the language L^λ_ar.
1 Introduction: Why MRS Representations?
The research presented in this paper targets the formalization and development of analyses of the syntax-semantics interface of spoken and written human language (including texts larger than sentences), which continues to be a largely open area, in need of theoretical foundations for reliable coverage.
Versions of constraint-based lexicalist grammar (CBLG), in particular of Head-Driven Phrase Structure Grammar (HPSG), have achieved significant developments and accumulated large resources for English and other languages, including Norwegian, Danish, Arabic, etc. An international consortium, which originated from work at Stanford (see <http://www.delph-in.net/lkb/>), developed a grammar tool, LKB, for writing grammars of human languages. In recent years, LKB has come with the possibility of semantic representation using Minimal Recursion Semantics (MRS); see [3]. By its nature, MRS in HPSG is a notational, specialized version of Situation Semantics, with feature-value structures representing information terms.
Among the existing approaches to the theory of meaning of natural language (NL), model-theoretic alternatives provide viable computational semantics for applications to the study of the language faculty, knowledge representation in general and, in particular, representation of linguistic knowledge and development of intelligent computerized systems. Typically, computational semantics of NL involves rendering NL expressions into some formal logic language. First-order languages and logic, while well studied and understood, have repeatedly exposed their unsuitability for the semantics of NL, from the perspectives of computability and linguistic adequacy. On the other hand, higher-order languages and typed λ-calculi have pleasant computational properties, but are still problematic, from both theoretical and application points of view, as theories of meaning, representation of knowledge, and information flow, for which they are under active
development, with applications to the semantics of artificial languages, e.g., the semantics of programming languages, and to the semantics of NL.
Semantic representation, including the rendering of NL expressions into formal logic languages, such as first- or higher-order languages, has been problematic in systems for NLP. The variety of such applications is large and growing: semantic transfer approaches to machine translation (MT), obtaining semantic representations by parsing NL sentences, generation of NL sentences from given semantic representations, and other, more advanced applications to automatic understanding of NL, for example, in question-answering systems, information transfer, information extraction, and knowledge representation systems, including knowledge inference, update, etc.
For example, a simplified semantic transfer schema for MT typically consists of the following stages, some of which, at least the first two, may be carried out in a compositional way:

- parsing an expression of a source NL, which produces:
- a semantic representation of the input NL expression in some formal language, called the source LF;
- a transfer component, which converts the source LF into a semantic representation called the target LF;
- a generator, which converts the target LF into expression(s) of the target NL.
In such systems, ideally, a semantic analyzer of the source NL sentences produces semantic representations, called logic forms (LFs), in some formal language, to be used for generating logically equivalent sentences in a target NL. The basic problems that emerge are related to mismatches between LFs and NL expressions. The LF produced by a parser typically carries over the syntactic structure of the input NL expression. For example, the order of the atomic formulas in an LF, e.g., in a conjunction, may correspond to the syntactic structure of the NL expression, while it is irrelevant to the semantic interpretation. In a simplified approach, a generator can be built to try all logically equivalent LFs until it finds the appropriate ones. Such approaches meet serious problems, for example, involving spurious ambiguity or unacceptability; e.g., analyses may produce various logically equivalent LFs, some of which correspond to unacceptable NL sentences (see Copestake et al. [3] for examples and discussion). Depending on the formal language chosen, such approaches may inherit serious drawbacks with respect to computability: computational inefficiency and/or undecidability of the problem of logical-form equivalence. Some of these problems are pleasantly resolved, for a semantic core of NL that has a syntactic expression in NL, by the recent development of the Grammatical Framework (GF); see [12] and [13].
Some of the classic semantic theories used in NLP may carry more fundamental problems, among which a serious one is quantifier scope ambiguity. This is demonstrated by any of the notorious examples with at least two quantifier NPs, like (1a), for which there is only one classic context-free parse tree, while there is more than one possible logic form, representing the alternative scopings:
(1) a. [[Every man]_NP loves [a woman]_NP]_S.
    b. de dicto reading:
       1. [[every man]_i [[a woman]_j [e_i loves e_j]_S]_S]_S
       2. ∀x(man(x) → ∃y(woman(y) ∧ love(x, y)))
    c. de re reading:
       1. [[a woman]_j [[every man]_i [e_i loves e_j]_S]_S]_S
       2. ∃y(woman(y) ∧ ∀x(man(x) → love(x, y)))
Among the problems that quantifiers and quantifier scope ambiguity pose for NLP, the following are ongoing:
1. A typical (context-free) parser gives only one parse tree structure, which corresponds to multiple LFs, without any direct compositional way to derive them.
2. A classic-style treatment of quantifiers (along the lines of Montague's classic PTQ, see [8]) results in computational inefficiency, in particular if all readings of an NL expression with several quantifiers have to be derived at each level of processing the sentence and its components.
Linguistic studies of underspecification in human languages (NL) have been broadly reviewed in [2]. Some approaches, closely related to the topic of this paper, have been tried in logic type theories to represent multiple scoping by techniques for underspecified representation: for an example along the lines of a Montagovian approach, see [10]; for representation of quantifier ambiguities and, in general, of partiality of information in Situation Semantics, by using Situation Theory, see [4]. A simplified version of a quantifier storage technique was implemented in HPSG, e.g., see [11], which in recent years evolved into the elaborated MRS representation, see [3]. More recently, a new approach has been initiated in [5], [6], and [7].
A demonstrative example of underspecified scoping is depicted by the following unconnected graph ("underspecified tree"), which carries information about the "bare" predicate structure of the sentence, where the "disconnected" quantifiers carry indexing information about the corresponding subject-complement argument roles they fill:
(2) Underspecified tree (actually a graph) structure:

    Detached quantifier subtrees:
        [every man]_NP,i   (dominating: every man)
        [a woman]_NP,j     (dominating: a woman)

    S subtree:
        [x_i loves x_j]_S
            [x_i]_NP
            [loves x_j]_VP
                loves
                [x_j]_NP
The set of the two indexed NP sub-graphs represents syntactically the 'logic' storage needed for computing the resolved logic form of the sentence, e.g., see [4]. The S subtree represents the underspecified semantic basis of the sentence. By this partially connected graph we have a syntactic representation of the underspecified logic form:
(3) a. Quantifier storage:
        [every man]_NP,i   (dominating: every man)
        [a woman]_NP,j     (dominating: a woman)
    b. Underspecified semantic basis:
        [x_i loves x_j]_S
            [x_i]_NP
            [loves x_j]_VP
                loves
                [x_j]_NP
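The storage in (3a) and the basis in (3b) can be made concrete as data. The following Python sketch assumes an illustrative encoding (the names StoredNP, storage, and basis are hypothetical, not from the paper):

```python
from dataclasses import dataclass

# Hypothetical encoding of the storage in (3a): each stored NP records
# its determiner, its restriction, and the index of the argument slot
# it binds in the semantic basis.
@dataclass
class StoredNP:
    det: str          # "every", "a"
    restriction: str  # "man", "woman"
    index: str        # "i", "j"

storage = [StoredNP("every", "man", "i"), StoredNP("a", "woman", "j")]

# The underspecified semantic basis (3b): the relation with its
# indexed argument slots, still awaiting the quantifiers.
basis = ("loves", "x_i", "x_j")
```

Retrieving the two stored NPs in either order yields exactly the two scopings in (1b) and (1c).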
2 MRS Representations and Notations
MRS uses a language with n-ary conjunction. Since binary conjunction causes spurious ambiguity in parsing NL expressions, for efficiency of processing, conjunction is taken as an n-ary operator which is represented implicitly: any list of atomic formulas is interpreted as a conjunction.
An elementary predication (EP) is any atomic formula or a conjunction of atomic formulas. EPs are tagged with labels. Thus, the MRS formula (4a) is represented by the tagged tree (4b):
(4) a. every(x)
           big(x), white(x), horse(x)        sleep(x)
    b. h0 : every(x)(h1, h2)
           h1 : big(x), white(x), horse(x)   h2 : sleep(x)
The above MRS representations (4a) and (4b) can be written in the following notation, using assignments to resemble terms in Moschovakis' language L^λ_ar:

(5) h0 where { h0 := every(x, h1, h2),
               h1 := (big(x), white(x), horse(x)), h2 := sleep(x) }
Note that, in Moschovakis' calculus of L^λ_ar, coordination terms containing conjunction (the value of h1 above) undergo further reduction to canonical forms. In a resolved term, the head subterm does not need to be assigned to any location like h0 above. In a realistic NL grammar, top locations (labels) simplify the compositional derivations, but add "computational" steps.
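Viewed as a data structure, the assignment notation in (5) is just a distinguished top label together with a finite map from labels to EPs. A minimal Python sketch, under an assumed tuple encoding of EPs (the encoding itself is not fixed by MRS or by the paper):

```python
# The MRS term (5) as a top label plus label-to-EP assignments.
# An EP is encoded as (relation, args); a tuple of EPs stands for
# their implicit conjunction, as in MRS.
mrs_term = {
    "top": "h0",
    "assignments": {
        "h0": ("every", ("x", "h1", "h2")),
        "h1": (("big", ("x",)), ("white", ("x",)), ("horse", ("x",))),
        "h2": ("sleep", ("x",)),
    },
}
```

The implicit n-ary conjunction of MRS shows up here as the tuple of EPs assigned to h1.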
MRS uses three kinds of variables:
1. Variables, also called parameters, for quantification over and reference to individuals denoted by NPs, and for filling argument slots of relations when the arguments designate individuals;
2. Labels, for tagging EPs and for filling argument slots of relations when the arguments are EPs;
3. Free labels, i.e., labels to which no EPs are assigned.
Respectively, the language L^λ_ar has two sorts of variables:
1. Pure variables, which are to be quantified over, i.e., corresponding to the MRS variables for individuals;
2. Recursion (location) variables, to which EPs are assigned, i.e., corresponding to the MRS labels. For example: h := sleep(x), where h is a location variable, i.e., a label, and x is a pure variable.
Different MRS Representations of the NL Sentence.

(6) Every dog chases some white cat.
(7) a. De Re reading, in a predicate language:
       some(y)[(white(y), cat(y)), every(x)(dog(x), chase(x, y))]
    b. De Re reading, by an MRS tree:
       h5 : some(y)
           h7 : white(y), cat(y)
           h1 : every(x)
               h3 : dog(x)
               h4 : chase(x, y)
    c. De Re reading, by an MRS term:
       h5 : some(y, h7, h1), h7 : white(y), cat(y),
       h1 : every(x, h3, h4), h3 : dog(x), h4 : chase(x, y)
    d. De Re reading, in the language L^λ_ar:
       h5 where { h5 := some(h7, h1),
                  h7 := λy (p(y) & q(y)), p := λy white(y), q := λy cat(y),
                  h1 := λy every(h3(y), h4(y)),
                  h3 := λy λx dog(x), h4 := λy λx chase(x, y) }
(8) a. De Dicto reading, in a predicate language:
       every(x)[dog(x), some(y)((white(y), cat(y)), chase(x, y))]
    b. De Dicto reading, by an MRS tree:
       h1 : every(x)
           h3 : dog(x)
           h5 : some(y)
               h7 : white(y), cat(y)
               h4 : chase(x, y)
    c. De Dicto reading, by an MRS term:
       h1 : every(x, h3, h5), h3 : dog(x),
       h5 : some(y, h7, h4), h7 : white(y), cat(y), h4 : chase(x, y)
    d. De Dicto reading, in the language L^λ_ar:
       h1 where { h1 := every(h3, h5), h3 := λx dog(x),
                  h5 := λx some(h7, h4(x)), h7 := λy (p(y) & q(y)),
                  p := λy white(y), q := λy cat(y),
                  h4 := λx λy chase(x, y) }
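To see that the assignments in (8d) compute the de dicto truth conditions, here is a Python sketch that evaluates them over a tiny finite model. The model and the domain D are illustrative assumptions; the point is that acyclicity lets us order the assignments so each location is defined in terms of earlier ones.

```python
from typing import Callable

# A tiny illustrative model: two dogs and one white cat.
D = {"d1", "d2", "c1"}
dog   = lambda x: x in {"d1", "d2"}
white = lambda y: y == "c1"
cat   = lambda y: y == "c1"
chase = lambda x, y: (x, y) in {("d1", "c1"), ("d2", "c1")}

# Generalized quantifiers over D, taking restriction and scope predicates.
def every(restr: Callable, scope: Callable) -> bool:
    return all(scope(x) for x in D if restr(x))

def some(restr: Callable, scope: Callable) -> bool:
    return any(scope(y) for y in D if restr(y))

# The assignments of (8d), transcribed as closures. Because the system
# is acyclic, they can be defined bottom-up: h4, p, q, h7, h3 first,
# then h5, and finally h1.
h4 = lambda x: lambda y: chase(x, y)
p = lambda y: white(y)
q = lambda y: cat(y)
h7 = lambda y: p(y) and q(y)
h3 = lambda x: dog(x)
h5 = lambda x: some(h7, h4(x))
h1 = every(h3, h5)

print(h1)  # True: in this model every dog chases some white cat
```

Transcribing (7d) instead gives the de re reading; in this particular model both readings happen to come out true.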
Underspecified Representation. The underspecified quantification can be depicted by the following labeled graph with nodes that are unconnected:
(9) a. Underspecified MRS graph:
       h1 : every(x)
           h3 : dog(x)
           hA
       h5 : some(y)
           h7 : white(y), cat(y)
           hB
       h4 : chase(x, y)
    b. Underspecified MRS term:
       h1 : every(x, h3, hA), h3 : dog(x),
       h5 : some(y, h7, hB), h7 : white(y), cat(y),
       h4 : chase(x, y)
There are exactly two ways of assigning EPs to the variables hA and hB that form well-formed MRS terms and corresponding tree representations. I.e., the system of equations has exactly two solutions, corresponding to:
1. hA : h5, hB : h4;
2. hA : h4, hB : h1.
Note that, in MRS, the variables hA and hB are called handles.
    c. Underspecified L^λ_ar-term (obtained, for example, by using the rules of the reduction calculus):
       h0(u) where { h1 := λy every(h3, λx hA(x)(y)),
                     h5 := λx some(h7, λy hB(x)(y)), h3 := dog,
                     h7 := λy (p(y) & q(y)), p := λy white(y), q := λy cat(y),
                     h4 := λx λy chase(x, y) }
This system of equations has exactly two solutions, when it is extended, correspondingly, by adding the following assignments inside the scope of the recursion operator where:
1. hB := h4, hA := λx λz h5(x), h0 := h1;
2. hA := h4, hB := λz h1, h0 := h5.
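The two solutions can also be computed mechanically. The following Python sketch enumerates pluggings of the holes hA and hB by labels and keeps those whose outscoping order forms a tree; the encoding, which records only the scopal skeleton of (9b), is an illustrative assumption.

```python
from itertools import permutations

# Scopal skeleton of the underspecified MRS (9b): each label maps to
# the scopal argument slots of its EP (non-scopal variables omitted).
scopal = {
    "h1": ["h3", "hA"],   # every(x, h3, hA)
    "h3": [],             # dog(x)
    "h5": ["h7", "hB"],   # some(y, h7, hB)
    "h7": [],             # white(y), cat(y)
    "h4": [],             # chase(x, y)
}
holes = ["hA", "hB"]
labels = list(scopal)

def is_tree(plug):
    """Does plugging the holes with labels yield a tree of EPs?"""
    children = {l: [plug.get(a, a) for a in args] for l, args in scopal.items()}
    parents = {}
    for l, cs in children.items():
        for c in cs:
            if c in parents:
                return False          # a label with two parents
            parents[c] = l
    roots = [l for l in labels if l not in parents]
    if len(roots) != 1:
        return False                  # not a single top
    seen, stack = set(), [roots[0]]
    while stack:
        n = stack.pop()
        if n in seen:
            return False              # a cycle
        seen.add(n)
        stack.extend(children[n])
    return seen == set(labels)        # connected: all EPs reached

solutions = [dict(zip(holes, p))
             for p in permutations(labels, len(holes))
             if is_tree(dict(zip(holes, p)))]
print(solutions)
# [{'hA': 'h5', 'hB': 'h4'}, {'hA': 'h4', 'hB': 'h1'}]
```

Only the tree conditions are checked here; in general, MRS also imposes variable-binding conditions and the constraints C of Section 3, which both solutions happen to satisfy in this example.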
3 Basic Definitions of MRS
I will introduce MRS as currently given in Copestake et al. [3], but in Moschovakis' terms. A language of MRS includes, along with other symbols and types, a set Rel of relation symbols. MRS, using Moschovakis' terminology, has two sorts of variables:
Definition 1 (Variables).
- a set of pure variables: V_pure = {x, y, z, ...}. Pure variables are to be quantified over and serve for reference to individuals.
- a set of recursion (location) variables: V_rec = {h0, h1, ...}. Recursion variables are also called locations in L^λ_ar, or labels and handles in MRS.
From the above informal introduction of MRS representations, it is clear that the variables called handles and labels are of the same formal sort, similar to the location variables in the language L^λ_ar of recursion. Notationally, in the MRS syntactic constructs, the distinction between handles and labels is with respect to the positions taken by these variables. Note that in MRS, 'location' variables are used for labeling EPs and for filling scopal argument slots of relation symbols. In L^λ_ar, location variables are used in the construction of recursive terms.
Definition 2 (Elementary predication (EP)). This definition corresponds closely to the one given in Copestake et al. (p. 12):

label : relation(arg_1, ..., arg_n, sc_arg_1, ..., sc_arg_m), where

1. label ∈ V_rec and is called the label of the EP;
2. relation ∈ Rel is an (n + m)-argument relation symbol;
3. arg_1, ..., arg_n ∈ V_pure are the ordinary (i.e., non-scopal) variable arguments of relation;
4. sc_arg_1, ..., sc_arg_m ∈ V_rec are the scopal arguments of relation.
Examples:

(10) a. h : every(y, h1, h2)
     b. h : sleep(x)
     c. h : probably(h1)
Now, considering the examples given in the paper, the above definition implies a wrong interpretation of quantifier symbols like some and every as 3-argument relations.
MRS has no λ-abstraction terms and no types corresponding to those of the typed λ-calculus. Versions of CBLG similar to HPSG use a SEM feature INDEX which, to some extent, corresponds to λ-abstraction.
Revised Definition 2 (Elementary predication)²

h : relation(a_1, ..., a_n, h_1, ..., h_m), where

1. h ∈ V_rec and is called the label of the EP;
2. relation ∈ Rel is a relation symbol;
3. a_1, ..., a_n ∈ V_pure are variables which either fill argument slots of the relation symbol relation or, in the case of a quantifier relation, are the variables bound by it;
4. h_1, ..., h_m ∈ V_rec are variables filling the scopal argument slots of relation.
Some Abbreviations and Notations. In MRS, if a variable h ∈ V_rec labels an EP, it is called a label; if h ∈ V_rec fills an argument position of a relation, it is called a handle. A bag of EPs that have the same label is interpreted as a conjunction. A bag of co-labeled EPs h : E_1, ..., h : E_n is denoted by h : E_1, ..., E_n.
² I am giving a minimal revision in order to keep this introduction to MRS close to Copestake et al.
Definition 3 (Immediate Outscoping Relation between EPs). Given a bag of EPs M and two EPs E, E′ ∈ M, E immediately outscopes E′ iff one of the scopal arguments of E is identical to the label of E′. I.e., for some p, p′ ∈ Rel and l, h ∈ V_rec, E = l : p(..., h, ...) immediately outscopes E′ = h : p′(...), and it is said that l immediately outscopes h and E′.
Definition 4. Given two conjunctions of EPs M and M′, M immediately outscopes M′ iff there are E ∈ M and E′ ∈ M′ such that E immediately outscopes E′.
Definition 5 (Outscoping Relation). The outscoping relation over a set of EPs is the transitive closure of the immediate outscoping relation between the EPs in that set.
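A direct reading of Definitions 3 and 5 as a Python sketch: an EP is encoded as a (label, relation, scopal_args) triple (an assumed encoding, with non-scopal arguments omitted), and outscoping is computed as the transitive closure of the immediate relation over labels, as Definition 3 sanctions.

```python
from itertools import product

# Assumed EP encoding: (label, relation, scopal_args).
def immediately_outscopes(e1, e2):
    """Definition 3: a scopal argument of e1 equals the label of e2."""
    return e2[0] in e1[2]

def outscopes(bag):
    """Definition 5: transitive closure of immediate outscoping,
    given here as pairs of labels."""
    pairs = {(e1[0], e2[0]) for e1, e2 in product(bag, bag)
             if immediately_outscopes(e1, e2)}
    changed = True
    while changed:                       # naive closure; fine for small bags
        changed = False
        for (a, b), (c, d) in product(list(pairs), repeat=2):
            if b == c and (a, d) not in pairs:
                pairs.add((a, d))
                changed = True
    return pairs

# The resolved de re MRS (7c), in this encoding:
bag = [("h5", "some", ("h7", "h1")), ("h7", "white,cat", ()),
       ("h1", "every", ("h3", "h4")), ("h3", "dog", ()),
       ("h4", "chase", ())]
print(sorted(outscopes(bag)))
# h5 outscopes h7 and h1 immediately, and h3, h4 by transitivity.
```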
Definition 6 (MRS Structure). An MRS structure M is a tuple ⟨GT, LT, R, C⟩, where
- R is a bag of EPs;
- GT ∈ V_rec is a label (recursion variable) such that there is no label h ∈ V_rec which occurs in R and outscopes GT; GT is called the global top of M;
- LT ∈ V_rec is the topmost label in R, with respect to the outscoping relation over the labels in R, which is not the label of a floating EP (see later); LT is called the local top of M;
- C is a set of constraints (introduced later on) satisfied by the outscoping order in R.
Definition 7 (Scope-resolved MRS Structure). A scope-resolved MRS structure is an MRS structure such that:
1. The MRS structure forms a tree of EP conjunctions, where dominance is determined by the outscoping order on EP conjunctions.
2. The global and local top labels and all handle arguments (i.e., recursion arguments) are identified with EP labels (in Moschovakis' terminology: there are no free recursion variables).
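Putting Definitions 6 and 7 together, here is a sketch of a checker for scope-resolvedness, under the same assumed encoding and the simplifying assumption that co-labeled EPs have already been collapsed into single conjunction EPs, so labels are unique:

```python
def is_scope_resolved(bag, top):
    """Definition 7 (sketch): the top and every handle argument are
    identified with EP labels (no free recursion variables), and the
    outscoping order forms a tree of EP conjunctions rooted at top."""
    children = {label: list(scopal) for label, _, scopal in bag}
    handles = [h for cs in children.values() for h in cs]
    if top not in children or any(h not in children for h in handles):
        return False                     # a free recursion variable
    if len(set(handles)) != len(handles):
        return False                     # some label plugged twice
    seen, stack = set(), [top]
    while stack:
        n = stack.pop()
        if n in seen:
            return False                 # a cycle
        seen.add(n)
        stack.extend(children[n])
    return seen == set(children)         # connected: a single tree

# The resolved de dicto MRS (8c) passes; the underspecified (9b) fails,
# since its holes hA and hB are free recursion variables.
resolved = [("h1", "every", ("h3", "h5")), ("h3", "dog", ()),
            ("h5", "some", ("h7", "h4")), ("h7", "white,cat", ()),
            ("h4", "chase", ())]
print(is_scope_resolved(resolved, "h1"))   # True
```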
4 Conclusions: Advancing New Developments
Arguments for "Flat" Semantic Representation, Pending Further Development. The original arguments for introducing the MRS representation have been efficiency without loss of information. Further theoretical and development work is needed on the following initiatives:
Underspecification. MRS permits underspecification in representations of quantifier scopes, so that a single MRS construct represents multiple scopes without loss of the grammatical information available in the structure of NL expressions.
Flat Semantic Representation. "Flat" MRS representations, which consist of the most basic facts, without loss of information, improve the efficiency of NLP. For example, flat representations have been favored in systems for generation from semantic representations and for machine translation with semantic transfer.
Currently, MRS is at a development stage for use in HPSG. For example, MRS has been extensively implemented in grammars for English, Norwegian, and Danish by using LKB.
Further Development of MRS and L^λ_ar Representation.
Syntax-Semantics Interface. MRS representations can be written in a feature-value language for formal and computational grammars of NL, such as CBLG (e.g., HPSG). CBLG incorporates several linguistic components: vocabulary, lexicon, syntax, and semantic representations, in a unified way, which is a basis for a compositional syntax-semantics interface.
Logic Foundation. MRS offers semantic representations which are close to the canonical forms of terms in the formal language of Acyclic Recursion. This gives opportunities to formalize MRS and develop new implementations by re-using existing CBLG resources.
Rendering. The render relation is the translation from NL into the language L^λ_ar of acyclic recursion. A feature-value lexicalist approach to grammar theory, such as CBLG (HPSG, LFG, etc.), is a good computational approach to syntax, which provides procedures for rendering into logic forms and is linguistically and semantically adequate.
Indexing. An indexing procedure from NL into L^λ_ar can be provided, e.g., by appropriate development of Binding Theory in CBLG (HPSG).
With appropriate adjustments of MRS, Moschovakis' Acyclic Recursion can provide a formalization of MRS. Using a version of Acyclic Recursion can contribute to developing MRS itself, for example, towards:
1. formal representation of quantifiers;
2. representation of abstraction in MRS (resembling λ-abstraction);
3. finding a better incorporation of utterance and described situations into MRS expressions (and the MRS feature structures used in CBLG (HPSG));
4. representing higher-order relations for modifiers that are not conjunctively interpreted: alleged, former, ...;
5. representing higher-order relations denoted by lexemes creating oblique contexts, such as know, believe, ....
Relation to Other Type-theoretic Developments for NLP. In recent years, a powerful type-theoretical grammar formalism for natural, i.e., human, language processing (NLP), Grammatical Framework (GF), see [12] and [13], has been under active development. Future work, tightly related to the subject of this paper, is research on the placement of GF in the family of CBLG, with respect to syntax, semantics, and syntax-semantics interdependencies, its theoretical foundations, and applications.
The work on computational semantics presented in this paper is in the direction of theoretical developments providing foundations for, and extending, current applications along the lines of CBLG (e.g., HPSG and GF), and for new ones.
References
1. Barwise, J., Perry, J.: Situations and Attitudes. MIT Press, Cambridge, MA (1983). Republished in 1999 by The David Hume Series of Philosophy and Cognitive Science Reissues.
2. Bunt, H.: Semantic Underspecification: Which Technique for What Purpose? In: Bunt, H., Muskens, R. (eds.): Computing Meaning, Vol. 3. Studies in Linguistics and Philosophy 83. Springer, Dordrecht (2007) 55–85
3. Copestake, A., Flickinger, D., Pollard, C., Sag, I. A.: Minimal Recursion Semantics: an Introduction. Research on Language and Computation 3.4 (2006) 281–332
4. Loukanova, R.: Generalized Quantification in Situation Semantics. In: Gelbukh, A. (ed.): Computational Linguistics and Intelligent Text Processing. Lecture Notes in Computer Science, Vol. 2276. Springer, Berlin Heidelberg (2002) 173–210
5. Loukanova, R.: Typed Lambda Language of Acyclic Recursion and Scope Underspecification. In: Muskens, R. (ed.): Workshop on New Directions in Type-theoretic Grammars (2007) 73–89
6. Loukanova, R.: Constraint Based Grammar and Semantic Representation with The Language of Acyclic Recursion. In: Bel-Enguix, G., Jiménez-López, M. D. (eds.): Proceedings of the International Workshop on Non-classical Formal Languages in Linguistics (2008) 57–70
7. Loukanova, R.: Semantics with the Language of Acyclic Recursion in Constraint-Based Grammar. In: Bel-Enguix, G., Jiménez-López, M. D. (eds.): Bio-Inspired Models for Natural and Formal Languages. Cambridge Scholars Publishing (to appear)
8. Montague, R.: Formal Philosophy. Selected Papers of Richard Montague. Thomason, R. H. (ed.). Yale University Press, New Haven and London (1974)
9. Moschovakis, Y. N.: A Logical Calculus of Meaning and Synonymy. Linguistics and Philosophy, Vol. 29 (2006) 27–89
10. Muskens, R.: Underspecified Semantics. In: Egli, U., von Heusinger, K. (eds.): Reference and Anaphoric Relations. Studies in Linguistics and Philosophy, Vol. 72. Kluwer (1999) 311–338
11. Pollard, C., Sag, I. A.: Head-Driven Phrase Structure Grammar. University of Chicago Press, Chicago (1994)
12. Ranta, A.: Grammatical Framework: A Type-Theoretical Grammar Formalism. Journal of Functional Programming 14(2) (2004) 145–189
13. Ranta, A.: The GF Resource Grammar Library. Linguistic Issues in Language Technology 2(2) (2009)
14. Sag, I. A., Wasow, Th., Bender, E. M.: Syntactic Theory — A Formal Introduction. 2nd edn. CSLI Publications, Stanford (2003)