The Recursion Scheme of the Trace Function Method

Baltasar Trancón y Widemann

Department of Computer Science, University of Bayreuth, 95440 Bayreuth, Germany

Keywords:

Formal Method, Trace Function Method, Executable Semantics, Recursion Theory, State System.

Abstract:

The Trace Function Method (TFM) is a fundamental approach to the description of system behavior for re-

quirements analysis, speciﬁcation, and documentation. External behavior of systems or components is given in

mathematically direct form, but with full abstraction from internal state, by deﬁning output at discrete interface

events as recursive functions of the complete history of previous interaction at the same interface, including

both input and output. In order to understand and evaluate the semantics of the notation, and in particular the

executable semantics, that is, the potential for automatic simulation and construction of prototype implemen-

tations from TFM descriptions, a recursion-theoretic analysis is given. It is demonstrated that a single run and

the full reactive behavior of a TFM description can be presented as instances of ﬁrst-order and higher-order

course-of-value iteration, respectively. A simple sufﬁcient condition for correct implementations of TFM de-

scriptions in terms of state systems is given. The spectrum of possible state-based implementations of a TFM

description, ranging from straightforward simulation to minimized state space, is explored. Implications for

semantically calculated and hence formally veriﬁable prototype implementations are summarized.

1 INTRODUCTION

The trace function method (TFM), due to Parnas, is a

fundamental approach to the description of system be-

havior. Its applications in software engineering range

from requirements analysis (partial descriptions of re-

quired behavior) (Quinn et al., 2006), via design spec-

iﬁcation (complete descriptions of required behavior)

(Baber et al., 2005; Liu et al., 2010) and simulation

(Trancón y Widemann and Parnas, 2008) to documentation (descriptions of observed behavior at various

levels of completeness and abstraction). As such, it

is an integral part of the rational development process

envisaged by (Parnas, 2009).

TFM abstracts fully from the internal structure

and state of a system or component. All described

behavior consists of discrete events at an interface. In-

teraction on events is speciﬁed by valuations of inter-

face variables. Every interface variable is either input

(controlled by the environment) or output (controlled

by the system). The complete relevant history of be-

havior at the interface, organized as a list of events,

most recent event first, is called a trace. A trace function is a total function that maps traces to output values. The behavior of the system is described completely by giving a trace function for each output variable of the interface.

(Footnote: The original TFM style has the oldest event first. The reversal reflects the recursive structure better and will be justified in section 2.)

A trace that contains both input and output serves

as precise documentation of a single instance of sys-

tem behavior, independently of the trace functions

that specify the general rules of behavior. A trace can

be validated by comparing recorded outputs in the

trace with speciﬁed outputs from trace functions; or

it can be reconstructed from its input only, by recur-

sively ﬁlling in speciﬁed outputs, oldest events ﬁrst.

Thus trace functions can act as either test oracles,

or simulators or prototype implementations, respec-

tively. In either case, there is a logical redundancy

between historical records and abstract descriptions.

The present article is an explication of the mathemat-

ical consequences of that redundancy. It is necessary

to understand the way in which the redundancy is re-

solved, in order to give precise executable semantics

to TFM, and hence derive correct implementations in

a formally satisfactory manner.

(Footnote: There is a relational extension of TFM that uses trace relations to describe nondeterministic behavior, which is not discussed here; the present arguments may be extended to cover the relational case nevertheless.)


Trancón y Widemann B.

The Recursion Scheme of the Trace Function Method.

DOI: 10.5220/0003994501460155

In Proceedings of the 7th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE-2012), pages 146-155

ISBN: 978-989-8565-13-6

 2012 SCITEPRESS (Science and Technology Publications, Lda.)

1.1 The Dilemma of Historical Records

Formally, output values of past events are recorded

in a trace, even though they are redundant and can

in principle be calculated from the output trace func-

tions. That opens the syntactic possibility to have ar-

bitrary pairings of input and output in trace events,

not merely those that can actually be produced by the

system and make up the graphs of the output trace

functions. In philosophical terms, we may speak of

pairings that arise from the given output functions as

factual, and of those that do not as counterfactual.

Then the question arises how to deal with the distinc-

tion when deﬁning an output function: Should one as-

sume the output values recorded in the trace to be fac-

tual, and retrieve them trustfully for subsequent com-

putations? Or should one rather give a “skeptic” deﬁ-

nition that calculates previous outputs recursively? Is

there a semantic difference? Is there a difference in

terms of system implementation effort and efﬁciency?

The purpose of the present article is to use established

mathematical theory to answer these questions.

1.2 Motivating Example and Discussion

As a running example, consider the case of a compo-

nent with ﬁnite multiset functionality, keeping track

of multiplicities of elements of some ﬁnite set X. The

component supports four operations, namely count-

ing the multiplicity of a given element, increasing the

multiplicity of a given element by one up to an up-

per bound B, decreasing the multiplicity of a given

element by one if present, and setting the multiplic-

ity of all elements simultaneously to zero. Thus the

input part of each event can be speciﬁed by a pair

(p,x) ∈ P × X, where P = {cnt,inc,dec,clr} and x is irrelevant for p = clr. The initial multiplicity of

any element is zero. The output part of each event

is an integer between zero and B. The TFM descrip-

tion of the component is depicted in Figure 1.

The

trace function takes the “skeptic” approach discussed

above, and assumes only input values are present in

a trace. The variable T ranges over traces. The con-

stant ε denotes the empty trace; an event prepended to

a trace is denoted as e · T (cf. section 2.3). The auxiliary reduction operation, applied to a trace T and an element x, removes from T the prefix of all events not concerning x. Thus the result is always a suffix, or tail, of T.

The trace function bag depends recursively on its own result for some strictly reduced arguments, although the amount of reduction depends nontrivially and dynamically on the pattern of usage. But apart from the recursion scheme, only very elementary mathematics is used. Intuitively, there should be a unique function bag that solves these recursive equations, and there should be a straightforward algorithm that effectively computes that function, perhaps not with optimal efficiency, but viable as a simulation, rapid prototype or test oracle.

(Footnote: Figures are grouped on the last page for comparison.)
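The "skeptic" style described above can be illustrated by a minimal Python sketch. Figure 1 is not reproduced in this section, so the details are assumptions made for illustration only: the output of every event is taken to be the multiplicity of the named element after the event (zero for clr), the bound `B = 3` is arbitrary, and the names `concerns`, `reduce_trace` and `bag` are hypothetical. Traces are Python lists of input pairs, most recent event first.

```python
from typing import List, Tuple

Event = Tuple[str, str]  # input part only: (operation p, element x)
B = 3                    # hypothetical upper bound on multiplicities


def concerns(event: Event, x: str) -> bool:
    """Assumption: an event concerns x if it names x or clears everything."""
    p, y = event
    return p == "clr" or y == x


def reduce_trace(trace: List[Event], x: str) -> List[Event]:
    """Drop the prefix of events not concerning x; the result is a tail."""
    k = 0
    while k < len(trace) and not concerns(trace[k], x):
        k += 1
    return trace[k:]


def bag(trace: List[Event]) -> int:
    """Output at the most recent event, recomputing all past outputs."""
    if not trace:
        return 0                      # initial multiplicity is zero
    (p, x), rest = trace[0], trace[1:]
    if p == "clr":
        return 0
    # Skeptic style: the previous multiplicity of x is recomputed
    # recursively from a strictly shorter trace, never looked up.
    previous = bag(reduce_trace(rest, x))
    if p == "inc":
        return min(previous + 1, B)
    if p == "dec":
        return max(previous - 1, 0)
    return previous                   # p == "cnt"
```

Every recursive call receives a strictly shorter list, which is the informal reason why the equations have a unique, terminating solution in this sketch.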

1.3 Enter Recursion Theory

Since the expressive power of TFM comes from the

way past outputs are used recursively in the compu-

tation of present outputs, it seems only fair to turn

to recursion theory for the analysis of the method.

We shall demonstrate that the adequate scheme of re-

cursive function deﬁnition is known theoretically as

course-of-value (cov) iteration. The phrase “course

of value”, or “course of values”, can be traced back to

the works of Frege, where it is roughly synonymous

with “extension”.

The difference between cov iteration and the more

familiar scheme of primitive recursion is illustrated

by analogy with proof by induction: In order to prove

P(n) for all natural numbers n, ordinary complete in-

duction uses a step of the form P(n) ⇒ P(n +1) plus a

base case, often P(1) or P(0). An alternative, equivalent method uses a step $(\forall k < n.\ P(k)) \Rightarrow P(n)$ instead. This method is often more convenient to

use, and generalizes to transﬁnite induction, where

it is called the Noetherian induction scheme. Prim-

itive recursion is similar to ordinary induction, in us-

ing just a single preceding instance to infer the next,

whereas cov iteration is free to use all preceding in-

stances. Analogous arguments regarding (unchanged)

absolute power and (improved) convenience apply.

The use of cov iteration as a theoretical tool for

“intentional” program analysis has been proposed re-

cently (Bonfante, 2011), although we currently re-

serve judgement on the relevance for our own investi-

gations.

Note that the auxiliary trace reduction operation is also recursive, but of a simpler form that will be formalized as ordinary iteration below. The presence of recursion also in the way a trace function accesses events

is responsible for the complexity of the multiset ex-

ample. TFM employs a variety of access operations,

ranging from static (splitting a trace into most re-

cent event and rest), via semi-dynamic (splitting at

the most recent event from a pre-determined set) to

fully dynamic (splitting at the most recent event that

satisfies an arbitrary relation with current input; filtering certain events from a trace). The choice of access operations is made on a pragmatic, ad-hoc basis. Below we shall give results that indicate vastly different implementation costs for different classes of access. Hence it may be prudent, and in accordance with the principle of Occam's Razor, to classify access operations accordingly, and to prefer light-weight over heavy-weight ones where applicable.

(Footnote: The reader is indeed invited to try and come up with a simpler, precisely equivalent description.)

2 STRUCTURED RECURSION THEORY

The classical theory of recursive functions comes in two subtly different variants that must not be confounded. The theory of µ-recursive functions deals with partial functions for which computation may diverge. The theory of primitive recursive functions deals with a class of total computable functions that is a strict subset of the former. Since µ-recursion is Turing-complete, the question whether a µ-recursive function is defined for a given argument is undecidable. Hence µ-recursive functions are a model of computation rather than of abstract behavioral description. Undecidability, and partial functions that do not terminate, are unsuitable for executable semantics of TFM, where a two-valued predicate logic that may speak about definedness is used in function definitions (Parnas, 1993), but descriptions must be decidable in order to be useful.

Turner has made a similar remark with regard to

the algebraic analysis of functional programs:

The existing model of functional program-

ming, although elegant and powerful, is com-

promised to a greater extent than is commonly

recognised by the presence of partial func-

tions. (Turner, 2004)

As a corollary, an implementation of TFM in a

Turing-complete (partial) functional language such as

ML is not a simple matter of writing down the equa-

tions in a different syntax. The mismatch between

partial and total functions pervades all formal reason-

ing, and one is hardly better off in terms of correct-

ness than, say, with a C++ implementation. When

full-scale veriﬁcation of closed programs is out of

the question, such as in the proposed agile derivation of oracles, simulators and prototypes, a disciplined approach to code generation is needed. We shall demonstrate that appropriate constraints can be calculated from recursion-theoretic investigations; the choice of the actual back-end language is secondary.

Turner’s conclusion is to focus on the special case

of total recursive functions by way of theoretically

informed, disciplined function deﬁnition styles. Un-

fortunately, primitive recursion on the natural num-

bers, the subject of the classical total theory, is a

natural format for a small class of functions only.

Some other functions can be encoded in obfuscated

ways such as Gödel coding, while many cannot, the

most famous example being the Ackermann function.

Hence the strengthening of both recursion schemes

and datatypes has been given much attention. An im-

portant case in point is the Bird–Meertens formalism

(Bird and de Moor, 1997), also known derogatorily as

Squiggol, that focuses on recursion schemes for list-

like datatypes. It is a useful tool in the analysis and

veriﬁcation of recursive functional algorithms, and in

the calculation of correct-by-construction functional

programs. As such it serves as the role model for our

present investigations of TFM.

A fairly general theory of recursive functions with

a recursion scheme able to accommodate TFM has

been given in terms of universal categorial algebra

and coalgebra (Uustalu and Vene, 1999). Since only a

special case is needed for the present discussion, the

relevant notions are summarized in this section in a

less general, but more accessible form. No knowl-

edge of category theory is presupposed. Novel and

TFM-speciﬁc results are given in the following sec-

tions.

Universal algebra, the mathematical foundation of

algebraic speciﬁcation, is based on three notions:

• a signature introduces operations with speciﬁed

argument numbers and types,

• algebras are set-theoretic models of signatures,

realizing the operations as functions on some car-

rier set,

• homomorphisms are functions between the carrier

sets of two algebras that are compatible with the

realizations of operations in either algebra.

It is quite natural to represent universal algebra

in category theory, due to the observation that signa-

tures and homomorphisms can be deﬁned simultane-

ously as a functor on the category of sets. Such a

functor is a mapping T that takes any set A to an-

other set T (A) and each total function f : A → B

to another total function T ( f ) : T (A) → T (B), with

the side conditions that identity functions are taken to identity functions, and compositions to compositions: $T(\mathrm{id}_A) = \mathrm{id}_{T(A)}$ and $T(g \circ f) = T(g) \circ T(f)$, where $(g \circ f)(x) = g(f(x))$.



2.1 Algebras and Iteration

For a functor T that expresses a signature, the T-algebras are pairs (A, α) of a carrier set A and a combined realization of operations α : T(A) → A. A T-algebra homomorphism between two algebras (A, α) and (B, β) is a function h : A → B such that h ◦ α = β ◦ T(h). The well-known result that the term algebra of a signature has a unique homomorphism to any other algebra, understood as bottom-up term evaluation, is rephrased categorially as an initial object in the category of T-algebras: There is a T-algebra $(\mu T, \mathrm{in}_T)$ such that a unique homomorphism $(|\alpha|)_T$ to any other T-algebra (A, α) can be found. Lambek's Lemma states that $\mathrm{in}_T$ is bijective, that is, µT is a fixpoint of the domain equation $T(X) \cong X$, and in fact the least one.

As our first example, consider the Peano signature of natural numbers. It is traditionally given as two operations zero (nullary) and succ (unary). The same can be expressed by a functor N that takes a set A to the set 1 + A, the disjoint union of a singleton set 1 and A. We write $\ast$ for the injection of the element of 1 and $\iota(a)$ for the injection of an element $a \in A$. The function part of the functor is given by $N(f)(\ast) = \ast$ and $N(f)(\iota(x)) = \iota(f(x))$.

The obvious interpretation as natural numbers with the number zero and the successor function can be turned into an initial N-algebra, by virtue of Peano's axioms: $\mu N = \mathbb{N}$, $\mathrm{in}(\ast) = 0$ and $\mathrm{in}(\iota(n)) = n + 1$. For any N-algebra (A, α), the function α : 1 + A → A can be decomposed into a constant z ∈ A such that $\alpha(\ast) = z$, and a function s : A → A such that $\alpha(\iota(a)) = s(a)$. The unique homomorphism $(|\alpha|) : \mathbb{N} \to A$ satisfies

$$(|\alpha|)(0) = (|\alpha|)(\mathrm{in}(\ast)) = \alpha(N(|\alpha|)(\ast)) = \alpha(\ast) = z$$

and

$$(|\alpha|)(n + 1) = (|\alpha|)(\mathrm{in}(\iota(n))) = \alpha(N(|\alpha|)(\iota(n))) = \alpha(\iota((|\alpha|)(n))) = s((|\alpha|)(n)).$$

It can then be shown that (|α|) computes the iteration of s starting from z: $(|\alpha|)(n) = s^{n}(z)$. For instance, for the N-algebra $(\mathbb{N}, \alpha)$ with $\alpha(\ast) = 1$ and $\alpha(\iota(n)) = 2n$, the homomorphism (|α|) computes the powers of two.

In imperative programming languages, iterations

feature as the semantics of certain side-effect free

loops: z, s and n denote the initial state, loop body

effect (as a state update) and number of iterations, re-

spectively. The intended result is the ﬁnal state.
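The loop reading of iteration can be made concrete with a short Python sketch. The function name `iterate` and the helper `powers_of_two` are illustrative choices, not notation from the text; the N-algebra is passed as its two components z and s.

```python
from typing import Callable, TypeVar

A = TypeVar("A")


def iterate(z: A, s: Callable[[A], A], n: int) -> A:
    """Apply s to z exactly n times, i.e. compute s^n(z)."""
    state = z                 # initial state
    for _ in range(n):        # n is the number of iterations
        state = s(state)      # loop body as a pure state update
    return state              # the final state is the result


def powers_of_two(n: int) -> int:
    """The example algebra: z = 1 and s(a) = 2a."""
    return iterate(1, lambda a: 2 * a, n)
```

The loop terminates for every natural number n, which reflects the totality of functions defined by iteration.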

2.2 Coalgebras and Coiteration

As usual in category theory, the dual of a construction, obtained by reversing arrows, is also studied. The dual of a T-algebra is a T-coalgebra, a pair (C, γ) of a carrier set C and a realization of operations γ : C → T(C). A T-coalgebra homomorphism h between T-coalgebras (C, γ) and (D, δ) is a function h : C → D such that T(h) ◦ γ = δ ◦ h. The dual of an initial algebra is a final coalgebra: There is a T-coalgebra $(\nu T, \mathrm{out}_T)$ such that a unique homomorphism $[(\gamma)]_T$ from any T-coalgebra (C, γ) can be found. Dually to Lambek's Lemma, $\mathrm{out}_T$ is also bijective, and νT is the greatest fixpoint of $T(X) \cong X$.

(Footnote: The "banana" bracket is due to (Meijer et al., 1991).)

(Footnote: Subscripts indicating the functor will be dropped where no confusion arises.)

The final coalgebra view leads to an alternative interpretation of the Peano signature: N-coalgebras with realizations of operations of the type γ : C → 1 + C can be understood as partial functions $\gamma : C \rightharpoonup C$, where a result of $\ast$, the element of 1, or $\iota(n)$, the injection of a value n, denotes undefinedness or the value n, respectively. The set $\overline{\mathbb{N}} = \mathbb{N} + \{\infty\}$ and the predecessor function can be turned into a final N-coalgebra: $\nu N = \overline{\mathbb{N}}$, $\mathrm{out}(0) = \ast$, $\mathrm{out}(n + 1) = \iota(n)$ and $\mathrm{out}(\infty) = \iota(\infty)$. The unique homomorphism $[(\gamma)] : C \to \overline{\mathbb{N}}$ satisfies $[(\gamma)](x) = 0$ if $\gamma(x) = \ast$, and $[(\gamma)](x) = [(\gamma)](\gamma(x)) + 1$ otherwise, with the special case $[(\gamma)](x) = \infty$ if γ can be iterated indefinitely on x. It can then be shown that [(γ)] computes the coiteration of γ: $[(\gamma)](c) = \min\{n \in \mathbb{N} \mid \gamma^{n+1}(c) \text{ is undefined}\}$, where powers of γ are obtained by strict composition of partial functions, and $\min \emptyset = \infty$. For instance, the iterated logarithm $\log^{\ast}$, as defined in algorithmic complexity theory, is the coiteration of the N-coalgebra $(\mathbb{R}, \log)$, where the function $\log x$ is restricted to arguments x > 1.

In imperative programming languages, coitera-

tions feature as the semantics of certain side-effect

free loops: The partial function γ denotes a combined

state update and loop post-condition, starting from

initial state c. The intended result is the number of

iterations until the condition fails, with the possibility

of non-termination.
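A minimal Python sketch of coiteration as a counting loop follows. The names `coiterate`, `log2_restricted` and `log_star` are illustrative; undefinedness of the partial function is modeled as `None`, and the base-2 logarithm is an assumption, since the text does not fix a base.

```python
import math
from typing import Callable, Optional, TypeVar

C = TypeVar("C")


def coiterate(gamma: Callable[[C], Optional[C]], c: C) -> int:
    """Count how often the partial function gamma can be applied from c.

    gamma returns None where it is undefined. If gamma can be applied
    indefinitely, this loop does not terminate, which corresponds to
    the result "infinity" in the final coalgebra.
    """
    steps = 0
    current = gamma(c)
    while current is not None:
        steps += 1
        current = gamma(current)
    return steps


def log2_restricted(x: float) -> Optional[float]:
    """Logarithm to base 2, defined only for arguments x > 1."""
    return math.log2(x) if x > 1 else None


def log_star(x: float) -> int:
    """Iterated logarithm as the coiteration of log2_restricted."""
    return coiterate(log2_restricted, x)
```

For example, `log_star(16)` counts the three steps 16, 4, 2, 1, after which the restricted logarithm is undefined.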

2.3 From Numbers to Lists

The (finite or infinite) lists of elements from a set A are traditionally structured using the operations $\mathrm{nil}_A$ and $\mathrm{cons}_A$. We abbreviate $\mathrm{nil}_A$ to ε and $\mathrm{cons}_A(a, \ell)$ to $a \cdot \ell$, respectively.

The corresponding functor is $L_A(X) = 1 + (A \times X)$ on sets, and $L_A(f)(\ast) = \ast$ and $L_A(f)(a, \ell) = (a, f(\ell))$ on functions, where $\ast$ denotes the element of 1. The finite lists constitute an initial algebra: $\mu L_A = A^{\ast}$ with $\mathrm{in}_A(\ast) = \varepsilon$ and $\mathrm{in}_A(a, \ell) = a \cdot \ell$. Dually, the finite and infinite lists constitute a final coalgebra: $\nu L_A = A^{\infty} = A^{\ast} \cup A^{\omega}$ with $\mathrm{out}_A(\varepsilon) = \ast$ and $\mathrm{out}_A(a \cdot \ell) = (a, \ell)$.

The similar functor $K_A(X) = A \times X$ is not very interesting from the algebraic viewpoint, because its initial algebra is empty. By contrast, the infinite lists constitute a final coalgebra: $\nu K_A = A^{\omega}$ with $\mathrm{out}_A(a \cdot \ell) = (a, \ell)$.

For numerous examples of iterative and coitera-

tive functions on lists, their analysis and synthesis, see

(Bird and de Moor, 1997).

TheRecursionSchemeoftheTraceFunctionMethod

149

2.4 Generalizations

Iteration can be understood as a technique for the inductive definition of a function with initial algebra domain, $(|\alpha|)_T : \mu T \to A$, by giving a T-algebra (A, α)

that deﬁnes one recursive step of the function. In the

signature functor T , base and induction cases are dealt

with together. The technique can easily be seen to

subsume, for the Peano functor N, classical complete

induction. The deﬁning form α(a) is allowed to use

only the recursive function results for immediate sub-

arguments, a − 1 in the case of N.

Not all functions are conveniently presented in

this form, though: for instance, the factorial function

fac(n) requires both the recursive result fac(n − 1)

and the original argument n, and the Fibonacci func-

tion ﬁb(n) requires two recursive results, ﬁb(n − 1)

and ﬁb(n − 2). Consequently, more complicated pat-

terns of recursion than the one catered for by iteration

are also studied. Recursive functions that work like

fac are handled with the primitive recursion pattern

which, for the Peano functor, is the classical form of

total recursion theory. Recursive functions that work

like ﬁb, or more generally, require recursive function

results for arbitrary simpler arguments, are handled

with the course-of-value iteration pattern, which we

shall focus on for the following investigations.

Note that, in a setting with arbitrary datatypes de-

ﬁnable as functors, all total recursion schemes are es-

sentially equivalent in the sense that they can simu-

late each other in extended datatypes. Note also that

higher-order functions, operating on datatypes con-

taining computable functions, are strictly more pow-

erful than the ﬁrst-order functions, which are dealt

with comprehensively in the classical theory via natural numbers and Gödel encoding. For instance, the archetypical example of a computable function that is not primitive recursive in the classical sense, the Ackermann function ack, turns out to be not just primitive recursive, but even iterative, in a higher-order setting: Consider the higher-order N-algebra $([\mathbb{N} \to \mathbb{N}], \alpha)$ over the space of iterative functions on $\mathbb{N}$, where, writing $\ast$ and $\iota(\cdot)$ for the two injections into a sum 1 + A, $\alpha(\ast)(n) = n + 1$ and $\alpha(\iota(f)) = (|\beta_f|)$, referring to the f-indexed family of nested first-order N-algebras $(\mathbb{N}, \beta_f)$ where $\beta_f(\ast) = f(1)$ and $\beta_f(\iota(n)) = f(n)$. To verify that $\mathrm{ack}(m, n) = (|\alpha|)(m)(n)$ holds is left as an exercise to the reader.
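The higher-order construction can be checked experimentally with a small Python sketch. The helper `iterate` and the names `succ`, `step` and `ack` are illustrative; `step` plays the role of the successor case of α, and the inner call to `iterate` is the nested first-order iteration with base value f(1) and step function f.

```python
from typing import Callable, TypeVar

A = TypeVar("A")
Fn = Callable[[int], int]


def iterate(z: A, s: Callable[[A], A], n: int) -> A:
    """Ordinary iteration: apply s to z exactly n times."""
    state = z
    for _ in range(n):
        state = s(state)
    return state


def succ(n: int) -> int:
    """Base case of the higher-order algebra: the successor function."""
    return n + 1


def step(f: Fn) -> Fn:
    """Successor case: turn ack(m, .) into ack(m + 1, .)."""
    return lambda n: iterate(f(1), f, n)


def ack(m: int, n: int) -> int:
    """Ackermann function as an iteration over functions on naturals."""
    return iterate(succ, step, m)(n)
```

Only loops of known length occur, so every call terminates, although the running time grows extremely fast in m.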

2.5 Course-of-Value Iteration

The categorial form of course-of-value (cov) iteration (Uustalu and Vene, 1999) manages recursive function results by considering, besides the functor T whose initial algebra is the intended function domain, an extended functor $T^{\times}_C$, defined as $T^{\times}_C(X) = C \times T(X)$ and $T^{\times}_C(f) = \mathrm{id}_C \times T(f)$, where C is the intended function range. It can be understood as specifying the same algebraic signature as T, but with an additional annotation of a value from C. The intuition is that T specifies one level of structure of a function argument, and $T^{\times}_C$ pairs that with the function result. Then $\nu T^{\times}_C$ is the space of arbitrarily (even infinitely) deeply nested function arguments with results annotated at all levels of nesting, and $T(\nu T^{\times}_C)$ is the same, except that the result for the top level is missing. The whole point of cov iteration is to fill that hole. The cov induction principle can hence be put as follows:

For any set C and generator $\varphi : T(\nu T^{\times}_C) \to C$, there is a unique recursive function $\{|\varphi|\}_T : \mu T \to C$.

This terse statement is much illuminated by an example. For the Peano functor N, the extension $N^{\times}_C$ looks very similar to the list functor $L_C$, except that the structure elements (C×) and (1+) are exchanged. Consequently, the final $N^{\times}_C$-coalgebra is almost the same, except that the empty list is excluded, making deconstruction total: $\nu N^{\times}_C = C^{+\infty} = C^{+} \cup C^{\omega}$ and $\mathrm{out}(c \cdot \ell) = (c, \ell)$. The set $N(\nu N^{\times}_C)$ can then be understood as re-including the empty list, denoted by ε: $N(\nu N^{\times}_C) \cong C^{\infty}$. We take the liberty to implicitly conflate the two sets. Now let $C = \mathbb{N}$ and consider the generator defined as $\varphi(\varepsilon) = 0$, $\varphi(a \cdot \varepsilon) = 1$ and $\varphi(a \cdot b \cdot \ell) = a + b$ for any tail $\ell$. This defines the Fibonacci function: $\mathrm{fib} = \{|\varphi|\}_N$.
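A straightforward evaluation strategy for cov iteration over the Peano functor can be sketched in Python as follows. The names `cov_iterate` and `fib_generator` are illustrative; the history of results is kept as a list with the most recent result first, so the generator sees the results for n - 1, n - 2, and so on down to 0.

```python
from typing import Callable, List


def cov_iterate(phi: Callable[[List[int]], int], n: int) -> int:
    """Compute the cov iteration of generator phi at argument n.

    Results are produced oldest first and prepended to the history,
    so phi always receives all previous results, most recent first.
    """
    history: List[int] = []
    for _ in range(n):
        history.insert(0, phi(history))
    return phi(history)


def fib_generator(history: List[int]) -> int:
    """Generator with two base cases, as in the Fibonacci example."""
    if len(history) == 0:
        return 0
    if len(history) == 1:
        return 1
    return history[0] + history[1]   # sum of the two preceding results


def fib(n: int) -> int:
    return cov_iterate(fib_generator, n)
```

The generator itself contains no recursion; all recursive bookkeeping is done once and for all by `cov_iterate`.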

Compared with ordinary iteration, cov functions often have multiple base cases, because their dependence on results for subarguments may exceed the nesting depth of the argument given. For the Fibonacci function, each value depends on the two preceding ones, so base cases for arguments less than two are required. The presentation can often be simplified, and all base cases handled with a single mathematical object, by specifying a default, infinite surrogate history that is appended to any finite input to the cov generator. Formally, the generator $\varphi : C^{\infty} \to C$ can be decomposed into an infinite list $h \in C^{\omega}$ and a generator without base cases $\bar{\varphi} : C^{\omega} \to C$, such that $\varphi = \bar{\varphi} \circ \mathrm{append}(h)$. For some cov iterations, especially time-invertible ones, there is a natural candidate for h that eliminates the base cases completely. For the Fibonacci function, set $\bar{\varphi}(a \cdot b \cdot \ell) = a + b$ and $h = (+1) \cdot (-1) \cdots$; surrogate elements after the second are never used and hence arbitrary. Where no such natural surrogate history can be found, add a distinguished "end" element to C and handle base cases internally. In either case, we may assume without loss of generality that cov iterators for the functor N are functions of infinite lists.

(Footnote: The exact relation of {|·|} to unique (co)algebra homomorphisms is technically involved and out of scope here.)
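The surrogate history idea can be sketched in Python with lazy iterators standing in for infinite lists. All names (`surrogate`, `phi_bar`, `cov_iterate_padded`) are illustrative, and the zeros after the second surrogate element are an arbitrary choice, since those elements are never inspected.

```python
from itertools import chain, islice
from typing import Callable, Iterator, List


def surrogate() -> Iterator[int]:
    """Infinite default history h = (+1), (-1), then arbitrary values."""
    yield 1
    yield -1
    while True:
        yield 0


def phi_bar(history: Iterator[int]) -> int:
    """Generator without base cases: add the two most recent entries."""
    a, b = islice(history, 2)
    return a + b


def cov_iterate_padded(
    gen: Callable[[Iterator[int]], int],
    default: Callable[[], Iterator[int]],
    n: int,
) -> int:
    """Cov iteration where every finite history is extended by default()."""
    results: List[int] = []                     # most recent first
    for _ in range(n + 1):
        padded = chain(results, default())      # append the surrogate
        results.insert(0, gen(padded))
    return results[0]
```

With this padding, `cov_iterate_padded(phi_bar, surrogate, n)` yields the n-th Fibonacci number without any case distinction inside the generator.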

If a list functor $L_A$ is used in the first place instead of N, we obtain $\nu (L_A)^{\times}_C = (C \times A)^{+\infty}$ and $L_A(\nu (L_A)^{\times}_C) = 1 + (A \times (C \times A)^{+\infty})$. Hence $L_A(\nu (L_A)^{\times}_C)$ corresponds to finite or infinite (C × A)-lists where the C-part of the first element, if any, is missing.

3 APPLICATION TO TFM

3.1 Single Trace: First-order Iteration

Traces in TFM are lists of interface events, most recent events first, each consisting of input and output part, formalized as elements of sets I and O, respectively. Hence complete traces are elements of $(O \times I)^{\ast}$ in the finite case, or $(O \times I)^{\infty}$ more generally. A trace function computes the output for the current input, and may depend on previous outputs, but of course not cyclically on the current output. Hence the arguments to trace functions are incomplete traces: either the trace is empty, or the first (current) element has no O-part. Hence trace function definitions φ match the type of cov generators: $\varphi : L_I(\nu (L_I)^{\times}_O) \to O$, giving rise to behavior functions $\{|\varphi|\}_{L_I} : I^{\ast} \to O$ that map a complete history of inputs to the current output, computing previous outputs recursively behind the scenes.

Figure 2 shows the cov version of the multiset

component description. The differences with respect

to Figure 1 are subtle but significant: The type of bag is now such that the cov iteration {|bag|} of the generator in Figure 2 equals the function bag of Figure 1, which is the desired trace function. By virtue of the cov iteration operator, no explicit recursion is necessary. The auxiliary reduction operation does not merely reduce the trace to be used as a recursive argument to bag, but may directly retrieve the results interspersed in the trace structure.

The ﬁrst-order cov iterative form of trace func-

tions already settles some of our research questions:

The existence and uniqueness of solutions of recur-

sive equations in TFM format are guaranteed. Fur-

thermore, the functions thus deﬁned are total and

computable, with no regard to efﬁciency, by straight-

forward evaluation algorithms. The distinction be-

tween factual and counterfactual pairings of input and

output is meaningful: Only the factual ones are rel-

evant to the function to be deﬁned, as evident from

the fact that the type O does not occur in the domain of $\{|\varphi|\} : I^{\ast} \to O$. From the perspective of

model parsimony, it seems best to disregard coun-

terfactual pairings completely, and adopt the skeptic

style, where past outputs are always computed recur-

sively and never retrieved.
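The first-order cov reading of a trace function can be sketched in Python. Figure 2 is not reproduced in this section, so the generator below is an assumption-laden illustration: the output of an event is taken to be the multiplicity of the named element after the event, the bound `B = 3` is arbitrary, and the names `cov_trace` and `bag_generator` are hypothetical. The generator receives an incomplete trace, split into the current input and the list of complete earlier events as (output, input) pairs, most recent first.

```python
from typing import Callable, List, Optional, Tuple

Input = Tuple[str, str]            # (operation p, element x)
Complete = Tuple[int, Input]       # (recorded output, input)
B = 3                              # hypothetical upper bound


def cov_trace(
    phi: Callable[[Optional[Input], List[Complete]], int],
    inputs: List[Input],
) -> int:
    """Map an input history (most recent first) to the current output."""
    if not inputs:
        return phi(None, [])               # empty incomplete trace
    complete: List[Complete] = []
    for i in reversed(inputs):             # fill in outputs, oldest first
        complete.insert(0, (phi(i, complete), i))
    return complete[0][0]


def bag_generator(current: Optional[Input], history: List[Complete]) -> int:
    """Generator without explicit recursion: past outputs are retrieved."""
    if current is None:
        return 0
    p, x = current
    if p == "clr":
        return 0
    previous = 0
    for out, (q, y) in history:            # most recent relevant event
        if q == "clr" or y == x:
            previous = out                 # retrieve the recorded output
            break
    if p == "inc":
        return min(previous + 1, B)
    if p == "dec":
        return max(previous - 1, 0)
    return previous                        # p == "cnt"
```

Because `cov_trace` only ever hands the generator outputs that it has computed itself, the generator can trust the recorded values, and counterfactual pairings never arise.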

3.2 Full Behavior: Higher-order

Iteration

The presentation of TFM in the preceding paragraphs

is adequate from a ﬁrst-order, relational viewpoint.

Nevertheless, it may be worthwhile to consider a dif-

ferent presentation from a higher-order, functional

viewpoint. Though technically much more ambitious,

it provides additional insights, in particular concern-

ing the relationship of TFM with other mathematical

models of history-dependent system behavior; see be-

low.

The first step is to separate the concerns of input and output, which are dealt with in logically different ways, and of the end of the trace, which are all conflated when a trace function is given as a cov generator $\varphi : L_I(\nu (L_I)^{\times}_O) \to O$. To that end, we borrow a trick from practical stream programming, and embed finite lists into infinite lists (streams) by assuming a distinguished element $\bot$ that occurs only and infinitely often at the end of streams (cf. surrogate histories above). Assume that I and O contain $\bot$. A trace function is then given as a generator $\psi : I^{\omega} \times O^{\omega} \to O$ that takes infinite backwards streams of inputs and outputs, respectively, where the inputs do include the current event but the outputs do not, and yields the current output. Since such streams will play a major role in the following, we call all infinite backwards streams histories, and additionally recent if they include the current value, and ancient if they do not. It is easy to see that ψ is a natural rearrangement with respect to φ (recall the end of section 2.5). No information is lost; traces are merely split into independent input and output histories, recent and ancient, respectively, and padded with $\bot$.

The second step is to observe also from the type of the iteration $\{|\varphi|\}_{L_I} : I^{\ast} \to O$ that two pieces of information are required simultaneously, namely the number of steps (the length of the input trace) and the actual input of events (the content of the input trace). By systematic rearrangement, a cov iteration equivalent to $\{|\varphi|\}_{L_I}$ but for the simpler functor N, thus only being recursive in the number of steps, can be constructed by "plugging" ψ into a generic construction F. The price for the simpler recursion signature is a more complicated carrier set, namely the space $[I^{\omega} \to O]$ of functions taking recent input histories to current outputs, called responses. Note that, since a trace function generally depends not only on input history but output history as well, it will be represented by a different response for every point in time marked by an event.

Higher-order cov iteration will be used to construct a recursive list of responses. As in the Ackermann example, we shall make use of the natural one-to-one correspondence between functions of types X × Y → Z and X → [Y → Z], respectively. As customary, we write curry(f) for the invertible translation of a function f from the former to the latter type. The generic construction F is specified in terms of two auxiliary operations.

The first auxiliary operation has the type $F_1 : Q \to O \times Q$ where $Q = [I^{\omega} \to O]^{\omega} \times I^{\omega}$. It maps pairs of recent response and input histories to triples of the corresponding current output, and ancient response and input histories; formally: $F_1(r \cdot R, i \cdot I) = (r(i \cdot I), (R, I))$. Note that $(Q, F_1)$ has the form of a $K_O$-coalgebra. The unique homomorphism $[(F_1)]_{K_O} : Q \to O^{\omega}$ maps pairs of (recent/ancient) response and input histories to the corresponding output history. The higher-order function $\mathrm{curry}([(F_1)]) : [I^{\omega} \to O]^{\omega} \to [I^{\omega} \to O^{\omega}]$ then maps response histories to the corresponding black-box behavior of the system, namely functions from input history to output history.

The second auxiliary operation has the type $F_2(\psi) : [I^{\omega} \to O^{\omega}] \times I^{\omega} \to O$. It maps pairs of previous black-box behavior and recent input history to current output, by passing recent input history and ancient output history, obtained by applying the behavior to ancient input history, to the generator ψ; formally: $F_2(\psi)(b, i \cdot I) = \psi(i \cdot I, b(I))$. The higher-order function $\mathrm{curry}(F_2(\psi)) : [I^{\omega} \to O^{\omega}] \to [I^{\omega} \to O]$ then maps the previous black-box behavior to the current response.

In synopsis, the composition $F(\psi) = \mathrm{curry}(F_2(\psi)) \circ \mathrm{curry}([(F_1)])$ maps ancient response histories to the current response. It can be proven equivalent with the first-order variant: $\{|\varphi|\}_{L_I} \circ \mathrm{take}(n) = \{|F(\psi) \circ \mathrm{pad}|\}_N(n)$, where take(n) discards all but the first n elements of history.

Figure 3 shows the higher-order cov version of the

multiset example.
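The interplay of responses and black-box behavior can be illustrated by a small Python sketch. It is a simplification under stated assumptions, not the construction itself: histories are finite Python sequences with the most recent element first (instead of padded infinite histories), and the names `behavior`, `next_response`, `run_trace` and the example generator `psi_running_sum` are hypothetical. Evaluation is deliberately naive, so its cost grows exponentially with the trace length.

```python
from typing import Callable, List, Sequence, Tuple

# A response maps a recent input history (current event first) to the current output.
Response = Callable[[Sequence], object]


def behavior(responses: Tuple[Response, ...]) -> Callable[[Sequence], List]:
    """Turn a response history into black-box behavior (inputs -> outputs).

    responses[k] is applied to the input history with its k most recent
    events dropped, so the result is the output history, most recent first.
    """
    def run(inputs: Sequence) -> List:
        return [r(inputs[k:]) for k, r in enumerate(responses) if k < len(inputs)]
    return run


def next_response(psi, responses: Tuple[Response, ...]) -> Response:
    """Map the ancient response history to the current response."""
    previous = behavior(responses)

    def response(inputs: Sequence):
        # psi receives the recent inputs and the ancient outputs, the latter
        # obtained by running the previous behavior on the ancient inputs.
        return psi(inputs, previous(inputs[1:]))
    return response


def run_trace(psi, events: Sequence) -> List:
    """Feed events (chronological order) and collect outputs (chronological order)."""
    responses: Tuple[Response, ...] = ()   # most recent first
    outputs = []
    for n in range(len(events)):
        r = next_response(psi, responses)
        recent = list(reversed(events[: n + 1]))
        outputs.append(r(recent))
        responses = (r,) + responses
    return outputs


def psi_running_sum(recent_inputs, ancient_outputs):
    """Hypothetical generator: current input plus the previous output (0 if none)."""
    last = ancient_outputs[0] if ancient_outputs else 0
    return recent_inputs[0] + last
```

For example, `run_trace(psi_running_sum, [1, 2, 3, 4])` accumulates the inputs step by step; each response is a closure over all earlier responses, which is what makes the carrier set a function space.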

The higher-order cov iteration form reveals a strik-

ing similarity (that has not been noticed before) be-

tween TFM and a class of stochastic models of sys-

tem behavior that is immensely popular in time se-

ries modeling: The auto-regressive moving average

(ARMA) models (Box and Jenkins, 1970) have a

common vector space (most often simply real num-

bers) as both input and output, and calculate current

output as a linear combination of recent input and an-

cient output history. Thus they subsume the Fibonacci

example (where the linear combination is simply ad-

dition of the two preceding outputs) and are in turn

subsumed by TFM (where the linearity assumption is

lifted).

ARMA models can be used directly as ﬁlters

that add auto-correlation to a signal in a controlled

way. Or they can be used inversely, estimating the

linear coefﬁcients from given output, assuming that

the input is purely random and of minimal variance.

ARMA models may handle input and output histories differently: the most common models have nonzero coefficients for the p most recent outputs and q most recent inputs, respectively, for finite ranks p and q. But models with genuinely infinite dependence on past outputs, effected by fractional differencing (Granger and Joyeux, 1980), are popular in economic and ecological applications (Montanari et al., 1997), because they exhibit the fractal properties often apparent in data from these areas.
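The finite-rank case can be made concrete with a short Python sketch of an ARMA-style filter. The function name `arma_filter`, the coefficient lists `ar` and `ma`, and the convention that missing history counts as zero are assumptions made for this illustration; they are not taken from the cited literature.

```python
def arma_filter(inputs, ar, ma):
    """Compute y[t] = sum_j ma[j] * x[t-j] + sum_i ar[i] * y[t-1-i].

    ar: coefficients for the p most recent outputs (most recent first).
    ma: coefficients for the q most recent inputs, starting with the current one.
    History before the start of the series is treated as zero (an assumption).
    """
    outputs = []
    for t in range(len(inputs)):
        acc = 0.0
        for j, b in enumerate(ma):
            if t - j >= 0:
                acc += b * inputs[t - j]
        for i, a in enumerate(ar):
            if t - 1 - i >= 0:
                acc += a * outputs[t - 1 - i]
        outputs.append(acc)
    return outputs


# Adding the two preceding outputs, driven by a unit impulse, yields
# Fibonacci numbers (shifted by one position under these conventions).
fib_like = arma_filter([1, 0, 0, 0, 0, 0], ar=[1, 1], ma=[1])

# A pure moving average (no dependence on past outputs).
smoothed = arma_filter([1, 2, 3], ar=[], ma=[0.5, 0.5])
```

Replacing the two linear sums by an arbitrary function of the same arguments gives the more general, non-linear dependence that a trace function may have.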

4 TOWARDS IMPLEMENTATIONS

Structured total recursion schemes establish the well-

deﬁnedness, uniqueness, totality and general com-

putability of a function. Naive function evaluation,

however, may not be appropriate to actually and efﬁ-

ciently compute values. Although the recursive form

of the Fibonacci function is featured in virtually ev-

ery textbook on recursive programming, it is widely

acknowledged that the recursive algorithm is not use-

ful at all in practice. A well-known equivalent algo-

rithm, that reduces complexity from exponential to

linear, can be given in terms of ordinary iteration or

a loop, and two state variables instead of one. Mathematically it is specified concisely by iterated matrix multiplication:

$$\mathit{fib}(n) = \begin{pmatrix} 1 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}^{n} \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$

The implementation in an actual programming language is left as an exercise to the reader.
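As one possible answer to that exercise, the following Python sketch contrasts the naive recursive definition with the two-variable loop. The function names `fib_naive` and `fib_iter` are chosen here for illustration, and the convention fib(0) = 0, fib(1) = 1 is assumed.

```python
def fib_naive(n: int) -> int:
    """Direct recursion on the two preceding results; exponentially many calls."""
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)


def fib_iter(n: int) -> int:
    """Ordinary iteration with two state variables, linear in n."""
    a, b = 0, 1               # invariant: a = fib(i), b = fib(i + 1)
    for _ in range(n):
        a, b = b, a + b       # one application of the matrix [[0, 1], [1, 1]]
    return a
```

Both functions agree on every argument; only the loop remains usable for large n.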

Some general questions arise: Is naive evaluation

generally inefficient for cov iterations? Can a more

efﬁcient, ordinary iteration be proven equivalent? If

so, can it be found automatically? The short answers

are, respectively: Yes, except for degenerate cases.

Yes, but it may have unbounded space requirements.

Yes, for certain disciplined patterns of recursive de-

pendency.

The theoretical results about implementations of

cov iteration discussed in the following section are

novel and are being published in a companion article (Trancón y Widemann, 2012).

4.1 Simulation by State Systems

For the discussion of simulation of cov iteration by

state systems and ordinary iteration, we consider the

functor N only for simplicity. The results then apply

ENASE2012-7thInternationalConferenceonEvaluationofNovelSoftwareApproachestoSoftwareEngineering

152

to the higher-order cov representation of TFM. The

function space lurking in the carrier set is not a tech-

nical problem: Object-level functions can either be

represented intensionally by code (there is no need,

for instance, to decide equality), or eliminated by de-

functionalization (Reynolds, 1972), a standard tech-

nique to transform functional program expressions to

ﬁrst order. Similar results could be achieved in prin-

ciple for the functor L

to handle the ﬁrst-order cov

representation of TFM directly.

Fix a function range C. A C-state system is a triple (S, σ, τ), where S is a state space, σ : N(νN_C) → S is called the abstraction function and τ : C × S → S is called the transition function. It is called an epi-state system if and only if σ is surjective. A state system is said to factor a cov generator ϕ : N(νN_C) → C if and only if two conditions hold: Firstly, there is a function ϕ̄ : S → C such that ϕ = ϕ̄ ◦ σ. For epi-state systems, ϕ̄ is determined uniquely. A state system can correctly simulate the cov iteration of a generator only if this condition holds; otherwise the state is too coarse to make all the relevant distinctions. Secondly, it is required that τ(ϕ(t), σ(t)) = σ(ϕ(t) · t) for every history t (note the similarity to homomorphism properties).

Whereas the first factoring condition is clearly necessary for simulation, it can be shown that adding the second one is sufficient. If the state system (S, σ, τ) factors the generator ϕ, then an ordinary iteration can be constructed from σ, τ and ϕ̄ that simulates {|ϕ|}: There is an N-algebra (C × S, ρ) such that for every number of steps n there is some final state s_! such that (|ρ|)(n) = ({|ϕ|}(n), s_!). The operation ρ is most concisely given in two parts ρ = ρ_2 ◦ ρ_1 with ρ_1(∗) = σ(ε) and ρ_1(c, s) = s, and ρ_2(s) = (ϕ̄(s), τ(ϕ̄(s), s)). The proof is too technical to be repeated here.
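The ordinary iteration induced by a factoring state system can be sketched in Python as follows. This is a minimal illustration under assumptions, not the construction from the companion article: histories are finite tuples with the most recent result first, `phi_bar` stands for the function on states through which the generator factors, and the example generator (one plus the sum of all previous results) together with its state system (the running sum as state) is hypothetical.

```python
def cov_iterate(phi, steps):
    """Reference semantics: apply the generator to the full history of results."""
    history = ()                     # most recent result first
    outputs = []
    for _ in range(steps):
        c = phi(history)
        outputs.append(c)
        history = (c,) + history
    return outputs


def simulate(initial_state, tau, phi_bar, steps):
    """Ordinary iteration: emit phi_bar(s), then move to tau(c, s)."""
    s = initial_state                # plays the role of sigma applied to the empty history
    outputs = []
    for _ in range(steps):
        c = phi_bar(s)
        outputs.append(c)
        s = tau(c, s)
    return outputs


# Hypothetical example: the abstraction function is the sum of the history.
phi = lambda history: 1 + sum(history)
phi_bar = lambda s: 1 + s            # phi = phi_bar after summing
tau = lambda c, s: s + c             # tau(phi(t), sum(t)) = sum(phi(t) . t)

reference = cov_iterate(phi, 6)
simulated = simulate(0, tau, phi_bar, 6)
```

Both factoring conditions hold for this example by construction, so the state-based loop reproduces the reference results while storing a single number instead of the whole history.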

4.2 Regular Course-of-Value Iteration

A cov generator ϕ is called k-regular, for some natural number k ≥ 0, if and only if it is determined completely by exactly k previous results; formally, there is a surrogate history h ∈ C^k and a function ϕ̂ : C^k → C such that ϕ = ϕ̂ ◦ take(k) ◦ append(h). Then {|ϕ|} can be simulated on constant space, namely with a first-in-first-out buffer of k elements of C as state: the state system (C^k, σ, τ) with σ = take(k) ◦ append(h) and τ(c_0, (c_1, …, c_k)) = (c_0, …, c_{k−1}) satisfies the conditions given above, with ϕ̂ as the factoring function. The proof involves straightforward manipulations of the iterative functions take and append and is left as an exercise to the reader. A real implementation would probably use a ring buffer, where the elements are not shifted, but remain stationary and are addressed modulo k. This is easily shown to be equivalent to the former representation.

Note on epi-state systems: the name derives from the categorial notion of epimorphisms, which coincide with surjective functions in the category of sets.

Note on ϕ̂: it is a function of finite tuples of uniform length, and hence the opposite of the “infinitarized” ϕ discussed above.
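The two buffer representations can be compared in a short Python sketch. The names `simulate_fifo`, `simulate_ring` and `phi_hat` (the function of the k most recent results) are hypothetical, and the 3-regular example, which sums the three preceding results starting from the surrogate history (1, 0, 0), is chosen only for illustration.

```python
def simulate_fifo(phi_hat, h, steps):
    """State is a tuple of the k most recent results, most recent first."""
    k = len(h)
    state = tuple(h)                       # surrogate history as initial state
    outputs = []
    for _ in range(steps):
        c = phi_hat(state)
        outputs.append(c)
        state = ((c,) + state)[:k]         # shift in the new result, drop the oldest
    return outputs


def simulate_ring(phi_hat, h, steps):
    """Same behavior, but elements stay in place and are addressed modulo k."""
    k = len(h)
    buf = list(h)                          # buf[(head + j) % k] is the j-th most recent result
    head = 0
    outputs = []
    for _ in range(steps):
        window = tuple(buf[(head + j) % k] for j in range(k))
        c = phi_hat(window)
        outputs.append(c)
        if k:                              # a 0-regular generator needs no storage
            head = (head - 1) % k          # the slot of the oldest result ...
            buf[head] = c                  # ... is overwritten by the newest
    return outputs


three_sum = lambda window: sum(window)     # hypothetical 3-regular generator
fifo_run = simulate_fifo(three_sum, (1, 0, 0), 5)
ring_run = simulate_ring(three_sum, (1, 0, 0), 5)
```

The ring buffer presents the same window to `phi_hat` at every step, which is the equivalence claimed in the text; it merely avoids copying k elements per step.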

The Fibonacci function is regular: set k = 2 and h = (+1, −1) to retrieve the efficient iterative algorithm, albeit in the slightly modified form

$$\mathit{fib}(n) = \begin{pmatrix} 1 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}^{n+2} \begin{pmatrix} -1 \\ +1 \end{pmatrix}$$

TFM descriptions that refer statically to the k most recent events are k-regular as well. TFM descriptions that refer to the most recent event fulfilling some predicate require more complicated state spaces. In the multiset example, an X-indexed family of the most recent multiplicity per element would do. Of course, a good programmer would readily come up with an implementation as an array or hash table. But it can also be inferred automatically from the recursive access pattern. Similar state constructions can be given for many of the access patterns of TFM, and organized in a “compiler” that realizes trace functions as ordinary iterations.
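For the multiset example, the X-indexed family can be sketched in Python with a dictionary as state. The function names `bag_step` and `run_bag`, the encoding of events as `(operation, element)` pairs with string tags, and the choice to represent absent elements by a default of 0 are assumptions of this illustration.

```python
def bag_step(state, event, bound):
    """Process one event; state maps each element to its most recent multiplicity."""
    op, x = event
    if op == "clr":
        state.clear()                  # every element's most recent multiplicity is 0 again
        return 0
    n = state.get(x, 0)
    if op == "inc":
        n = min(n + 1, bound)          # multiplicities saturate at the bound B
    elif op == "dec":
        n = max(n - 1, 0)
    elif op != "cnt":
        raise ValueError(f"unknown operation: {op!r}")
    state[x] = n
    return n


def run_bag(events, bound):
    """Outputs for a whole trace, given in chronological order."""
    state = {}
    return [bag_step(state, e, bound) for e in events]


demo = run_bag(
    [("inc", "a"), ("inc", "a"), ("inc", "b"), ("cnt", "a"),
     ("dec", "a"), ("clr", "b"), ("cnt", "a")],
    bound=5,
)
```

The state grows with the number of distinct elements seen, not with the length of the trace, which is the saving over storing the full history.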

4.3 Category of Implementations

The algebras for a functor T and their homomor-

phisms constitute a category. The initial T -algebra

is just the initial object in that category, and serves

as a generic model of syntax with respect to the sig-

nature T . Dually, the T -coalgebras and their ho-

momorphisms constitute another category. The ﬁnal

T -coalgebra is the ﬁnal object in that category, and

serves as a generic model of semantics with respect

to the signature T . This duality has been used, for

instance, to give structured semantics to abstract data

types (Erwig, 1998).

The state systems that factor a ﬁxed cov genera-

tor ϕ can also be organized as a category, with both

initial and ﬁnal objects, and analogous interpretations

as syntax and semantics, respectively. Abbreviate the function which the second factoring condition is about as δ(t) = τ(ϕ(t), σ(t)). A homomorphism between two state systems (S_1, σ_1, τ_1) and (S_2, σ_2, τ_2), both factoring a common ϕ, is a function h : S_1 → S_2 such that h ◦ σ_1 = σ_2 and h ◦ δ_1 = δ_2.

The initial object in the category of ϕ-factoring state systems is the trivial state system (N(νN_C) = C^∞, id, (·)). It qualifies as purely syntactical: Histories are taken at face value, no abstraction occurs, and the state transition merely accumulates results. It is also the most straightforward implementation, and already avoids the exponential blowup incurred by naive evaluation, since common recursive subcomputations are properly shared. But it clearly requires ever-growing amounts of space, since potentially irrelevant historical data is never discarded, and may hence not qualify as a viable implementation for all purposes.

Note on homomorphisms of state systems: in category jargon, this yields a double coslice category.

The ﬁnal object in the category of ϕ-factoring

state systems is the coimage system (Coim(ϕ),π,τ),

where Coim(ϕ) is the partitioning of N(νN

) into the

equivalence classes that are identiﬁed by ϕ, π is the

mapping of elements to their equivalence class, and

τ is a complicated construction that can be shown to

exist by the Axiom of Choice. This system qualiﬁes

as purely semantical: its states are canonical representatives

for the relevant historical information, but no

hint as to their practical encoding is given. Hence

the ﬁnal system is impractical as an implementation,

as the equivalence classes can be hard and inefﬁcient

to construct, but it completely eliminates all redun-

dancy in historical information, and it is therefore the

ideal, minimal model of behavior real implementa-

tions should aspire to.

5 CONCLUSIONS

We have demonstrated that descriptions in the style

of the trace function method, which take the form

of trace functions of a certain recursive form, are

amenable to structured total recursion theory. That

is, there is an alternative non-recursive form (a gen-

erator) that induces the desired recursive trace func-

tion as the unique solution of a certain homomor-

phism equation. This result should not be misread as

meaning TFM should be notated with generators. But

the mere check that a trace function could be gen-

erated in this way entails a number of beneﬁcial se-

mantic properties: First of all, generators are neces-

sarily consistent and complete, in the sense that one

and only one solution exists. The solution is also to-

tally deﬁned and computable, making a TFM speci-

ﬁcation executable, in the sense that straightforward

effective implementations exist, in the form of primitive recursive total functional programs, or equivalently a certain kind of loop, for the more imperative-minded.

The particular recursion scheme we have identi-

ﬁed as adequate for TFM, namely course-of-value it-

eration, elucidates the role of historical dependencies

in trace functions at a level of abstraction and with a precision that syntactic and ad-hoc algebraic arguments cannot attain. On the one hand, the apparent dilemma discussed in section 1.1 can be resolved: References

to recursive function results and to stored outputs in

traces are redundant, but not in a logical conﬂict.

The cov iteration approach separates the two cleanly;

stored values are retrieved in the generator view on

a trace function, and map consistently to recursive

results in the induced trace function. Counterfac-

tual events are guaranteed to be functionally irrele-

vant. On the other hand, TFM can be related precisely

to established history-aware modeling paradigms, for

instance the ARMA approach, considered state of

the art for data-driven modeling in diverse empirical

ﬁelds such as economics and hydrology.

Calculating implementations for TFM descriptions

that are practically and immediately usable as simula-

tors, test oracles or prototypes has so far posed difﬁ-

culties. We have demonstrated that most of the com-

plexity is due to the recursion scheme. Dealing with

that complexity in the appropriate theoretical frame-

work is certainly going to help. Finding adequate iter-

ative implementations can be understood theoretically

as movement along the initial–ﬁnal axis of factor-

ing state systems, driven by a tradeoff between cheap

state encoding (initial extreme) and strong compres-

sion of information (ﬁnal extreme). From a more tool-

oriented viewpoint, the classiﬁcation of recursive ac-

cess patterns in TFM, with respect to the costs of the

associated state system features, could lead to a lay-

ered deﬁnition of TFM basic vocabulary, enabling a

conscious tradeoff between power of expression and

performance of automatically derived correct imple-

mentations.

bag : I∗ → O          I = P × X
(▷) : I∗ × X → I∗     O = {0, …, B}

bag(ε) = 0
bag((cnt, x) · T) = bag(T ▷ x)
bag((inc, x) · T) = min(bag(T ▷ x) + 1, B)
bag((dec, x) · T) = max(bag(T ▷ x) − 1, 0)
bag((clr, x) · T) = 0

ε ▷ x = ε
((p, y) · T) ▷ x = (p, y) · T    if x = y or p = clr
((p, y) · T) ▷ x = T ▷ x         if x ≠ y and p ≠ clr

Figure 1: TFM-style multiset, recursively.
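The recursive definition of Figure 1 can be transcribed almost literally into Python. In this sketch, traces are tuples of `(operation, element)` pairs with the most recent event first, the helper name `restrict` stands in for the figure's auxiliary operator, and the string tags and the driver `bag_outputs` are assumptions made for illustration.

```python
def restrict(trace, x):
    """Drop leading events until one concerns x or is a clr; keep the rest."""
    for k, (p, y) in enumerate(trace):
        if y == x or p == "clr":
            return trace[k:]
    return ()


def bag(trace, bound):
    """Output at the most recent event of the trace, by recursion on the history."""
    if not trace:
        return 0
    (p, x), rest = trace[0], trace[1:]
    if p == "clr":
        return 0
    previous = bag(restrict(rest, x), bound)   # multiplicity of x before this event
    if p == "cnt":
        return previous
    if p == "inc":
        return min(previous + 1, bound)
    if p == "dec":
        return max(previous - 1, 0)
    raise ValueError(f"unknown operation: {p!r}")


def bag_outputs(events, bound):
    """Outputs after each event of a chronologically ordered sequence."""
    trace = ()
    outputs = []
    for e in events:
        trace = (e,) + trace                   # newest event goes to the front
        outputs.append(bag(trace, bound))
    return outputs


demo_outputs = bag_outputs(
    [("inc", "a"), ("inc", "a"), ("inc", "b"), ("cnt", "a"),
     ("dec", "a"), ("clr", "b"), ("cnt", "a")],
    bound=5,
)
```

Every call re-examines the stored trace, so the cost of answering one event grows with the length of the history; this is the behavior that the state-based implementations are meant to avoid.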

ENASE2012-7thInternationalConferenceonEvaluationofNovelSoftwareApproachestoSoftwareEngineering

154

bag : L(νL) → O       I = P × X
(▶) : I∗ × X → O      O = {0, …, B}

bag(ε) = 0
bag((cnt, x) · T) = T ▶ x
bag((inc, x) · T) = min((T ▶ x) + 1, B)
bag((dec, x) · T) = max((T ▶ x) − 1, 0)
bag((clr, x) · T) = 0

ε ▶ x = 0
((n, (p, y)) · T) ▶ x = n        if x = y or p = clr
((n, (p, y)) · T) ▶ x = T ▶ x    if x ≠ y and p ≠ clr

Figure 2: TFM-style multiset, first-order cov generator.

bag : I^ω × O^ω → O   I = P × X
(▷) : I∗ × X → O      O = {0, …, B}

bag(⊥ · I, O) = 0
bag((cnt, x) · I, O) = (I, O) ▷ x
bag((inc, x) · I, O) = min(((I, O) ▷ x) + 1, B)
bag((dec, x) · I, O) = max(((I, O) ▷ x) − 1, 0)
bag((clr, x) · I, O) = 0

(⊥ · I, O) ▷ x = 0
(I, ⊥ · O) ▷ x = 0
((p, y) · I, n · O) ▷ x = n             if x = y or p = clr
((p, y) · I, n · O) ▷ x = (I, O) ▷ x    if x ≠ y and p ≠ clr

Figure 3: TFM-style multiset, higher-order cov generator.

ACKNOWLEDGEMENTS

Parts of this work have been performed at the Soft-

ware Quality Research Laboratory, University of

Limerick, Ireland, supported by Science Foundation

Ireland under Grants 01/P1.2/C009 and 03/CE3/1405.

REFERENCES

Baber, R. L., Parnas, D. L., Vilkomir, S. A., Harrison, P.,

and O’Connor, T. (2005). Disciplined methods of soft-

ware speciﬁcation: A case study. In ITCC (2), pages

428–437. IEEE Computer Society.

Bird, R. and de Moor, O. (1997). Algebra of Programming,

volume 100 of International Series in Computing Sci-

ence. Prentice Hall.

Bonfante, G. (2011). Course of value distinguishes the in-

tentionality of programming languages. In Proceed-

ings 2nd Symposium on Information and Communica-

tion Technology (SoICT ’11), pages 189–198. ACM.

Box, G. E. and Jenkins, G. M. (1970). Time series analysis:

Forecasting and control. Holden–Day, San Francisco.

Erwig, M. (1998). Categorical programming with abstract

data types. In Haeberer, A. M., editor, AMAST, num-

ber 1548 in Lecture Notes in Computer Science, pages

406–421. Springer.

Granger, C. W. J. and Joyeux, R. (1980). An introduction

to long-memory time series models and fractional dif-

ferencing. Journal of Time Series Analysis, 1:15–30.

Liu, Z., Parnas, D. L., and Trancón y Widemann, B. (2010). Documenting and verifying systems assembled from components. Front. Comput. Sci. China, 4(2):151–161.

Meijer, E., Fokkinga, M. M., and Paterson, R. (1991). Func-

tional programming with bananas, lenses, envelopes

and barbed wire. In Hughes, J., editor, Proceedings

5th International Conference on Functional Program-

ming Languages and Computer Architecture (FPCA

1991), number 523 in Lecture Notes in Computer Sci-

ence, pages 124–144. Springer.

Montanari, A., Rosso, R., and Taqqu, M. S. (1997). Frac-

tionally differenced arima models applied to hydro-

logic time series: Identiﬁcation, estimation, and simu-

lation. Water Resources Research, 33(5):1035–1044.

Parnas, D. L. (1993). Predicate logic for software engineer-

ing. IEEE Trans. Softw. Eng., 19:856–862.

Parnas, D. L. (2009). Document based rational software

development. Knowledge-Based Systems, 22(3):132–

141.

Quinn, C., Vilkomir, S. A., Parnas, D. L., and Kostic, S.

(2006). Speciﬁcation of software component require-

ments using the trace function method. In Proceed-

ings International Conference on Software Engineer-

ing Advances (ICSEA 2006), page 50. IEEE Computer

Society.

Reynolds, J. (1972). Deﬁnitional interpreters for higher-

order programming languages. In Proceedings of

the ACM Annual Conference, pages 717–740, Boston,

Massachusetts.

Trancón y Widemann, B. (2012). State-based simulation of linear course-of-value iteration. In Proceedings 11th International Workshop on Coalgebraic Methods in Computer Science (CMCS 2012). Short contribution.

Trancón y Widemann, B. and Parnas, D. L. (2008). Tabular expressions and total functional programming. In Chitil, O., Horváth, Z., and Zsók, V., editors, Implementation and Application of Functional Languages (IFL 2007), Revised Selected Papers, number 5083 in Lecture Notes in Computer Science, pages 219–236. Springer.

Turner, D. A. (2004). Total functional programming. Jour-

nal of Universal Computer Science, 10(7):751–768.

Uustalu, T. and Vene, V. (1999). Primitive (co)recursion and

course-of-value (co)iteration, categorically. Informat-

ica, 10(1):5–26.

TheRecursionSchemeoftheTraceFunctionMethod

155