The Recursion Scheme of the Trace Function Method
Baltasar Trancón y Widemann
Department of Computer Science, University of Bayreuth, 95440 Bayreuth, Germany
Keywords:
Formal Method, Trace Function Method, Executable Semantics, Recursion Theory, State System.
Abstract:
The Trace Function Method (TFM) is a fundamental approach to the description of system behavior for re-
quirements analysis, specification, and documentation. External behavior of systems or components is given in
mathematically direct form, but with full abstraction from internal state, by defining output at discrete interface
events as recursive functions of the complete history of previous interaction at the same interface, including
both input and output. In order to understand and evaluate the semantics of the notation, and in particular the
executable semantics, that is, the potential for automatic simulation and construction of prototype implemen-
tations from TFM descriptions, a recursion-theoretic analysis is given. It is demonstrated that a single run and
the full reactive behavior of a TFM description can be presented as instances of first-order and higher-order
course-of-value iteration, respectively. A simple sufficient condition for correct implementations of TFM de-
scriptions in terms of state systems is given. The spectrum of possible state-based implementations of a TFM
description, ranging from straightforward simulation to minimized state space, is explored. Implications for
semantically calculated and hence formally verifiable prototype implementations are summarized.
1 INTRODUCTION
The trace function method (TFM), due to Parnas, is a
fundamental approach to the description of system be-
havior. Its applications in software engineering range
from requirements analysis (partial descriptions of re-
quired behavior) (Quinn et al., 2006), via design spec-
ification (complete descriptions of required behavior)
(Baber et al., 2005; Liu et al., 2010) and simulation
(Trancón y Widemann and Parnas, 2008) to documentation
(descriptions of observed behavior at various
levels of completeness and abstraction). As such, it
is an integral part of the rational development process
envisaged by (Parnas, 2009).
TFM abstracts fully from the internal structure
and state of a system or component. All described
behavior consists of discrete events at an interface. In-
teraction on events is specified by valuations of inter-
face variables. Every interface variable is either input
(controlled by the environment) or output (controlled
by the system). The complete relevant history of be-
havior at the interface, organized as a list of events,
most recent event first,¹ is called a trace. A trace function
is a total function² that maps traces to output
values. The behavior of the system is described completely
by giving a trace function for each output variable
of the interface.

¹ The original TFM style has the oldest event first. The
reversal reflects the recursive structure better and will be
justified in section 2.
A trace that contains both input and output serves
as precise documentation of a single instance of sys-
tem behavior, independently of the trace functions
that specify the general rules of behavior. A trace can
be validated by comparing recorded outputs in the
trace with specified outputs from trace functions; or
it can be reconstructed from its input only, by recur-
sively filling in specified outputs, oldest events first.
Thus trace functions can act as either test oracles,
or simulators or prototype implementations, respec-
tively. In either case, there is a logical redundancy
between historical records and abstract descriptions.
The present article is an explication of the mathemat-
ical consequences of that redundancy. It is necessary
to understand the way in which the redundancy is re-
solved, in order to give precise executable semantics
to TFM, and hence derive correct implementations in
a formally satisfactory manner.
² There is a relational extension of TFM that uses trace
relations to describe nondeterministic behavior, which is not
discussed here; the present arguments may be extended to
cover the relational case nevertheless.
Trancón y Widemann, B.
The Recursion Scheme of the Trace Function Method.
DOI: 10.5220/0003994501460155
In Proceedings of the 7th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE-2012), pages 146-155
ISBN: 978-989-8565-13-6
Copyright © 2012 SCITEPRESS (Science and Technology Publications, Lda.)
1.1 The Dilemma of Historical Records
Formally, output values of past events are recorded
in a trace, even though they are redundant and can
in principle be calculated from the output trace func-
tions. That opens the syntactic possibility to have ar-
bitrary pairings of input and output in trace events,
not merely those that can actually be produced by the
system and make up the graphs of the output trace
functions. In philosophical terms, we may speak of
pairings that arise from the given output functions as
factual, and of those that do not as counterfactual.
Then the question arises how to deal with the distinc-
tion when defining an output function: Should one as-
sume the output values recorded in the trace to be fac-
tual, and retrieve them trustfully for subsequent com-
putations? Or should one rather give a “skeptic” defi-
nition that calculates previous outputs recursively? Is
there a semantic difference? Is there a difference in
terms of system implementation effort and efficiency?
The purpose of the present article is to use established
mathematical theory to answer these questions.
1.2 Motivating Example and Discussion
As a running example, consider the case of a compo-
nent with finite multiset functionality, keeping track
of multiplicities of elements of some finite set X. The
component supports four operations, namely count-
ing the multiplicity of a given element, increasing the
multiplicity of a given element by one up to an up-
per bound B, decreasing the multiplicity of a given
element by one if present, and setting the multiplic-
ity of all elements simultaneously to zero. Thus the
input part of each event can be specified by a pair
$(p, x) \in P \times X$, where $P = \{\mathrm{cnt}, \mathrm{inc}, \mathrm{dec}, \mathrm{clr}\}$ and
$x$ is irrelevant for $p = \mathrm{clr}$. The initial multiplicity of
any element is zero. The output part of each event
is an integer between zero and $B$. The TFM description
of the component is depicted in Figure 1.³ The
trace function takes the “skeptic” approach discussed
above, and assumes only input values are present in
a trace. The variable T ranges over traces. The con-
stant ε denotes the empty trace; an event prepended to
a trace is denoted as $e \cdot T$ (cf. section 2.3). The auxiliary
operation $T \downarrow x$ removes from a trace $T$ the prefix
of all events not concerning element x. Thus the result
is always a sublist, or tail, of T.
The trace function bag depends recursively on
its own result for some strictly reduced arguments,
although the amount of reduction depends nontrivially
and dynamically on the pattern of usage. But
apart from the recursion scheme, only very elementary
mathematics is used.⁴ Intuitively, there should
be a unique function bag that solves these recursive
equations, and there should be a straightforward algorithm
which effectively computes that function, perhaps
not with optimal efficiency, but viable as a simulation,
rapid prototype or test oracle.

³ Figures are grouped on the last page for comparison.
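Since Figure 1 is not reproduced in this part of the text, the following Python sketch is only one plausible reading of the "skeptic" multiset description, not the author's definition. The bound `B = 3`, the helper names `concerns` and `restrict`, the output 0 for `clr`, and the convention that every event outputs the resulting multiplicity of its element are assumptions made for illustration. Traces are tuples of input pairs `(p, x)`, most recent event first.

```python
B = 3  # assumed upper bound on multiplicities


def concerns(event, x):
    """Assumption: an event concerns x if it addresses x or clears everything."""
    p, y = event
    return p == "clr" or y == x


def restrict(trace, x):
    """Drop the leading (most recent) events that do not concern x."""
    for k, event in enumerate(trace):
        if concerns(event, x):
            return trace[k:]
    return ()


def bag(trace):
    """Output at the most recent event of an input-only trace."""
    if not trace:
        return 0  # initial multiplicity of any element
    (p, x), rest = trace[0], trace[1:]
    if p == "clr":
        return 0
    # Skeptic style: the previous output is recomputed, never looked up.
    before = bag(restrict(rest, x))
    if p == "inc":
        return min(before + 1, B)
    if p == "dec":
        return max(before - 1, 0)
    return before  # p == "cnt"


events = [("inc", "a"), ("inc", "a"), ("inc", "b"), ("dec", "a"), ("cnt", "a")]
trace = tuple(reversed(events))  # most recent event first
```

Every recursive call receives a strictly shorter trace, which is why the recursion terminates and the equations determine `bag` uniquely.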
1.3 Enter Recursion Theory
Since the expressive power of TFM comes from the
way past outputs are used recursively in the compu-
tation of present outputs, it seems only fair to turn
to recursion theory for the analysis of the method.
We shall demonstrate that the adequate scheme of re-
cursive function definition is known theoretically as
course-of-value (cov) iteration. The phrase “course
of value”, or “course of values”, can be traced back to
the works of Frege, where it is roughly synonymous
with “extension”.
The difference between cov iteration and the more
familiar scheme of primitive recursion is illustrated
by analogy with proof by induction: In order to prove
$P(n)$ for all natural numbers $n$, ordinary mathematical induction
uses a step of the form $P(n) \Rightarrow P(n+1)$ plus a
base case, often $P(1)$ or $P(0)$. An alternative, equivalent
method uses a step
$$\frac{P(k) \text{ for all } k < n}{P(n)}$$
instead. This method is often more convenient to
use, and generalizes to transfinite induction, where
it is called the Noetherian induction scheme. Prim-
itive recursion is similar to ordinary induction, in us-
ing just a single preceding instance to infer the next,
whereas cov iteration is free to use all preceding in-
stances. Analogous arguments regarding (unchanged)
absolute power and (improved) convenience apply.
The use of cov iteration as a theoretical tool for
“intentional” program analysis has been proposed re-
cently (Bonfante, 2011), although we currently re-
serve judgement on the relevance for our own investi-
gations.
Note that the auxiliary function is also recur-
sive, but of a simpler form that will be formalized
as ordinary iteration below. The presence of recur-
sion also in the way a trace function accesses events
is responsible for the complexity of the multiset ex-
ample. TFM employs a variety of access operations,
ranging from static (splitting a trace into most re-
cent event and rest), via semi-dynamic (splitting at
the most recent event from a pre-determined set) to
fully dynamic (splitting at the most recent event that
satisfies an arbitrary relation with current input; filtering
certain events from a trace). The choice of access
operations is made on a pragmatic, ad-hoc basis.
Below we shall give results that indicate vastly different
implementation costs for different classes of access.
Hence it may be prudent, and in accordance with
the principle of Occam’s Razor, to classify access operations
accordingly, and to prefer light-weight over
heavy-weight ones where applicable.

⁴ The reader is indeed invited to try and come up with a
simpler, precisely equivalent description.
2 STRUCTURED RECURSION THEORY
The classical theory of recursive functions comes in
two subtly different variants that must not be confounded.
The theory of µ-recursive functions deals
with partial functions for which computation may di-
verge. The theory of primitive recursive functions
deals with total computable functions, a strict subset
of the former. Since µ-recursion is Turing-complete,
the question whether a µ-recursive function is defined
for an argument is undecidable. Hence they are a
model of computation rather than of abstract behav-
ioral description. Undecidability, and partial func-
tions that do not terminate, are unsuitable for exe-
cutable semantics of TFM, where a two-valued predi-
cate logic that may speak about definedness is used in
function definitions (Parnas, 1993), but descriptions
must be decidable in order to be useful.
Turner has made a similar remark with regard to
the algebraic analysis of functional programs:
The existing model of functional program-
ming, although elegant and powerful, is com-
promised to a greater extent than is commonly
recognised by the presence of partial func-
tions. (Turner, 2004)
As a corollary, an implementation of TFM in a
Turing-complete (partial) functional language such as
ML is not a simple matter of writing down the equa-
tions in a different syntax. The mismatch between
partial and total functions pervades all formal reason-
ing, and one is hardly better off in terms of correct-
ness than, say, with a C++ implementation. When
full-scale verification of closed programs is out of
the question, such as in the proposed agile deriva-
tion of oracles, simulators and prototypes, a disci-
plined approach to code generation is needed. We
shall demonstrate that appropriate constraints can be
calculated from recursion-theoretic investigations; the
choice of the actual back-end language is secondary.
Turner’s conclusion is to focus on the special case
of total recursive functions by way of theoretically
informed, disciplined function definition styles. Un-
fortunately, primitive recursion on the natural num-
bers, the subject of the classical total theory, is a
natural format for a small class of functions only.
Some other functions can be encoded in obfuscated
ways such as Gödel coding, while many cannot, the
most famous example being the Ackermann function.
Hence the strengthening of both recursion schemes
and datatypes has been given much attention. An im-
portant case in point is the Bird–Meertens formalism
(Bird and de Moor, 1997), also known derogatorily as
Squiggol, that focuses on recursion schemes for list-
like datatypes. It is a useful tool in the analysis and
verification of recursive functional algorithms, and in
the calculation of correct-by-construction functional
programs. As such it serves as the role model for our
present investigations of TFM.
A fairly general theory of recursive functions with
a recursion scheme able to accommodate TFM has
been given in terms of universal categorial algebra
and coalgebra (Uustalu and Vene, 1999). Since only a
special case is needed for the present discussion, the
relevant notions are summarized in this section in a
less general, but more accessible form. No knowl-
edge of category theory is presupposed. Novel and
TFM-specific results are given in the following sec-
tions.
Universal algebra, the mathematical foundation of
algebraic specification, is based on three notions:
• a signature introduces operations with specified
  argument numbers and types,
• algebras are set-theoretic models of signatures,
  realizing the operations as functions on some carrier set,
• homomorphisms are functions between the carrier
  sets of two algebras that are compatible with the
  realizations of operations in either algebra.
It is quite natural to represent universal algebra
in category theory, due to the observation that signa-
tures and homomorphisms can be defined simultane-
ously as a functor on the category of sets. Such a
functor is a mapping $T$ that takes any set $A$ to another
set $T(A)$ and each total function $f : A \to B$
to another total function $T(f) : T(A) \to T(B)$, with
the side conditions that identity functions are taken
to identity functions, and compositions to compositions:
$T(\mathrm{id}_A) = \mathrm{id}_{T(A)}$ and $T(g \circ f) = T(g) \circ T(f)$,
where $(g \circ f)(x) = g(f(x))$.
2.1 Algebras and Iteration
For a functor $T$ that expresses a signature, the $T$-algebras
are pairs $(A, \alpha)$ of a carrier set $A$ and a combined
realization of operations $\alpha : T(A) \to A$. A $T$-algebra
homomorphism between two algebras $(A, \alpha)$
and $(B, \beta)$ is a function $h : A \to B$ such that
$h \circ \alpha = \beta \circ T(h)$. The well-known result that the term algebra
of a signature has a unique homomorphism to
any other algebra, understood as bottom-up term evaluation,
is rephrased categorially as an initial object
in the category of $T$-algebras: There is a $T$-algebra
$(\mu T, \mathrm{in}_T)$ such that a unique homomorphism $(|\alpha|)_T$ to
any other $T$-algebra $(A, \alpha)$ can be found.⁵ ⁶ Lambek’s
Lemma states that $\mathrm{in}_T$ is bijective, that is, $\mu T$ is a fixpoint
of the domain equation $T(X) \cong X$, and in fact
the least one.
As our first example, consider the Peano signature
of natural numbers. It is traditionally given as two
operations zero (nullary) and succ (unary). The same
can be expressed by a functor $N$ that takes a set $A$ to
the set $1 + A$, the disjoint union of a singleton set $1$
and $A$. We write $\ast$ for the injection of the element of
$1$ and $a'$ for the injection of an element $a \in A$. The
function part of the functor is given by $N(f)(\ast) = \ast$
and $N(f)(x') = f(x)'$.
The obvious interpretation as natural numbers
with the number zero and the successor function
can be turned into an initial $N$-algebra, by virtue of
Peano’s axioms: $\mu N = \mathbb{N}$, $\mathrm{in}(\ast) = 0$ and $\mathrm{in}(n') = n + 1$.
For any $N$-algebra $(A, \alpha)$, the function $\alpha : 1 + A \to A$
can be decomposed into a constant $z \in A$ such that
$\alpha(\ast) = z$, and a function $s : A \to A$ such that $\alpha(a') = s(a)$.
The unique homomorphism $(|\alpha|) : \mathbb{N} \to A$ satisfies
$$(|\alpha|)(0) = (|\alpha|)(\mathrm{in}(\ast)) = \alpha(N(|\alpha|)(\ast)) = \alpha(\ast) = z$$
and
$$(|\alpha|)(n+1) = (|\alpha|)(\mathrm{in}(n')) = \alpha(N(|\alpha|)(n')) = \alpha((|\alpha|)(n)') = s((|\alpha|)(n)).$$
It can then be shown
that $(|\alpha|)$ computes the iteration of $s$ starting from
$z$: $(|\alpha|)(n) = s^n(z)$. For instance, for the $N$-algebra
$(\mathbb{N}, \alpha)$ with $\alpha(\ast) = 1$ and $\alpha(n') = 2n$, the homomorphism
$(|\alpha|)$ computes the powers of two.
In imperative programming languages, iterations
feature as the semantics of certain side-effect free
loops: z, s and n denote the initial state, loop body
effect (as a state update) and number of iterations, re-
spectively. The intended result is the final state.
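The loop reading of iteration can be sketched in a few lines of Python. The function name `iterate` and the encoding of an N-algebra as a pair of a constant `z` and a step function `s` are illustrative choices, not notation from the paper.

```python
def iterate(z, s, n):
    """Unique homomorphism for the N-algebra given by z and s: returns s^n(z)."""
    state = z                # initial state
    for _ in range(n):       # n loop iterations
        state = s(state)     # loop body as a state update
    return state             # final state


# The N-algebra with constant 1 and step "double" yields the powers of two.
powers_of_two = [iterate(1, lambda a: 2 * a, n) for n in range(6)]
```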
2.2 Coalgebras and Coiteration
As usual in category theory, the dual of a construc-
tion, obtained by reversing arrows, is also studied.
The dual of a $T$-algebra is a $T$-coalgebra, a pair
$(C, \gamma)$ of a carrier set $C$ and a realization of operations
$\gamma : C \to T(C)$. A $T$-coalgebra homomorphism
$h$ between $T$-coalgebras $(C, \gamma)$ and $(D, \delta)$ is a function
$h : C \to D$ such that $T(h) \circ \gamma = \delta \circ h$. The dual
of an initial algebra is a final coalgebra: There is a
$T$-coalgebra $(\nu T, \mathrm{out}_T)$ such that a unique homomorphism
$[(\gamma)]_T$ from any $T$-coalgebra $(C, \gamma)$ can be found.
Dually to Lambek’s Lemma, $\mathrm{out}_T$ is also bijective,
and $\nu T$ is the greatest fixpoint of $T(X) \cong X$.

⁵ The “banana” bracket is due to (Meijer et al., 1991).
⁶ Subscripts indicating the functor will be dropped where
no confusion arises.
The final coalgebra view leads to an alternative
interpretation of the Peano signature: $N$-coalgebras
with realizations of operations of the type $\gamma : C \to 1 + C$
can be understood as partial functions $\gamma : C \rightharpoonup C$,
where a result of $\ast$ or $n'$ denotes undefinedness
or the value $n$, respectively. The set $\bar{\mathbb{N}} = \mathbb{N} + \{\infty\}$
and the predecessor function can be turned into a final
$N$-coalgebra: $\nu N = \bar{\mathbb{N}}$, $\mathrm{out}(0) = \ast$, $\mathrm{out}(n + 1) = n'$
and $\mathrm{out}(\infty) = \infty'$. The unique homomorphism
$[(\gamma)] : C \to \bar{\mathbb{N}}$ satisfies $[(\gamma)](x) = 0$ if $\gamma(x) = \ast$, and
$[(\gamma)](x) = [(\gamma)](\gamma(x)) + 1$ otherwise, with the special
case $[(\gamma)](x) = \infty$ if $\gamma$ can be iterated indefinitely on
$x$. It can then be shown that $[(\gamma)]$ computes the coiteration
of $\gamma$: $[(\gamma)](c) = \min\{n \in \mathbb{N} \mid \gamma^{n+1}(c) = \ast\}$, where
powers of $\gamma$ are obtained by strict composition of partial
functions, and $\min \emptyset = \infty$. For instance, the iterated
logarithm $\log^*$, as defined in algorithmic complexity
theory, is the coiteration of the $N$-coalgebra
$(\mathbb{R}, \log)$, where the function $\log x$ is restricted to arguments
$x > 1$.
In imperative programming languages, coitera-
tions feature as the semantics of certain side-effect
free loops: The partial function γ denotes a combined
state update and loop post-condition, starting from
initial state c. The intended result is the number of
iterations until the condition fails, with the possibility
of non-termination.
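A minimal Python sketch of coiteration as such a counting loop follows. Undefinedness of the partial function is encoded as `None`, and the iterated logarithm is taken to base 2; both are assumptions of this sketch rather than choices made in the text. The loop does not terminate when the partial function can be iterated indefinitely, which corresponds to the result ∞.

```python
import math


def coiterate(gamma, c):
    """Count how often the partial function gamma can be applied, starting from c.

    gamma returns None where it is undefined.
    """
    n = 0
    c = gamma(c)
    while c is not None:     # loop condition: the update was defined
        n += 1
        c = gamma(c)         # state update
    return n


def restricted_log(x):
    """Logarithm (base 2, an assumption) restricted to arguments x > 1."""
    return math.log2(x) if x > 1 else None


def log_star(x):
    return coiterate(restricted_log, x)
```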
2.3 From Numbers to Lists
The (finite or infinite) lists of elements from a set $A$
are traditionally structured using the operations $\mathrm{nil}_A$
and $\mathrm{cons}_A$. We abbreviate $\mathrm{nil}_A$ to $\varepsilon$ and $\mathrm{cons}_A(a, l)$ to
$a \cdot l$, respectively.
The corresponding functor is $L_A(X) = 1 + (A \times X)$
on sets, and $L_A(f)(\ast) = \ast$ and $L_A(f)((a, l)') = (a, f(l))'$
on functions. The finite lists constitute
an initial algebra: $\mu L_A = A^*$ with $\mathrm{in}_{L_A}(\ast) = \varepsilon$ and
$\mathrm{in}_{L_A}((a, l)') = a \cdot l$. Dually, the finite and infinite lists
constitute a final coalgebra: $\nu L_A = A^\infty = A^* \cup A^\omega$ with
$\mathrm{out}_{L_A}(\varepsilon) = \ast$ and $\mathrm{out}_{L_A}(a \cdot l) = (a, l)'$.
The similar functor $K_A(X) = A \times X$ is not very
interesting from the algebraic viewpoint, because its
initial algebra is empty. By contrast, the infinite
lists constitute a final coalgebra: $\nu K_A = A^\omega$ with
$\mathrm{out}_{K_A}(a \cdot l) = (a, l)$.
For numerous examples of iterative and coitera-
tive functions on lists, their analysis and synthesis, see
(Bird and de Moor, 1997).
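As a small illustration of both directions on lists, the following Python sketch implements iteration over finite lists (`fold`) and coiteration into possibly infinite lists (`unfold`). The names, the use of `None` for the end-of-list case, and Python generators as a stand-in for streams are assumptions of this sketch.

```python
from itertools import islice


def fold(nil, cons, xs):
    """Iteration on finite lists: replace the empty list by nil and each cell by cons."""
    result = nil
    for a in reversed(xs):
        result = cons(a, result)
    return result


def unfold(gamma, seed):
    """Coiteration: gamma(c) is None (end of list) or a pair (element, next seed).

    The generated list is infinite if gamma never returns None.
    """
    step = gamma(seed)
    while step is not None:
        a, seed = step
        yield a
        step = gamma(seed)


total = fold(0, lambda a, r: a + r, [1, 2, 3])
countdown = list(unfold(lambda n: (n, n - 1) if n > 0 else None, 3))
naturals = unfold(lambda n: (n, n + 1), 0)   # an infinite list
first_four = list(islice(naturals, 4))
```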
TheRecursionSchemeoftheTraceFunctionMethod
149
2.4 Generalizations
Iteration can be understood as a technique for the inductive
definition of a function with initial algebra domain,
$(|\alpha|)_T : \mu T \to A$, by giving a $T$-algebra $(A, \alpha)$
that defines one recursive step of the function. In the
signature functor $T$, base and induction cases are dealt
with together. The technique can easily be seen to
subsume, for the Peano functor $N$, classical mathematical
induction. The defining form $\alpha(a)$ is allowed to use
only the recursive function results for immediate subarguments,
$a - 1$ in the case of $N$.
Not all functions are conveniently presented in
this form, though: for instance, the factorial function
$\mathrm{fac}(n)$ requires both the recursive result $\mathrm{fac}(n - 1)$
and the original argument $n$, and the Fibonacci function
$\mathrm{fib}(n)$ requires two recursive results, $\mathrm{fib}(n - 1)$
and $\mathrm{fib}(n - 2)$. Consequently, more complicated patterns
of recursion than the one catered for by iteration
are also studied. Recursive functions that work like
fac are handled with the primitive recursion pattern
which, for the Peano functor, is the classical form of
total recursion theory. Recursive functions that work
like fib, or more generally, require recursive function
results for arbitrary simpler arguments, are handled
with the course-of-value iteration pattern, which we
shall focus on for the following investigations.
Note that, in a setting with arbitrary datatypes de-
finable as functors, all total recursion schemes are es-
sentially equivalent in the sense that they can simu-
late each other in extended datatypes. Note also that
higher-order functions, operating on datatypes con-
taining computable functions, are strictly more pow-
erful than the first-order functions, which are dealt
with comprehensively in the classical theory via natural
numbers and Gödel encoding. For instance, the
archetypical example of a computable function that
is not primitive recursive in the classical sense, the
Ackermann function ack, turns out to be not just primitive
recursive, but even iterative, in a higher-order
setting: Consider the higher-order $N$-algebra
$([\mathbb{N} \to \mathbb{N}], \alpha)$ over the space of iterative functions on $\mathbb{N}$,
where $\alpha(\ast)(n) = n + 1$ and $\alpha(f') = (|\beta_f|)$, referring to
the $f$-indexed family of nested first-order $N$-algebras
$(\mathbb{N}, \beta_f)$ where $\beta_f(\ast) = f(1)$ and $\beta_f(n') = f(n)$. To
verify that $\mathrm{ack}(m, n) = (|\alpha|)(m)(n)$ holds is left as an
exercise to the reader.
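The construction can be checked numerically with a short Python sketch. The helper `iterate` stands for the unique homomorphism out of the natural numbers, and the names `alpha_zero` and `alpha_succ` are illustrative; only plain iteration is used, once at first order and once over functions.

```python
def iterate(z, s, n):
    """Iteration on the natural numbers: s applied n times to z."""
    state = z
    for _ in range(n):
        state = s(state)
    return state


def alpha_zero(n):
    """Base case of the higher-order algebra: the successor function."""
    return n + 1


def alpha_succ(f):
    """Step case: the nested first-order iteration with constant f(1) and step f."""
    return lambda n: iterate(f(1), f, n)


def ack(m, n):
    """Ackermann function as a higher-order iteration applied to n."""
    return iterate(alpha_zero, alpha_succ, m)(n)
```

With the usual defining equations, `ack(1, n)` is `n + 2`, `ack(2, n)` is `2 * n + 3`, and `ack(3, 3)` is 61; only small arguments are practical.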
2.5 Course-of-Value Iteration
The categorial form of course-of-value (cov) iteration
(Uustalu and Vene, 1999) manages recursive function
results by considering, besides the functor $T$ whose
initial algebra is the intended function domain, an extended
functor $T^C$, defined as $T^C(X) = C \times T(X)$ and
$T^C(f) = \mathrm{id}_C \times T(f)$, where $C$ is the intended function
range. It can be understood as specifying the same algebraic
signature as $T$, but with an additional annotation
of a value from $C$. The intuition is that $T$ specifies
one level of structure of a function argument, and
$T^C$ pairs that with the function result. Then $\nu T^C$ is
the space of arbitrarily (even infinitely) deeply nested
function arguments with results annotated at all levels
of nesting, and $T(\nu T^C)$ is the same, except that the
result for the top level is missing. The whole point
of cov iteration is to fill that hole. The cov induction
principle can hence be put as follows:

    For any set $C$ and generator $\varphi : T(\nu T^C) \to C$,
    there is a unique recursive function $\{|\varphi|\}_T : \mu T \to C$.⁷

This terse statement is much illuminated by an example.
For the Peano functor $N$, the extension $N^C$
looks very similar to the list functor $L_C$, except that
the structure elements $(C \times {-})$ and $(1 + {-})$ are exchanged.
Consequently, the final $N^C$-coalgebra is almost the
same, except that the empty list is excluded, making
deconstruction total: $\nu N^C = C^{+\infty} = C^+ \cup C^\omega$ and
$\mathrm{out}_{N^C}(c \cdot l) = (c, l)$. The set $N(\nu N^C)$ can then be understood
as re-including the empty list, denoted by
$\ast$: $N(\nu N^C) \cong C^\infty$. We take the liberty to implicitly
conflate the two sets. Now let $C = \mathbb{N}$ and consider
the generator defined as $\varphi(\varepsilon) = 0$, $\varphi(a \cdot \varepsilon) = 1$ and
$\varphi(a \cdot b \cdot l) = a + b$. This defines the Fibonacci function:
$\mathrm{fib} = \{|\varphi|\}_N$.
Compared with ordinary iteration, cov functions
often have multiple base cases, because their dependence
on results for subarguments may exceed the
nesting depth of the argument given. For the Fibonacci
function, each value depends on the two preceding
ones, so base cases for arguments less than two
are required. The presentation can often be simplified,
and all base cases handled with a single mathematical
object, by specifying a default, infinite surrogate
history that is appended to any finite input to
the cov generator. Formally, the generator $\varphi : C^\infty \to C$
can be decomposed into an infinite list $h \in C^\omega$ and a
generator without base cases $\bar\varphi : C^\omega \to C$, such that
$\varphi = \bar\varphi \circ \mathrm{append}(h)$. For some cov iterations, especially
time-invertible ones, there is a natural candidate
for $h$ that eliminates the base cases completely.
For the Fibonacci function, set $\bar\varphi(a \cdot b \cdot l) = a + b$ and
$h = (+1) \cdot (-1) \cdots$; surrogate elements after the second
are never used and hence arbitrary. Where no
such natural surrogate history can be found, add a distinguished
“end” element to $C$ and handle base cases
internally. In either case, we may assume without loss
of generality that cov iterators for the functor $N$ are
functions of infinite lists.

⁷ The exact relation of $\{|\cdot|\}$ to unique (co)algebra homomorphisms
is technically involved and out of scope here.
If a list functor $L_A$ is used in the first place instead
of $N$, we obtain $\nu L_A^C = (C \times A)^{+\infty}$ and
$L_A(\nu L_A^C) = 1 + A \times (C \times A)^{+\infty}$. Hence $L_A(\nu L_A^C)$ corresponds to
finite or infinite $(C \times A)$-lists where the $C$-part of the
first element, if any, is missing.
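To make the scheme concrete for the Peano functor, here is a hedged Python sketch of cov iteration with the Fibonacci generator, once with explicit base cases and once with a surrogate history. The function names are illustrative, histories are tuples or iterators with the most recent result first, and the surrogate elements after the second are arbitrarily chosen as zero.

```python
from itertools import chain, repeat


def cov_iterate(phi, n):
    """Cov iteration on the naturals: phi sees all earlier results, most recent first."""
    history = ()
    for _ in range(n + 1):
        history = (phi(history),) + history
    return history[0]


def phi_fib(history):
    """Generator with explicit base cases."""
    if len(history) == 0:
        return 0
    if len(history) == 1:
        return 1
    return history[0] + history[1]


def with_surrogate(phi_bar, h):
    """Compose a base-case-free generator with appending the surrogate history h."""
    return lambda history: phi_bar(chain(history, h()))


def phi_bar_fib(stream):
    """Generator without base cases: adds the two most recent elements."""
    a = next(stream)
    b = next(stream)
    return a + b


def surrogate():
    """Surrogate history (+1), (-1), then arbitrary elements (zeros here)."""
    return chain([1, -1], repeat(0))


fib_explicit = [cov_iterate(phi_fib, n) for n in range(10)]
fib_surrogate = [cov_iterate(with_surrogate(phi_bar_fib, surrogate), n) for n in range(10)]
```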
3 APPLICATION TO TFM
3.1 Single Trace: First-order Iteration
Traces in TFM are lists of interface events, most recent
events first, each consisting of input and output
part, formalized as elements of sets $I$ and $O$, respectively.
Hence complete traces are elements of $(O \times I)^*$
in the finite case, or $(O \times I)^\infty$ more generally. A trace
function computes the output for the current input,
and may depend on previous outputs, but of course
not cyclically on the current output. Hence the arguments
to trace functions are incomplete traces: either
the trace is empty, or the first (current) element has
no $O$-part. Hence trace function definitions $\varphi$ match
the type of cov generators: $\varphi : L_I(\nu L_I^O) \to O$, giving
rise to behavior functions $\{|\varphi|\}_{L_I} : I^* \to O$ that map a
complete history of inputs to the current output, computing
previous outputs recursively behind the scenes.
Figure 2 shows the cov version of the multiset
component description. The differences with respect
to Figure 1 are subtle but significant: The type of $\mathrm{bag}_i$
is now such that $\{|\mathrm{bag}_i|\}_{L_I} = \mathrm{bag}_r$ is the desired trace
function. By virtue of the cov iteration operator, no
explicit recursion is necessary. The auxiliary function
(I) does not merely reduce the trace to be used as a
recursive argument to bag, but may directly retrieve
the results interspersed in the trace structure.
The first-order cov iterative form of trace func-
tions already settles some of our research questions:
The existence and uniqueness of solutions of recur-
sive equations in TFM format are guaranteed. Fur-
thermore, the functions thus defined are total and
computable, with no regard to efficiency, by straight-
forward evaluation algorithms. The distinction be-
tween factual and counterfactual pairings of input and
output is meaningful: Only the factual ones are rel-
evant to the function to be defined, as evident from
the fact that the type $O$ does not occur in the domain
of $\{|\varphi|\} : I^* \to O$. From the perspective of
model parsimony, it seems best to disregard coun-
terfactual pairings completely, and adopt the skeptic
style, where past outputs are always computed recur-
sively and never retrieved.
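The following Python sketch illustrates how a trace function in generator form can serve both as a simulator and as a test oracle. Since Figure 2 is not reproduced here, the generator `bag_i`, the bound `B = 3`, and the helper names `simulate` and `validate` are assumptions for illustration. Complete traces are tuples of `(output, input)` pairs, most recent first; the generator retrieves the recorded output of the most recent relevant event instead of recomputing it.

```python
B = 3  # assumed upper bound on multiplicities


def bag_i(current, past):
    """Cov generator: current input plus the complete past trace, yielding the output."""
    p, x = current
    if p == "clr":
        return 0
    # Retrieve the recorded output of the most recent event concerning x.
    before = next((o for o, (q, y) in past if q == "clr" or y == x), 0)
    if p == "inc":
        return min(before + 1, B)
    if p == "dec":
        return max(before - 1, 0)
    return before  # p == "cnt"


def simulate(phi, inputs):
    """Reconstruct a complete trace from inputs (oldest first), filling in outputs."""
    trace = ()
    for i in inputs:
        trace = ((phi(i, trace), i),) + trace
    return trace


def validate(phi, trace):
    """Oracle: check every recorded output against the generator."""
    return all(o == phi(i, trace[k + 1:]) for k, (o, i) in enumerate(trace))


inputs = [("inc", "a"), ("inc", "a"), ("inc", "b"), ("dec", "a"),
          ("cnt", "a"), ("clr", None), ("cnt", "a")]
run = simulate(bag_i, inputs)
outputs = [o for o, _ in reversed(run)]   # oldest first
```

A trace produced by `simulate` contains only factual pairings and therefore passes `validate`; a hand-written trace with a counterfactual output does not.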
3.2 Full Behavior: Higher-order
Iteration
The presentation of TFM in the preceding paragraphs
is adequate from a first-order, relational viewpoint.
Nevertheless, it may be worthwhile to consider a dif-
ferent presentation from a higher-order, functional
viewpoint. Though technically much more ambitious,
it provides additional insights, in particular concern-
ing the relationship of TFM with other mathematical
models of history-dependent system behavior; see be-
low.
The first step is to separate the concerns of input
and output, which are dealt with in logically different
ways, and of the end of the trace, which are
all conflated when a trace function is given as a cov
generator $\varphi : L_I(\nu L_I^O) \to O$. To that end, we borrow
a trick from practical stream programming, and embed
finite lists into infinite lists (streams) by assuming
a distinguished element $\bot$ that occurs only and
infinitely often at the end of streams (cf. surrogate histories
above). Assume that $I$ and $O$ contain $\bot$. A trace
function is then given as a generator $\psi : I^\omega \times O^\omega \to O$
that takes infinite backwards streams of inputs and
outputs, respectively, where the inputs do include the
current event but the outputs do not, and yields the
current output. Since such streams will play a major
role in the following, we call all infinite backwards
streams histories, and additionally recent if they include
the current value, and ancient if they do not.
It is easy to see that $\psi$ is a natural rearrangement
with respect to $\varphi$ (recall the end of section 2.5). No
information is lost; traces are merely split into independent
input and output histories, recent and ancient,
respectively, and padded with $\bot$.
The second step is to observe also from the type
of the iteration $\{|\varphi|\}_{L_I} : I^* \to O$ that two pieces of
information are required simultaneously, namely the
number of steps (the length of the input trace) and the
actual input of events (the content of the input trace).
By systematic rearrangement, a cov iteration equivalent
to $\{|\varphi|\}_{L_I}$ but for the simpler functor $N$, thus
only being recursive in the number of steps, can be
constructed by “plugging” $\psi$ into a generic construction
$F$. The price for the simpler recursion signature
is a more complicated carrier set, namely the space
$[I^\omega \to O]$ of functions taking recent input histories to
current outputs, called responses. Note that, since a
trace function generally depends not only on input
history but output history as well, it will be represented
by a different response for every point in time
marked by an event.
Higher-order cov iteration will be used to construct
a recursive list of responses. Like in the Ackermann
example, we shall make use of the natural
one-to-one correspondence between functions of
types $X \times Y \to Z$ and $X \to [Y \to Z]$, respectively. As
customary, we write $\mathrm{curry}(f)$ for the invertible translation
of a function $f$ from the former to the latter
type. The generic construction $F$ is specified in terms
of two auxiliary operations.
The first auxiliary operation has the type $F_1 : Q \to O \times Q$
where $Q = [I^\omega \to O]^\omega \times I^\omega$. It maps pairs of
recent response and input histories to triples of the
corresponding current output, and ancient response
and input histories; formally:
$F_1(r \cdot R, i \cdot I) = (r(i \cdot I), (R, I))$.
Note that $(Q, F_1)$ has the form of a $K_O$-coalgebra.
The unique homomorphism $[(F_1)]_{K_O} : Q \to O^\omega$
maps pairs of (recent/ancient) response and input
histories to the corresponding output history. The
higher-order function
$\mathrm{curry}([(F_1)]_{K_O}) : [I^\omega \to O]^\omega \to [I^\omega \to O^\omega]$
then maps response histories to the corresponding
black-box behavior of the system, namely
functions from input history to output history.
The second auxiliary operation has the type
$F_2(\psi) : [I^\omega \to O^\omega] \times I^\omega \to O$. It maps pairs of previous
black-box behavior and recent input history to
current output, by passing recent input history and ancient
output history, obtained by applying the behavior
to ancient input history, to the generator $\psi$; formally:
$F_2(\psi)(b, i \cdot I) = \psi(i \cdot I, b(I))$. The higher-order
function $\mathrm{curry}(F_2(\psi)) : [I^\omega \to O^\omega] \to [I^\omega \to O]$ then
maps the previous black-box behavior to the current
response.
In synopsis, the composition $F(\psi) = \mathrm{curry}\bigl(F_2(\psi)\bigr) \circ \mathrm{curry}\bigl([(F_1)]_{K_O}\bigr)$ maps ancient response histories to the current response. It can be proven equivalent to the first-order variant: $\{|\phi|\}_{L_O} \circ \mathrm{take}(n) = \{|F(\psi) \circ \mathrm{pad}|\}_N(n)$, where $\mathrm{take}(n)$ discards all but the first $n$ elements of a history. Figure 3 shows the higher-order cov version of the multiset example.
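The construction can be made tangible with a small Python sketch. It is only an illustration under simplifying assumptions: histories are finite tuples with the most recent element first (rather than padded infinite sequences), evaluation is deliberately naive, and the names `behaviour`, `F`, `responses_upto` and the example generator `psi_running_sum` are hypothetical and do not come from the TFM literature.

```python
def behaviour(responses, inputs):
    """Black-box behavior: pair each response with the input history it
    sees and collect the outputs, most recent first (unfolding of F1)."""
    outputs = []
    while responses and inputs:
        r = responses[0]
        outputs.append(r(inputs))                      # current output r(i . I)
        responses, inputs = responses[1:], inputs[1:]  # continue with (R, I)
    return tuple(outputs)


def F(psi):
    """Map ancient responses to the current response (role of F(psi))."""
    def step(responses):
        def response(inputs):
            # ancient output history = behavior applied to ancient inputs
            ancient_outputs = behaviour(responses, inputs[1:])
            return psi(inputs, ancient_outputs)
        return response
    return step


def responses_upto(psi, n):
    """Iterate F(psi) n times, building the list of responses (newest first)."""
    responses = ()
    for _ in range(n):
        responses = (F(psi)(responses),) + responses
    return responses


def psi_running_sum(inputs, outputs):
    """Example generator: current input plus the previous output (0 if none)."""
    previous = outputs[0] if outputs else 0
    return inputs[0] + previous


rs = responses_upto(psi_running_sum, 3)
print(rs[0]((3, 2, 1)))           # response at the third event: 6
print(behaviour(rs, (3, 2, 1)))   # full output history: (6, 3, 1)
```

Each response is an ordinary function of the input history alone; the dependence on earlier outputs is hidden in the responses it closes over, which is exactly why a different response is needed at every event.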
The higher-order cov iteration form reveals a strik-
ing similarity (that has not been noticed before) be-
tween TFM and a class of stochastic models of sys-
tem behavior that is immensely popular in time se-
ries modeling: The auto-regressive moving average
(ARMA) models (Box and Jenkins, 1970) have a
common vector space (most often simply real num-
bers) as both input and output, and calculate current
output as a linear combination of recent input and an-
cient output history. Thus they subsume the Fibonacci
example (where the linear combination is simply ad-
dition of the two preceding outputs) and are in turn
subsumed by TFM (where the linearity assumption is
lifted).
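As a rough illustration of this relationship, the following Python sketch computes the output of an ARMA-style linear filter as a course-of-value recursion. The function name `arma_filter` and its parameter layout (`ar` for coefficients on previous outputs, `ma` for coefficients on the current and previous inputs) are assumptions made for this example, not a reference to any particular library.

```python
def arma_filter(ar, ma, inputs):
    """Current output = linear combination of recent inputs (weights ma,
    starting with the current input) and ancient outputs (weights ar,
    starting with the previous output)."""
    outputs = []
    for t in range(len(inputs)):
        recent_in = sum(b * inputs[t - j]
                        for j, b in enumerate(ma) if t - j >= 0)
        ancient_out = sum(a * outputs[t - 1 - j]
                          for j, a in enumerate(ar) if t - 1 - j >= 0)
        outputs.append(recent_in + ancient_out)
    return outputs


# Fibonacci as a special case: add the two preceding outputs and feed in
# a single unit impulse at the second step.
print(arma_filter([1, 1], [1], [0, 1, 0, 0, 0, 0]))  # [0, 1, 1, 2, 3, 5]
```

Replacing the two linear combinations by an arbitrary function of both histories yields the general TFM situation.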
ARMA models can be used directly as filters
that add auto-correlation to a signal in a controlled
way. Or they can be used inversely, estimating the
linear coefficients from given output, assuming that
the input is purely random and of minimal variance.
ARMA models may handle input and output histories differently: The most common case is that of models with nonzero coefficients for the p most recent outputs and q most recent inputs, respectively, for finite ranks p and q. But models with genuinely infinite dependence on past outputs, effected by fractional differencing (Granger and Joyeux, 1980), are popular in economic and ecological applications (Montanari et al., 1997), because they exhibit the fractal properties often apparent in data from these areas.
4 TOWARDS IMPLEMENTATIONS
Structured total recursion schemes establish the well-definedness, uniqueness, totality and general computability of a function. Naive function evaluation, however, may not be appropriate to actually and efficiently compute values. Although the recursive form of the Fibonacci function is featured in virtually every textbook on recursive programming, it is widely acknowledged that the recursive algorithm is not useful at all in practice. A well-known equivalent algorithm, which reduces complexity from exponential to linear, can be given in terms of ordinary iteration or a loop, and two state variables instead of one. Mathematically it is specified concisely by iterated matrix multiplication:

\[ \mathit{ffib}(n) = \begin{pmatrix} 1 \\ 0 \end{pmatrix}^{T} \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}^{n} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \]

The implementation in an actual programming language is left as an exercise to the reader.
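One possible solution to this exercise, given here only as a sketch in Python with the hypothetical names `fib_naive` and `fib_iter`, contrasts the textbook recursion with the two-variable loop. The loop body corresponds to one multiplication by the matrix with rows (0, 1) and (1, 1), and returning the first state variable corresponds to the final projection.

```python
def fib_naive(n):
    """Textbook recursion: exponential number of calls."""
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)


def fib_iter(n):
    """Ordinary iteration with two state variables: linear number of steps."""
    a, b = 0, 1              # initial state vector (0, 1)
    for _ in range(n):
        a, b = b, a + b      # one application of the matrix
    return a                 # projection onto the first component


print([fib_iter(n) for n in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```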
Some general questions arise: Is naive evaluation generally inefficient for cov iterations? Can a more efficient, ordinary iteration be proven equivalent? If so, can it be found automatically? The short answers are, respectively: Yes, except for degenerate cases. Yes, but it may have unbounded space requirements. Yes, for certain disciplined patterns of recursive dependency.
The theoretical results about implementations of cov iteration discussed in the following section are novel and are being published in a companion article (Trancón y Widemann, 2012).
4.1 Simulation by State Systems
For the discussion of simulation of cov iteration by state systems and ordinary iteration, we consider only the functor $N$, for simplicity. The results then apply to the higher-order cov representation of TFM. The function space lurking in the carrier set is not a technical problem: Object-level functions can either be represented intensionally by code (there is no need, for instance, to decide equality), or eliminated by defunctionalization (Reynolds, 1972), a standard technique to transform functional program expressions to first order. Similar results could be achieved in principle for the functor $L_O$ to handle the first-order cov representation of TFM directly.
Fix a function range $C$. A $C$-state system is a triple $(S, \sigma, \tau)$, where $S$ is a state space, $\sigma : N(\nu N^C) \to S$ is called the abstraction function and $\tau : C \times S \to S$ is called the transition function. It is called an epi-state system⁸ if and only if $\sigma$ is surjective. A state system is said to factor a cov generator $\phi : N(\nu N^C) \to C$ if and only if two conditions hold. Firstly, there is a function $\tilde\phi : S \to C$ such that $\phi = \tilde\phi \circ \sigma$. For epi-state systems, $\tilde\phi$ is determined uniquely. A state system can correctly simulate the cov iteration of a generator only if this condition holds; otherwise the state is too coarse to make all the relevant distinctions. Secondly, it is required that $\tau\bigl(\phi(x), \sigma(x)\bigr) = \sigma\bigl(\phi(x) \cdot x\bigr)$ for every history $x$ (note the similarity to homomorphism properties).
Whereas the first factoring condition is clearly necessary for simulation, it can be shown that adding the second one is sufficient. If the state system $(S, \sigma, \tau)$ factors the generator $\phi$, then an ordinary iteration can be constructed from $\sigma$, $\tau$ and $\tilde\phi$ that simulates $\{|\phi|\}$: There is an $N$-algebra $(C \times S, \rho)$ such that for every $x \in N^C(\nu N^C)$ there is some final state $s_x$ such that $(|\rho|)(x) = \bigl(\{|\phi|\}(x), s_x\bigr)$. The operation $\rho$ is most concisely given in two parts, $\rho = \rho_2 \circ \rho_1$, with $\rho_1(\ast) = \sigma(\varepsilon)$ and $\rho_1\bigl((c, s)'\bigr) = s$, and $\rho_2(s) = \bigl(\tilde\phi(s), \tau(\tilde\phi(s), s)\bigr)$. The proof is too technical to be repeated here.
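While the proof is omitted, the shape of the resulting ordinary iteration can be sketched in Python. The sketch assumes that histories are tuples with the most recent result first and that the empty history is the empty tuple; the names `cov_naive`, `run_state_system`, `phi_fib` and `trivial` are hypothetical. The loop starts from the abstraction of the empty history and repeatedly computes an output from the state and feeds it into the transition function, which is one reading of the two-part operation described above.

```python
def cov_naive(phi, n):
    """Naive cov iteration: recompute the complete history for each call."""
    history = tuple(cov_naive(phi, m) for m in range(n - 1, -1, -1))
    return phi(history)


def run_state_system(sigma, tau, phi_tilde, n):
    """Ordinary iteration over states; returns (n-th result, next state)."""
    s = sigma(())                  # start: abstraction of the empty history
    for _ in range(n):
        s = tau(phi_tilde(s), s)   # output from the state, then transition
    c = phi_tilde(s)
    return c, tau(c, s)


def phi_fib(history):
    """Fibonacci as a cov generator on histories (most recent first)."""
    if len(history) < 2:
        return len(history)        # 0 for the empty history, 1 after one step
    return history[0] + history[1]


# The trivial state system: the state is the history itself.
trivial = dict(sigma=lambda h: h,
               tau=lambda c, s: (c,) + s,
               phi_tilde=phi_fib)

print(run_state_system(n=5, **trivial)[0])   # 5
print(cov_naive(phi_fib, 5))                 # 5
```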
4.2 Regular Course-of-Value Iteration
A cov generator $\phi$ is called $k$-regular, for some natural number $k \geq 0$, if and only if it is determined completely by exactly $k$ previous results; formally, there is a surrogate history $h \in C^k$ and a function $\hat\phi : C^k \to C$ such that $\phi = \hat\phi \circ \mathrm{take}(k) \circ \mathrm{append}(h)$.⁹ Then $\{|\phi|\}$ can be simulated in constant space, namely with a first-in-first-out buffer of $k$ elements of $C$ as state: the state system $(C^k, \sigma, \tau)$ with $\sigma = \mathrm{take}(k) \circ \mathrm{append}(h)$ and $\tau\bigl(c_0, (c_1, \dots, c_k)\bigr) = (c_0, \dots, c_{k-1})$ satisfies the conditions given above, with $\tilde\phi = \hat\phi$. The proof involves straightforward manipulations of the iterative functions take and append and is left as an exercise to the reader. A real implementation would probably use a ring buffer, where the elements are not shifted, but remain stationary and are addressed modulo $k$. This is easily shown to be equivalent to the former representation.

⁸ From the categorial notion of epimorphisms, which coincide with surjective functions in the category of sets.
⁹ Note that $\hat\phi$ is a function of finite tuples of uniform length, and hence the opposite of the “infinitarized” $\phi$ discussed above.
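Both buffer representations of a k-regular cov iteration can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the names `regular_cov_fifo` and `regular_cov_ring` are hypothetical, tuples hold the most recent result first, and the surrogate history (1, -1) used in the usage lines is the one that makes the sum of the two preceding results start the Fibonacci sequence at 0.

```python
from collections import deque


def regular_cov_fifo(phi_hat, h, n):
    """FIFO buffer of k = len(h) results; appendleft shifts the elements
    and a full deque silently drops the oldest one at the other end."""
    state = deque(h, maxlen=len(h))
    for _ in range(n):
        state.appendleft(phi_hat(tuple(state)))
    return phi_hat(tuple(state))


def regular_cov_ring(phi_hat, h, n):
    """Ring buffer: elements stay in place and are addressed modulo k."""
    k = len(h)
    buf = list(h)
    newest = 0                      # index of the most recent result

    def view():
        # read the buffer in logical order, most recent first
        return tuple(buf[(newest + j) % k] for j in range(k))

    for _ in range(n):
        c = phi_hat(view())
        if k:                       # nothing to store when k == 0
            newest = (newest - 1) % k
            buf[newest] = c         # overwrite the oldest slot
    return phi_hat(view())


print([regular_cov_fifo(sum, (1, -1), n) for n in range(8)])
print([regular_cov_ring(sum, (1, -1), n) for n in range(8)])
```

Both calls print the same list of results; the two representations differ only in whether the data or the index moves.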
The Fibonacci function is regular: set $k = 2$ and $h = (+1, -1)$ to retrieve the efficient iterative algorithm, albeit in the slightly modified form

\[ \mathit{ffib}(n) = \begin{pmatrix} 1 \\ 1 \end{pmatrix}^{T} \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}^{n} \begin{pmatrix} -1 \\ +1 \end{pmatrix} \]

TFM descriptions that refer statically to the $k$ most recent events are $k$-regular as well. TFM descriptions that refer to the most recent event fulfilling some predicate require more complicated state spaces. In the multiset example, an $X$-indexed family of the most recent multiplicity per element would do. Of course, a good programmer would readily come up with an implementation as an array or hash table. But it can also be inferred automatically from the recursive access pattern. Similar state constructions can be given for many of the access patterns of TFM, and organized in a “compiler” that realizes trace functions as ordinary iterations.
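For the multiset example, the relationship between the recursive trace function and such a state-based realization can be sketched in Python. The sketch follows the recursive definition of Figure 1 as far as it can be read; the function names `restrict`, `bag_r` and `bag_state`, the concrete bound `B = 3`, and the encoding of events as `(operation, element)` pairs are assumptions made for illustration. Traces passed to `bag_r` hold the most recent event first, whereas `bag_state` consumes events in chronological order.

```python
B = 3  # hypothetical capacity bound per element


def restrict(trace, x):
    """Drop events until the most recent one that concerns x or is a clr."""
    while trace and not (trace[0][1] == x or trace[0][0] == "clr"):
        trace = trace[1:]
    return trace


def bag_r(trace):
    """Recursive trace function: output at the most recent event."""
    if not trace:
        return 0
    (p, x), rest = trace[0], trace[1:]
    if p == "clr":
        return 0
    prev = bag_r(restrict(rest, x))
    if p == "cnt":
        return prev
    if p == "inc":
        return min(prev + 1, B)
    if p == "dec":
        return max(prev - 1, 0)
    raise ValueError(p)


def bag_state(events):
    """State-based realization: one stored multiplicity per element."""
    latest = {}
    outputs = []
    for p, x in events:
        if p == "clr":
            latest.clear()
            out = 0
        else:
            prev = latest.get(x, 0)
            out = {"cnt": prev,
                   "inc": min(prev + 1, B),
                   "dec": max(prev - 1, 0)}[p]
            latest[x] = out
        outputs.append(out)
    return outputs


events = [("inc", "a"), ("inc", "a"), ("inc", "b"), ("cnt", "a"),
          ("dec", "b"), ("inc", "a"), ("inc", "a"), ("clr", "a"),
          ("cnt", "b"), ("inc", "b")]
print(bag_state(events))   # [1, 2, 1, 2, 0, 3, 3, 0, 0, 1]
```

The dictionary plays the role of the element-indexed family of most recent multiplicities; clearing it corresponds to the fact that a `clr` event is relevant for every element.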
4.3 Category of Implementations
The algebras for a functor $T$ and their homomorphisms constitute a category. The initial $T$-algebra is just the initial object in that category, and serves as a generic model of syntax with respect to the signature $T$. Dually, the $T$-coalgebras and their homomorphisms constitute another category. The final $T$-coalgebra is the final object in that category, and serves as a generic model of semantics with respect to the signature $T$. This duality has been used, for instance, to give structured semantics to abstract data types (Erwig, 1998).
The state systems that factor a fixed cov generator $\phi$ can also be organized as a category, with both initial and final objects, and analogous interpretations as syntax and semantics, respectively. Abbreviate the function that the second factoring condition is about as $\delta(x) = \tau\bigl(\phi(x), \sigma(x)\bigr)$. A homomorphism between two state systems $(S_1, \sigma_1, \tau_1)$ and $(S_2, \sigma_2, \tau_2)$, both factoring a common $\phi$, is a function $h : S_1 \to S_2$ such that $h \circ \sigma_1 = \sigma_2$ and $h \circ \delta_1 = \delta_2$.¹⁰

The initial object in the category of $\phi$-factoring state systems is the trivial state system $\bigl(N(\nu N^C) = C^{*}, \mathrm{id}, (\cdot)\bigr)$. It qualifies as purely syntactical: Histories are taken at face value, no abstraction occurs, and the state transition merely accumulates results. It is also the most straightforward implementation, and already avoids the exponential blowup incurred by naive evaluation, since common recursive subcomputations are properly shared. But it clearly requires ever-growing amounts of space, since potentially irrelevant historical data is never discarded, and may hence not qualify as a viable implementation for all purposes.

¹⁰ In category jargon, this yields a double coslice category.
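As a worked check of the homomorphism notion, the following Python sketch tests, on a few sample histories, that the abstraction function of a two-element buffer system is a homomorphism from the trivial system into it. It reads the two conditions as "h after sigma1 equals sigma2" and "h after delta1 equals delta2", where delta(x) is the transition applied to the generator's result and the abstraction of x. All names (`H0`, `phi_hat`, `sigma1`, `tau1`, `sigma2`, `tau2`, `delta`, `h`) and the Fibonacci-style generator are assumptions made for this example; a finite test is of course not a proof.

```python
H0 = (1, -1)                       # surrogate history, most recent first


def phi_hat(window):
    return window[0] + window[1]   # sum of the two preceding results


def sigma2(history):
    """Buffer system: keep the two most recent results, padded with H0."""
    return (tuple(history) + H0)[:2]


def tau2(c, s):
    return (c, s[0])               # shift in the new result, drop the oldest


def phi(history):
    return phi_hat(sigma2(history))   # the common generator being factored


def sigma1(history):
    return tuple(history)          # trivial system: no abstraction


def tau1(c, s):
    return (c,) + s                # trivial system: accumulate results


def delta(sigma, tau, history):
    return tau(phi(history), sigma(history))


h = sigma2                         # candidate homomorphism S1 -> S2

samples = [(), (0,), (1, 0), (5, 7, 9)]
ok = all(h(sigma1(x)) == sigma2(x) and
         h(delta(sigma1, tau1, x)) == delta(sigma2, tau2, x)
         for x in samples)
print(ok)                          # True
```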
The final object in the category of $\phi$-factoring state systems is the coimage system $(\mathrm{Coim}(\phi), \pi, \tau)$, where $\mathrm{Coim}(\phi)$ is the partitioning of $N(\nu N^C)$ into the equivalence classes that are identified by $\phi$, $\pi$ is the mapping of elements to their equivalence class, and $\tau$ is a complicated construction that can be shown to exist by the Axiom of Choice. This system qualifies as purely semantical: its states are canonical representatives for the relevant historical information, but no hint as to their practical encoding is given. Hence the final system is impractical as an implementation, as the equivalence classes can be hard and inefficient to construct, but it completely eliminates all redundancy in historical information, and it is therefore the ideal, minimal model of behavior that real implementations should aspire to.
5 CONCLUSIONS
We have demonstrated that descriptions in the style of the trace function method, which take the form of trace functions of a certain recursive form, are amenable to structured total recursion theory. That is, there is an alternative non-recursive form (a generator) that induces the desired recursive trace function as the unique solution of a certain homomorphism equation. This result should not be misread as meaning that TFM should be notated with generators. But the mere check that a trace function could be generated in this way entails a number of beneficial semantic properties: First of all, generators are necessarily consistent and complete, in the sense that one and only one solution exists. The solution is also totally defined and computable, making a TFM specification executable, in the sense that straightforward effective implementations exist, in the form of primitive recursive total functional programs, or equivalently a certain kind of loops, for the more imperative-minded.
The particular recursion scheme we have identified as adequate for TFM, namely course-of-value iteration, elucidates the role of historical dependencies in trace functions at a level of abstraction and with a precision that syntactic and ad-hoc algebraic arguments cannot attain. On the one hand, the apparent dilemma discussed in section 1.1 can be resolved: References to recursive function results and to stored outputs in traces are redundant, but not in a logical conflict. The cov iteration approach separates the two cleanly; stored values are retrieved in the generator view on a trace function, and map consistently to recursive results in the induced trace function. Counterfactual events are guaranteed to be functionally irrelevant. On the other hand, TFM can be related precisely to established history-aware modeling paradigms, for instance the ARMA approach, considered state of the art for data-driven modeling in diverse empirical fields such as economics and hydrology.
Calculating implementations for TFM descriptions that are practically and immediately usable as simulators, test oracles or prototypes has so far posed difficulties. We have demonstrated that most of the complexity is due to the recursion scheme. Dealing with that complexity in the appropriate theoretical framework is certainly going to help. Finding adequate iterative implementations can be understood theoretically as movement along the initial–final axis of factoring state systems, driven by a tradeoff between cheap state encoding (initial extreme) and strong compression of information (final extreme). From a more tool-oriented viewpoint, the classification of recursive access patterns in TFM, with respect to the costs of the associated state system features, could lead to a layered definition of TFM basic vocabulary, enabling a conscious tradeoff between power of expression and performance of automatically derived correct implementations.
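\[
\begin{aligned}
\mathrm{bag}_r &: I^{*} \to O & I &= P \times X \\
(\restriction) &: I^{*} \times X \to I^{*} & O &= \{0, \dots, B\}
\end{aligned}
\]
\[
\begin{aligned}
\mathrm{bag}_r(\varepsilon) &= 0 \\
\mathrm{bag}_r\bigl((\mathrm{cnt}, x) \cdot T\bigr) &= \mathrm{bag}_r(T \restriction x) \\
\mathrm{bag}_r\bigl((\mathrm{inc}, x) \cdot T\bigr) &= \min\bigl(\mathrm{bag}_r(T \restriction x) + 1,\; B\bigr) \\
\mathrm{bag}_r\bigl((\mathrm{dec}, x) \cdot T\bigr) &= \max\bigl(\mathrm{bag}_r(T \restriction x) - 1,\; 0\bigr) \\
\mathrm{bag}_r\bigl((\mathrm{clr}, x) \cdot T\bigr) &= 0 \\[1ex]
\varepsilon \restriction x &= \varepsilon \\
\bigl((p, y) \cdot T\bigr) \restriction x &=
\begin{cases}
(p, y) \cdot T & \text{if } x = y \text{ or } p = \mathrm{clr} \\
T \restriction x & \text{if } x \neq y \text{ and } p \neq \mathrm{clr}
\end{cases}
\end{aligned}
\]
Figure 1: TFM-style multiset, recursively.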
ENASE2012-7thInternationalConferenceonEvaluationofNovelSoftwareApproachestoSoftwareEngineering
154
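\[
\begin{aligned}
\mathrm{bag}_i &: L_I(\nu L_I^{O}) \to O & I &= P \times X \\
(\blacktriangleright) &: I^{*} \times X \to O & O &= \{0, \dots, B\}
\end{aligned}
\]
\[
\begin{aligned}
\mathrm{bag}_i(\varepsilon) &= 0 \\
\mathrm{bag}_i\bigl((\mathrm{cnt}, x) \cdot T\bigr) &= T \blacktriangleright x \\
\mathrm{bag}_i\bigl((\mathrm{inc}, x) \cdot T\bigr) &= \min\bigl((T \blacktriangleright x) + 1,\; B\bigr) \\
\mathrm{bag}_i\bigl((\mathrm{dec}, x) \cdot T\bigr) &= \max\bigl((T \blacktriangleright x) - 1,\; 0\bigr) \\
\mathrm{bag}_i\bigl((\mathrm{clr}, x) \cdot T\bigr) &= 0 \\[1ex]
\varepsilon \blacktriangleright x &= 0 \\
\bigl(\bigl(n, (p, y)\bigr) \cdot T\bigr) \blacktriangleright x &=
\begin{cases}
n & \text{if } x = y \text{ or } p = \mathrm{clr} \\
T \blacktriangleright x & \text{if } x \neq y \text{ and } p \neq \mathrm{clr}
\end{cases}
\end{aligned}
\]
Figure 2: TFM-style multiset, first-order cov generator.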
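\[
\begin{aligned}
\mathrm{bag}_h &: I^{\omega} \times O^{\omega} \to O & I &= P \times X \\
(\restriction) &: I^{*} \times X \to O & O &= \{0, \dots, B\}
\end{aligned}
\]
\[
\begin{aligned}
\mathrm{bag}_h(\diamond \cdot I, O) &= 0 \\
\mathrm{bag}_h\bigl((\mathrm{cnt}, x) \cdot I, O\bigr) &= (I, O) \restriction x \\
\mathrm{bag}_h\bigl((\mathrm{inc}, x) \cdot I, O\bigr) &= \min\bigl(((I, O) \restriction x) + 1,\; B\bigr) \\
\mathrm{bag}_h\bigl((\mathrm{dec}, x) \cdot I, O\bigr) &= \max\bigl(((I, O) \restriction x) - 1,\; 0\bigr) \\
\mathrm{bag}_h\bigl((\mathrm{clr}, x) \cdot I, O\bigr) &= 0 \\[1ex]
(\diamond \cdot I, O) \restriction x &= 0 \qquad (I, \diamond \cdot O) \restriction x = 0 \\
\bigl((p, y) \cdot I,\; n \cdot O\bigr) \restriction x &=
\begin{cases}
n & \text{if } x = y \text{ or } p = \mathrm{clr} \\
(I, O) \restriction x & \text{if } x \neq y \text{ and } p \neq \mathrm{clr}
\end{cases}
\end{aligned}
\]
Figure 3: TFM-style multiset, higher-order cov generator.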
ACKNOWLEDGEMENTS
Parts of this work have been performed at the Soft-
ware Quality Research Laboratory, University of
Limerick, Ireland, supported by Science Foundation
Ireland under Grants 01/P1.2/C009 and 03/CE3/1405.
REFERENCES
Baber, R. L., Parnas, D. L., Vilkomir, S. A., Harrison, P., and O’Connor, T. (2005). Disciplined methods of software specification: A case study. In ITCC (2), pages 428–437. IEEE Computer Society.
Bird, R. and de Moor, O. (1997). Algebra of Programming, volume 100 of International Series in Computing Science. Prentice Hall.
Bonfante, G. (2011). Course of value distinguishes the intentionality of programming languages. In Proceedings 2nd Symposium on Information and Communication Technology (SoICT ’11), pages 189–198. ACM.
Box, G. E. and Jenkins, G. M. (1970). Time series analysis: Forecasting and control. Holden–Day, San Francisco.
Erwig, M. (1998). Categorical programming with abstract data types. In Haeberer, A. M., editor, AMAST, number 1548 in Lecture Notes in Computer Science, pages 406–421. Springer.
Granger, C. W. J. and Joyeux, R. (1980). An introduction to long-memory time series models and fractional differencing. Journal of Time Series Analysis, 1:15–30.
Liu, Z., Parnas, D. L., and Trancón y Widemann, B. (2010). Documenting and verifying systems assembled from components. Front. Comput. Sci. China, 4(2):151–161.
Meijer, E., Fokkinga, M. M., and Paterson, R. (1991). Functional programming with bananas, lenses, envelopes and barbed wire. In Hughes, J., editor, Proceedings 5th International Conference on Functional Programming Languages and Computer Architecture (FPCA 1991), number 523 in Lecture Notes in Computer Science, pages 124–144. Springer.
Montanari, A., Rosso, R., and Taqqu, M. S. (1997). Fractionally differenced ARIMA models applied to hydrologic time series: Identification, estimation, and simulation. Water Resources Research, 33(5):1035–1044.
Parnas, D. L. (1993). Predicate logic for software engineering. IEEE Trans. Softw. Eng., 19:856–862.
Parnas, D. L. (2009). Document based rational software development. Knowledge-Based Systems, 22(3):132–141.
Quinn, C., Vilkomir, S. A., Parnas, D. L., and Kostic, S. (2006). Specification of software component requirements using the trace function method. In Proceedings International Conference on Software Engineering Advances (ICSEA 2006), page 50. IEEE Computer Society.
Reynolds, J. (1972). Definitional interpreters for higher-order programming languages. In Proceedings of the ACM Annual Conference, pages 717–740, Boston, Massachusetts.
Trancón y Widemann, B. (2012). State-based simulation of linear course-of-value iteration. In Proceedings 11th International Workshop on Coalgebraic Methods in Computer Science (CMCS 2012). Short contribution.
Trancón y Widemann, B. and Parnas, D. L. (2008). Tabular expressions and total functional programming. In Chitil, O., Horváth, Z., and Zsók, V., editors, Implementation and Application of Functional Languages (IFL 2007), Revised Selected Papers, number 5083 in Lecture Notes in Computer Science, pages 219–236. Springer.
Turner, D. A. (2004). Total functional programming. Journal of Universal Computer Science, 10(7):751–768.
Uustalu, T. and Vene, V. (1999). Primitive (co)recursion and course-of-value (co)iteration, categorically. Informatica, 10(1):5–26.
TheRecursionSchemeoftheTraceFunctionMethod
155