MACHINE SYMBOL GROUNDING AND OPTIMIZATION
Oliver Kramer
International Computer Science Institute, Berkeley, U.S.A.
Keywords:
Autonomous agents, Symbol grounding, Zero semantical commitment condition, Machine learning, Interface
design, Optimization.
Abstract:
Autonomous systems gather high-dimensional sensorimotor data with their multimodal sensors. Symbol
grounding is about whether these systems can, based on this data, construct symbols that serve as a vehi-
cle for higher symbol-oriented cognitive processes. Machine learning and data mining techniques are geared
towards finding structures and input-output relations in this data by providing appropriate interface algorithms
that translate raw data into symbols. Can autonomous systems learn how to ground symbols in an unsuper-
vised way, only with a feedback on the level of higher objectives? A target-oriented optimization procedure
is suggested as a solution to the symbol grounding problem. It is demonstrated that the machine learning
perspective introduced in this paper is consistent with the philosophical perspective of constructivism. Inter-
face optimization offers a generic way to ground symbols in machine learning. The optimization perspective
is argued to be consistent with von Glasersfeld’s view of the world as a black box. A case study illustrates
technical details of the machine symbol grounding approach.
1 INTRODUCTION
The literature on artificial intelligence (AI) defines
“perception” in cognitive systems as the transduction
of subsymbolic data to symbols (e.g. (Russell and
Norvig, 2003)). Auditory, visual or tactile data from
various kinds of sense organs is subject to neural pat-
tern recognition processes, which reduce it to neuro-
physiological signals that our mind interprets as sym-
bols or schemes. The human visual system has often been referred to as an example of such a complex transformation. Symbols are thought to be representations
of entities in the world, having a syntax of their own.
Even more importantly, symbols are supposed to be
grounded by their internal semantics. They allow cog-
nitive manipulations such as inference processes and
logical operations, which made AI researchers come to believe that thinking can be regarded as the manipulation of symbols and could therefore be considered to be computation (Harnad, 1994). Cogni-
tion becomes implementation-independent, systemat-
ically interpretable symbol-manipulation.
However, how do we define symbols and their
meaning in artificial systems, e.g., for autonomous
robots? Which subsymbolic elements belong to the
set that defines a symbol, and with regard to cog-
nitive manipulations what is the interpretation of
a particular symbol? These questions are the focus
of the “symbol grounding problem” (SGP) (Harnad, 1990) and the “Chinese room argument” (Searle, 1980), both of which concentrate on the problem of
how the meaning and the interpretation of a symbol
is grounded in action. Several strategies have been
proposed to meet these challenges. For a thorough re-
view cf. (Taddeo and Floridi, 2005).
From my perspective, the definition of a sym-
bol depends on intrinsic structure in the perceived
data and on its interpretation, which is entirely of a
functional nature. “Functional” here means target-
oriented: the intention to achieve goals and the suc-
cess in solving problems must guide the formation of
meaning and thus the definition of symbols. Hence, it
seems reasonable to formulate the definition of sym-
bols as a target-oriented optimization problem, be-
cause optimal symbols and their interpretations will
yield optimal success. In many artificial systems,
symbols are defined by an interface algorithm that
maps sensory or sensorimotor data onto symbol to-
kens, detecting regularities and intrinsic structures in
perceived data. Optimizing a symbol with regard to
the success of cognitive operations means optimizing
the interface design. From this perspective the whole
learning process is self-organized, only grounded in
some external feedback. This viewpoint is consis-
tent with Craik’s understanding of complex behaviors
and learning: “We should now have to conceive a ma-
chine capable of modification of its own mechanism
so as to establish that mechanism which was suc-
cessful in solving the problem at hand, and the sup-
pression of alternative mechanisms” (Craik, 1966).
The optimization model offers mechanisms that allow symbols to be grounded with regard to external feedback from problem-solving processes.
While only observing regularities and invariances,
a cognitive system (or “agent”, for short) is able to
act and to predict without internalizing any form of
ontological reality. This machine learning motivated
point of view is related to constructivism, in which the
world remains a black box in the sense of Ernst von
Glasersfeld (Glasersfeld, 1987). From his point of view, experience, i.e., the high-dimensional information perceived by the sense organs, is the only contact a cognitive system has with the ontological reality. If a cognitive system “organizes its experience into viable representation of a world, then one can consider that representation a model, and the ‘outside reality’ it claims to represent, a black box” (Glasersfeld, 1979). A viable represen-
tation already implies a functional component. The
representation must meet a certain level of quality with regard to the fulfillment of the agent’s target.
2 SYMBOLS AND THEIR
MEANING
Before we can discuss the SGP in greater detail, let me review the current state of play. As pointed out above, mainstream AI claims that cognitive operations are carried out on a symbolic level (Newell and Simon, 1976). In this view
I assume that an autonomous agent performs cogni-
tive operations with a symbolic algorithm, i.e., based
on an algorithm that operates on a symbolic level. An
agent is typically part of the actual or a virtual world.
It is situated in a real environment and this is referred
to as “embodied intelligence” (Pfeifer and Iida, 2003).
An embodied agent should physically interact with its
environment and exploit the laws of physics in that
environment, in order to be deeply grounded in its
world. It is able to perceive its environment with various sensors (e.g., a visual system or tactile sensors) that deliver high-dimensional data. This sensory information is then used to build its cognitive structures.
In the following I assume that an artificial agent uses an interface algorithm I that performs a mapping I : D → S from a data sample d ∈ D to a symbol s ∈ S, i.e., I maps subsymbolic data from a high-dimensional set D of input data onto a set of symbols S. The set of symbols is subject to cognitive manipulations A. The interface is the basis of many approaches in engineering and artificial intelligence, although this is not always explicitly stated. The meaning of a symbol s ∈ S is based on its interpretation on the symbolic level. On the one hand symbols are only tokens, which may be defined independently of their shape (Harnad, 1994). On the other hand,
the effect they have on the symbolic algorithm A can
be referred to as the meaning or interpretation of the
symbol. Formally, a symbolic algorithm A performs
(cognitive) operations on a set of symbols S, which is
then the basis of acting and decision making.
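To make the abstraction concrete, the following sketch (illustrative Python, not part of the original system; the class name Interface, the codebook, and the toy decision rule are assumptions) shows how an interface I might map subsymbolic samples d ∈ D to symbol tokens s ∈ S, while a symbolic algorithm A operates on the tokens only:

import numpy as np

class Interface:
    """Maps subsymbolic data d in D to symbol tokens s in S (here: nearest codebook vector)."""
    def __init__(self, codebook):
        self.codebook = np.asarray(codebook)   # one prototype row per symbol

    def __call__(self, d):
        # index of the closest codebook vector serves as the symbol token
        distances = np.linalg.norm(self.codebook - np.asarray(d), axis=1)
        return int(np.argmin(distances))

def symbolic_algorithm(symbols):
    """Placeholder for A: operates on tokens only, never on raw data."""
    return symbols.count(0) > symbols.count(1)   # trivial 'decision'

I = Interface(codebook=[[0.0, 0.0], [5.0, 5.0]])
observations = [[0.2, -0.1], [4.8, 5.3], [0.1, 0.4]]
decision = symbolic_algorithm([I(d) for d in observations])

Any classifier or clustering method could play the role of the nearest-codebook assignment; the point is only that A never touches the raw data.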
In this context Newell and Simon (Newell and Si-
mon, 1976) stated that “a physical symbol system has
the necessary and sufficient means for general intel-
ligent action”. Even if we assume this to be true
and if we have the means to implement these gen-
eral intelligent algorithms, the question of how we can
get a useful physical symbol system remains unan-
swered. How are symbols defined in this symbol sys-
tem, how do they get their meaning? Floridi empha-
sizes that the SGP is an important question in the phi-
losophy of information (Floridi, 2004). It describes
the problem of how words get assigned to meanings
and what meaning is. Related questions have been in-
tensively discussed over the last few decades (Harnad,
1987; Harnad, 1990). Harnad argues that symbols are
bound to a meaning independent of their shape (Har-
nad, 1990). This meaning-shape independence is an
indication that the ontological reality is not reflected
in the shape of a symbol. It is consistent with the observation that the ability to organize the perceived input-output relations is independent of the shape of a symbol. This can also
be observed in a lot of existing machine learning ap-
proaches for artificial agents.
While it may not be difficult to ground symbols in one way or another, answering the question of how an autonomous agent is able to solve this task on its own, thereby elaborating its own semantics, is much more difficult. In biological systems,
genetic preconditions and the interaction with the en-
vironment and other autonomous agents seem to be
the only sources this elaboration is based on. There-
fore, the interpretation of symbols must be a process intrinsic to the symbolic system itself, without the need for external influence. This process allows the agent
to construct a sort of “mental” representation that
increases its chances of survival in its environment.
Harnad derived three conditions from this assump-
tion: First, no semantic resources are preinstalled in
the autonomous agent (no innatism, or nativism re-
spectively), second, semantic resources are not up-
loaded from outside (no externalism), and third, the
autonomous agent possesses its own means to ground
symbols (using sensors, actuators, computational ca-
pacities, syntactical and procedural resources, etc.)
(Harnad, 1990; Harnad, 2007). Taddeo and Floridi
called this the “zero semantical commitment condi-
tion” (Taddeo and Floridi, 2005).
3 MACHINE LEARNING
INTERFACES
How does the interface algorithm I define the sym-
bolic system? In artificial systems it may be im-
plemented by any machine learning algorithm that transfers subsymbolic to symbolic representa-
tions. From the perspective of cognitive economy
and dimensionality reduction respectively, it makes
sense that |S| ≪ |D|; moreover, the dimensionality of the data in D is high while the dimensionality of the symbols is low, in many cases one. Assigning unknown objects to known concepts is known as classification; grouping similar objects into clusters is known as clustering; finding low-dimensional representations for high-dimensional data is known as dimensionality reduction. Symbol grounding is closely related to the
question that arises in machine learning of how to
map high-dimensional data to classes or clusters, in
particular because they are able to represent intrinsic
structures of perceived data, e.g., to detect regulari-
ties and invariances. The concept of labels, classes
or clusters in machine learning is very similar to the
concept of symbols.
In the past, much symbol-grounding-related work concentrated exclusively on neural networks, see (Cangelosi et al., 2002). However, alternative algorithms used for machine learning and data mining are
well suited for these types of tasks, in particular be-
cause they find legitimacy in statistics. They are com-
plemented by runtime and convergence analyses in
computer science, and they have been proven to work
well in plenty of engineering applications. One may
assume that these techniques are also an excellent
basis for interface algorithms that perform a symbol
grounding oriented transformation of subsymbolic to
symbolic representations.
The classification task is strongly related to what
von Glasersfeld described as “equivalence and con-
tinuity”, the process of assimilating object-concept relations, i.e., shaping a present experience to fit a known sensorimotor scheme (Glasersfeld, 1979). To recognize such constructs the agent has to abstract from sensory elements that may vary or that may not be available at some time. Glasersfeld’s “sensory elements” or “structures in the world” correspond to the
high-dimensional features in machine learning. Sim-
ilar arguments hold for the unsupervised counterpart
of classification, i.e., clustering, as I will demonstrate
in the following. By means of the machine learning
algorithms the agent organizes its “particles of expe-
rience”.
With regard to its intrinsic structure, classification,
clustering and dimension reduction yield a reasonable
discretization into meaningful symbols. The mapping
will be data driven and statistically guided. From the
point of view of statistics a data driven interface may
be the most “objective” interface. From the point of
view of the agent the question arises if statistical ob-
jectivity of his symbols is consistent with his needs.
A biological agent is defined by his neurophysiolog-
ical system, an artificial agent by his sensors and al-
gorithms, and perception is a process that depends on
the individual learning history. Consequently, the in-
duced bias is not consistent with statistical objectivity.
It is questionable whether statistical objectivity alone leads to goal-driven symbol grounding.
4 INTERFACE OPTIMIZATION
Feedback is one necessary means for the internal self-
organizing process of knowledge acquisition and or-
ganization. Sun calls this intrinsic intentionality when
he points out that symbols “are formed in relation
to the experience of agents, through their perceptual
/motor apparatuses, in their world and linked to their
goals and actions” (Sun, 2000). I assume that the
agent’s target is known and the fulfillment can be
measured. To bind the symbols and their subsym-
bolic correlate to meanings, the interface is optimized
with regard to the agent’s target. The design of inter-
face I can be formulated as an optimization problem:
we want to find the optimal mapping I* : D → S with regard to the success f_A of the symbolic algorithm A. The optimal interface I* maximizes the success f_A, i.e.,

I* = argmax_I { f_A(I) | I ∈ M },

with the set M of interfaces or interface parameterizations. For this optimization formulation we have to define a quality measure f_A with regard to the symbolic algorithm A. The set M of interfaces may consist of the same algorithm with various parameterizations. In
engineering practice the system constructor does not
spend time on the explicit design of the interface be-
tween subsymbolic and symbolic representations. It
is frequently an implicit result of the modeling pro-
cess, and the system constructor relies on the statisti-
Figure 1: Classic data flow model for classification tasks (lower part), complemented by the optimization module that biases the classification task with regard to an external goal-directed feedback (upper part).
cal capabilities of the learning methods. For the adaptation of an optimal interface I*, a clear optimization objective has to be specified. The main objective is to map high-dimensional sensory data to a meaningful, i.e., viable, set of symbols of arbitrary shape. How can this mapping be measured in terms of a feedback f_A from the symbolic algorithm? The feedback depends on the goal of the autonomous agent. If it can be explicitly expressed by a measure f_A, an optimization algorithm is able to evolve the interface. Figure 1 illustrates the optimization approach exemplarily for a classification task. The optimization module biases the classification task with regard to an external goal-directed feedback.
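Read operationally, the argmax can be approximated by any search over M. The following hypothetical sketch uses a plain grid search over two interface parameters as a stand-in for the evolutionary search employed later; the callables build_interface and evaluate_f_A are assumptions that encapsulate the interface I and the goal-directed feedback f_A:

import itertools

def optimize_interface(build_interface, evaluate_f_A, rho_grid, eps_grid):
    """Search the set M of interface parameterizations and return the
    parameterization with maximal feedback f_A from the symbolic algorithm A."""
    best_params, best_f_A = None, float("-inf")
    for rho, eps in itertools.product(rho_grid, eps_grid):
        interface = build_interface(rho, eps)     # interface I, parameterized by (rho, eps)
        f_A = evaluate_f_A(interface)             # external goal-directed feedback
        if f_A > best_f_A:
            best_params, best_f_A = (rho, eps), f_A
    return best_params, best_f_A

The grid search only illustrates the structure of the loop; any black-box optimizer over M serves the same purpose.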
The optimization formulation yields valuable in-
sights into the SGP. Learning and cognitive informa-
tion processing become a two-level mapping: first from the space of subsymbolic data D to the space of symbols S, and second from there to the meaning
of the symbols. Their semantics are implicitly bound
to the cognitive process A. During interface design,
the first part of the mapping is subject to optimization
while the second part guides this optimization pro-
cess. The whole process yields a grounding of sym-
bols arbitrary in shape, but based on objectives on
the functional level of semantics. Decontextualiza-
tion, i.e., abstracting from particular patterns so that a symbol is able to function in different contexts, is less an interface design problem than a problem on the symbolic level. Feedback varies
from situation to situation and from context to con-
text. A sophisticated system will be able to arrange
the feedback hierarchically, controlling feedback of
one level on a higher level. Here, we simplify the model and have only the feedback in mind that is necessary to ground symbols.
Not only positive but also negative feedback leads to self-regulation processes; it guides the optimization process of the underlying machine learning techniques. In general, the following scenarios for feedback acquisition are possible. In the offline feedback response approach the symbolic algorithm runs for a defined time, e.g., until a termination condition is met, and propagates a feedback f_A that reflects its success back to the optimization algorithm. If interface design is the only optimization objective, the system will adapt the interface to achieve a maximal response. This process might be quite slow if the symbolic algorithm is supposed to run for a long time to yield f_A. The offline feedback does not contradict the zero semantical commitment condition, as the feedback is the only source that guides the agent’s perception of the experiential world.
5 A CASE STUDY
To concretize the described optimization perspective,
I present an artificial toy scenario that is meant to il-
lustrate the working mechanisms of the proposed ap-
proach. I start with an overview:
Perception. Random Gaussian data clouds are produced at different times, representing subsymbolic observations d ∈ D.
Interface. A clustering approach I clusters the data observations depending on two parameters, leading to an assignment of observations to symbols.
Hebbian Translation and Inference. Similar to the Hebbian/STDP learning rule (STDP means spike-timing-dependent plasticity and is known to be one of the most important learning rules in the human brain), temporal information is used to translate concepts into propositional formulas. Basic inference processes (A) are used to evaluate the interface.
Optimization. Free parameters are optimized, in
particular w.r.t. the interface, i.e., kernel density
clustering parameters.
5.1 Perception
Let us assume that a cognitive agent observes data clouds consisting of N-dimensional points x ∈ R^N. These clouds represent the subsymbolic sensory input it perceives in an observation space. The temporal context of appearance and disappearance will induce a causal meaning. Observations that belong to one concept are subsumed to a concept d_i = {x_1, . . . , x_N} at time t. Such a data cloud is produced with the Gaussian distribution N(ν, σ), i.e., Gaussian distributed numbers with expectation value ν and standard deviation σ. Temporal information like the appearance and disappearance of data clouds is determined by a fixed scheme, see Section 5.3.
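A minimal way to generate such observations in a simulation might look as follows (illustrative Python; the concrete schedule of appearance times and cloud parameters is an assumption, since the text only states that a fixed scheme is used):

import numpy as np

rng = np.random.default_rng(0)

def data_cloud(mean, sigma, n_points=50):
    """Sample a Gaussian data cloud N(mean, sigma) of n_points observations."""
    mean = np.asarray(mean, dtype=float)
    return rng.normal(loc=mean, scale=sigma, size=(n_points, mean.size))

# hypothetical fixed scheme: (appearance time t, cloud mean, standard deviation)
schedule = [(0.0, [0.0, 0.0], 0.3), (0.5, [4.0, 4.0], 0.3), (1.0, [0.0, 4.0], 0.3)]
observations = [(t, data_cloud(mean, sigma)) for t, mean, sigma in schedule]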
5.2 Interface
The machine learning interface has the task of assigning the observations to symbols. We employ a clustering approach that assigns each cluster to a symbol, namely a simple kernel density clustering scheme described in the following. Cluster centers are placed in regions of high kernel density, but with a minimum distance to neighboring high-density regions. For this sake, iteratively the points with the highest relative kernel density (Parzen, 1962)

d(x_j) = Σ_{i=1, i≠j}^{N} K_H(x_i − x_j) > ε    (1)

are identified, with a minimum distance ρ ∈ R+ to the previously computed codebook vectors C, i.e., d(x_j, c_k) > ρ for all c_k ∈ C, and a minimum kernel density ε ∈ R+. These points are added to the set C of cluster-defining codebook vectors. A symbol s_i corresponds to a codebook vector c_i ∈ C together with all closest observations, resulting in a Voronoi tessellation of the observation space D. The clustering result significantly depends on the two free parameters ρ and ε, which will be subject to the optimization process.
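A sketch of this clustering scheme, under the assumption of a Gaussian kernel K_H with bandwidth h, could read as follows (variable names mirror the notation above; the helper names are not from the paper):

import numpy as np

def kernel_density_codebooks(X, rho, eps, h=1.0):
    """Select codebook vectors: points of highest kernel density (Eq. 1) that exceed eps
    and keep a minimum distance rho to previously chosen codebook vectors."""
    X = np.asarray(X)
    diffs = X[:, None, :] - X[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    kernels = np.exp(-sq_dists / (2.0 * h ** 2))   # Gaussian kernel K_H
    density = kernels.sum(axis=1) - 1.0             # exclude the i = j term
    codebook = []
    for j in np.argsort(-density):                  # highest density first
        if density[j] <= eps:
            break
        if all(np.linalg.norm(X[j] - c) > rho for c in codebook):
            codebook.append(X[j])
    return np.array(codebook)

def to_symbols(X, codebook):
    """Assign each observation to the symbol of its closest codebook vector (Voronoi cells)."""
    dists = np.linalg.norm(np.asarray(X)[:, None, :] - codebook[None, :, :], axis=-1)
    return np.argmin(dists, axis=1)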
5.3 Hebbian Translation and Inference
From the temporal information, i.e., appearance, dis-
appearance, and order, logical relations are induced,
and translated into propositional logic formulas. This
is an important and new step, and probably the most
interesting contribution of this toy scenario. We em-
ploy two important rules.
1. If two symbols occur at once, e.g., s_1 at t_1 and s_2 at t_2 with |t_1 − t_2| ≤ θ_1, this event induces the formula s_1 ∧ s_2. Two concepts that occur at once are subsumed and believed to belong together from a logical perspective. This rule can be generalized to more than two symbols.

2. If symbol s_2 at t_2 occurs within the time window [θ_1, θ_2] after symbol s_1 at t_1, i.e., if θ_1 < t_2 − t_1 < θ_2, this induces the implication rule s_1 → s_2. This means that s_2 follows from the truth of s_1 (another interpretation is that s_2 may be caused by s_1).
Translated into propositional logic formulas, inference processes become possible, which represent higher cognitive processes. For this sake a simple inference engine is employed. The logical formulas induced by the Hebbian process are compared to the original formulas that are the basis of the data generation process. They are evaluated by testing a set of evaluation formulas, each consisting of a formula A over the set of symbols together with a truth value in {true, false} corresponding to the data-generating set.
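The two temporal rules can be sketched as follows (illustrative Python; the representation of formulas as tuples and the example occurrences are assumptions):

def induce_formulas(occurrences, theta1, theta2):
    """Translate temporal co-occurrence into propositional formulas.
    Rule 1: near-simultaneous symbols yield a conjunction ('and').
    Rule 2: ordered occurrence within [theta1, theta2] yields an implication ('implies')."""
    formulas = set()
    for s1, t1 in occurrences:
        for s2, t2 in occurrences:
            if s1 == s2:
                continue
            if abs(t1 - t2) <= theta1:
                formulas.add(("and", frozenset((s1, s2))))
            elif theta1 < t2 - t1 < theta2:
                formulas.add(("implies", s1, s2))
    return formulas

# hypothetical example: symbols 0 and 1 co-occur, symbol 2 follows symbol 0
occurrences = [(0, 0.00), (1, 0.01), (2, 0.30)]
print(induce_formulas(occurrences, theta1=0.05, theta2=0.5))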
5.4 Optimization
The optimization process has the task to find parame-
ter settings for ρ and ε that allow an optimal inference
process w.r.t. a set of evaluation formulas. For this
sake we employ a (µ + λ)-evolution strategy (Beyer
and Schwefel, 2002). To evaluate the quality of the
interface, we aggregate two indicators: (1) the number N_f of concepts (clusters) that have been found in relation to the real number of symbols N_s, and (2) the number K of correct logical values when testing the evaluation set formulas for feasibility. Both indicators are subsumed into a minimization problem expressed in the fitness function

f_A := |N_f − N_s| − K.    (2)

The problem of assigning the symbols to the correct atoms is solved by trying all possible assignments; the highest number K of matching logical values is used for evaluation. From another perspective, f_A is a measure for the performance of the higher cognitive process A.
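Under these definitions, the fitness evaluation and the outer optimization loop might be sketched as follows (a simplified stand-in for a full (µ + λ)-evolution strategy, using isotropic Gaussian mutation and plus-selection; the callables cluster, infer and eval_candidate are assumptions that wrap the interface and the inference engine):

import numpy as np

rng = np.random.default_rng(1)

def fitness(rho, eps, cluster, infer, evaluation_set, n_symbols):
    """f_A := |N_f - N_s| - K (to be minimized): penalize a wrong number of found
    concepts N_f and reward the number K of correctly reproduced logical values."""
    n_found, induced_formulas = cluster(rho, eps)           # interface I with parameters (rho, eps)
    K = sum(infer(induced_formulas, A) == truth for A, truth in evaluation_set)
    return abs(n_found - n_symbols) - K

def es_optimize(eval_candidate, mu=15, lam=100, sigma=0.1, generations=200):
    """Minimal (mu + lambda)-ES on the two interface parameters (rho, eps)."""
    parents = [rng.uniform(0.0, 1.0, size=2) for _ in range(mu)]
    for _ in range(generations):
        offspring = [np.clip(parents[rng.integers(mu)] + sigma * rng.normal(size=2), 0.0, None)
                     for _ in range(lam)]
        pool = parents + offspring
        pool.sort(key=lambda x: eval_candidate(*x))          # plus selection: best mu survive
        parents = pool[:mu]
    return parents[0]

The sketch omits step-size adaptation and the stagnation-based stopping criterion used in the experiments.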
5.5 Results
As a first simple test case 15 logical formulas with
10 atoms (symbols) have been generated. As the evaluation set, 20 formulas, 10 feasible and 10 infeasible, are used. Each symbol is represented by a data cloud with a different parameterization. A (15 + 100)-ES optimizes the parameters of the kernel density clustering heuristic. The optimization is stopped when no improvement has been achieved for t = 100 generations. The experiments have shown that the system was able to evolve reasonable parameters for ρ and ε. The clustering process is able to identify and distinguish between concepts. In most runs the correct number of symbols has been retrieved from the data clouds; the remaining runs deviate by at most 3 symbols.
In 83 of 100 runs at least 15 formulas of the evalu-
ation set match the observations. A careful experi-
mental analysis going beyond this case study, and an extensive depiction of experimental results and technical details, will be the subject of future work. Nevertheless, I hope to have demonstrated how the machine symbol grounding perspective can be instantiated.
6 CONCLUSIONS
There are many similarities between constructivism
and the machine learning perspective. An au-
tonomous agent builds its own model of the per-
ceived environment, i.e., an individual representation
depending on its physical makeup and its algorithmic
learning capabilities. This point of view blends ma-
chine learning with constructivism resulting in some
sort of “machine constructivism”, i.e., an epistemol-
ogy of artificial autonomous systems that consist of
algorithmic and physical (real or virtual) machine en-
tities. It postulates that machine perception never
yields a direct representation of the real world, but
a construct of sensory input and its machine. Conse-
quently, “machine objectivity” in the sense of a con-
sistence between a perceived constructed model and
reality is impossible. Every machine perception is
subjective, in particular with regards to its sensors and
learning algorithms. Currently, machine subjectivity
is mainly determined by a statistical perspective and
biased by a set of test problems in machine learning
literature. Input-output relations, i.e., regularities and
invariances, are statistically analyzed and mapped to
internal machine representations. They are mainly
guided by learning algorithms and statistical formu-
las. However, from a machine learning perspective,
the perception of humans and other organisms is de-
termined by their physical and neural composition.
A target-oriented optimization process binds symbol
grounding to perception and a meaningful construc-
tion of representations of the environment. As future
technical work, we will conduct a careful experimen-
tal evaluation of the proposed artificial case study, and
extend the approach to real-world scenarios.
REFERENCES
Cangelosi, A., Greco, A., and Harnad, S. (2002). Symbol grounding and the symbolic theft hypothesis, pages 91–210. Springer.
Beyer, H.-G. and Schwefel, H.-P. (2002). Evolution strate-
gies - A comprehensive introduction. Natural Com-
puting, 1:3–52.
Craik, K. (1966). The Nature of Psychology. The University
Press, Cambridge.
Floridi, L. (2004). Open problems in the philosophy of in-
formation. Metaphilosophy, 35:554–582.
Glasersfeld, E. (1979). Cybernetics, experience, and the
concept of self. A cybernetic approach to the assess-
ment of children: Toward a more humane use of hu-
man beings, pages 67–113.
Glasersfeld, E. (1987). Wissen, Sprache und Wirklichkeit.
Vieweg, Wiesbaden.
Harnad, S. (1987). Categorical perception: the groundwork
of cognition. Applied systems and cybernetics, pages
287–300.
Harnad, S. (1990). The symbol grounding problem. Physica
D, pages 335–346.
Harnad, S. (1994). Computation is just interpretable symbol
manipulation: Cognition isn’t. Minds and Machines,
4:379–390.
Harnad, S. (2007). Symbol grounding problem. Scholarpe-
dia, 2(7):2373.
Newell, A. and Simon, H. A. (1976). Computer science as
empirical inquiry: Symbols and search. In Communi-
cations of the ACM, pages 113–126. ACM.
Parzen, E. (1962). On the estimation of a probability density
function and mode. Annals of Mathematical Statistics,
33:1065–1076.
Pfeifer, R. and Iida, F. (2003). Embodied artificial intelli-
gence: Trends and challenges. Embodied Artificial
Intelligence, pages 1–26.
Russell, S. J. and Norvig, P. (2003). Artificial Intelligence:
A Modern Approach. Pearson Education, 2. edition.
Searle, J. R. (1980). Minds, brains, and programs. Behav-
ioral and Brain Sciences, 3:417–457.
Sun, R. (2000). Symbol grounding: A new look at an old
idea. Philosophical Psychology, 13:149–172.
Taddeo, M. and Floridi, L. (2005). Solving the symbol
grounding problem: a critical review of fifteen years
of research. Journal of Experimental and Theoretical
Artificial Intelligence, 17(4):419–445.