Beyond Machine Learning: Autonomous Learning
Frédéric Alexandre 1,2,3
1 Inria Bordeaux Sud-Ouest, 200 Avenue de la Vieille Tour, 33405 Talence, France
2 LaBRI, Université de Bordeaux, Institut Polytechnique de Bordeaux, CNRS, UMR 5800, Talence, France
3 Institut des Maladies Neurodégénératives, Université de Bordeaux, CNRS, UMR 5293, Bordeaux, France
Keywords: Machine Learning, Computational Neuroscience, Autonomous Learning.
Abstract: Recently, Machine Learning has achieved impressive results, surpassing human performance, but these powerful algorithms are still unable to define their goals by themselves or to adapt when the task changes. In short, they are not autonomous. In this paper, we explain why autonomy is an important criterion for truly powerful learning algorithms. We propose a number of characteristics that make humans more autonomous than machines when they learn. Humans have a system of memories in which one memory can compensate for or train another if needed. They are able to detect uncertainties and adapt accordingly. They are able to define their goals by themselves, from internal and external cues, and are capable of self-evaluation to adapt their learning behavior. We also suggest that introducing these characteristics into the domain of Machine Learning is a critical challenge for future intelligent systems.
1 INTRODUCTION
Machine Learning has achieved impressive results
recently, for a variety of applications ranging from
text and natural scene labeling (Farabet et al., 2013)
to games as difficult as Go (Clark and Storkey,
2015). Notably, these results have been described as
surpassing human performance, but such statements
deserve some comment. It is true that the
AlphaGo computer program by Google DeepMind
was the first to beat a professional player and that
some software tools developed by Facebook are able
to analyze a thousand posts per second, clearly
beyond human capability. These undoubtedly im-
pressive results mainly rely on one major character-
istic of their underlying algorithms, such as Deep Learn-
ing (LeCun et al., 2015): their learning is based on a
combinatorial analysis of cases extracted from huge
databases, specifically dedicated to their narrow do-
main of expertise. AlphaGo can only play Go. A deep
learning system trained to extract faces from pictures
can only identify faces.
In contrast, one remarkable characteristic of hu-
man learning is that, whereas we are generally not
excellent in any one specific domain, we are quite good in
most of them. In addition, we are able to adapt when a
new problem appears. This high level of adaptability
can also be seen in other ways. With neither explicit la-
bels nor data preprocessing or segmentation, we are
able to pay attention to important information and ne-
glect noise. We define our goals by ourselves and
the means to reach them, we self-evaluate our perfor-
mances and re-exploit previously learned knowledge
and strategies in different contexts. All these
characteristics are notably absent from classical Ma-
chine Learning approaches.
In summary, whereas Machine Learning demon-
strates impressive brute-force learning in spe-
cific domains, humans are versatile and adaptable and
can learn in a changing and uncertain world: we are
good at autonomous learning. Comparing both kinds
of learning is difficult because they address differ-
ent characteristics; nevertheless, autonomous learning
is probably an essential characteristic when one wants
to embed intelligent modules in robots or in interfaces
dedicated to acting in the real world. It is consequently
important to ask how Machine Learning could integrate
more autonomy, instead of just developing more power,
as most research programs mainly propose.
The goal of this position paper is not to compare
the performances of two approaches, bio-inspired and
purely based on mathematics. Neither is it to give
precise recipes to improve existing techniques. In-
stead, we would like to convince the reader that au-
tonomy is a primary property for learning and to in-
troduce two kinds of information that might be useful
to develop more autonomous machine learning. On
the one hand, we would like to define more precisely
what being autonomous means: even if this property
may seem obvious, it is sometimes difficult to con-
cretely explain which characteristics are the bases of
an autonomous analysis of our situation and of the
best decision to make. On the other hand, we pro-
pose to mention the main cerebral structures and cir-
cuits involved in these facets of autonomous behav-
ior. This might be useful to decide for future research
topics. To do so, we propose to describe here a se-
ries of characteristics of the human brain that partici-
pate in our capability to learn autonomously, with
the idea that introducing these characteristics in arti-
ficial neural systems could orient Machine Learning
toward Autonomous Machine Learning.
2 AN INTERACTING SYSTEM OF
MEMORIES
It has been known for a long time (Squire, 1992) that specific
circuits in the brain are mobilized to learn explicit
knowledge and others to learn procedures. These
functions have been respectively addressed by recur-
rent (Hopfield, 1982) and layered (Rosenblatt, 1958)
neural networks, the latter being the ancestor of the
deep learning networks. It has also been advocated
(Alexandre, 2000) that realistic models of memory
should include both kinds of networks to be able to
learn by heart specific events as well as generalize
some skills from a set of experiences.
Besides modeling these circuits, studying their in-
teractions is also crucial to understand how one sys-
tem can compensate for or supervise another system,
resulting in more autonomous learning. In particu-
lar, interaction between systems can lead to situations
where one system can propose an answer (possibly of
lower quality) if the other one is too specific or is not
trained enough or even if it has been damaged. It can
consequently give more time to the other one to be-
come more general or more mature, or to recover. As
we will exemplify below, interaction between systems
can also more directly give the opportunity to one sys-
tem to send well selected cases to train the other sys-
tem. In both cases, the mechanisms are internal and
do not require help from the external world, hence an
increased autonomy.
For example, in the domain of perceptual learning
in the medial temporal lobe, models of the hippocam-
pus can store important events in episodic memory
in a single trial (Kassab and Alexandre, 2015; O’Reilly
and Rudy, 2001). This neural structure is also known
(McClelland et al., 1995) to later form new semantic
categories by consolidation in other circuits. In
the domain of decision-making in the loops between
the prefrontal cortex and the basal ganglia, models of
cerebral mechanisms are currently being developed (Piron
et al., 2016), by which goal-directed behavior relying
on explicit evaluation of expected rewards can later
become habits, automatically triggered with less flex-
ibility but increased effectiveness.
In both cases (either perceptual or motor), the
strategy of learning is first to store some specific cases
of interest and to recall them if necessary. Then if it
appears that similar cases frequently occur, the strat-
egy will be to find some generality and build a generic
rule or procedure to deal with such cases. Building
such rules from the initially stored cases has several
advantages, among which autonomy is not the least.
Understanding how both kinds of memory cooperate
can lead to an autonomous learning system, able to
cope with facts and rules, to elaborate rules from se-
lected facts and to decide which kind of memory is
the most adapted to the current situation.
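
To make this cooperation concrete, the following minimal sketch in Python (all class and method names are ours, for illustration only) pairs a one-shot episodic store with a slowly trained generalizer, in the spirit of complementary learning systems: the episodic store learns cases by heart in a single trial and answers first, while replaying selected cases to train the generalizing memory, which takes over when no specific case applies.

import numpy as np

class EpisodicStore:
    """One-shot memory: stores exact (situation, outcome) pairs."""
    def __init__(self):
        self.keys, self.values = [], []

    def store(self, x, y):
        self.keys.append(np.asarray(x, dtype=float))
        self.values.append(float(y))

    def recall(self, x, tol=0.1):
        # Answer only if a stored case is close enough to the query.
        if not self.keys:
            return None
        d = [np.linalg.norm(k - np.asarray(x, dtype=float)) for k in self.keys]
        i = int(np.argmin(d))
        return self.values[i] if d[i] < tol else None

class SemanticMemory:
    """Slow generalizer: here, a linear model trained by small gradient steps."""
    def __init__(self, dim, lr=0.05):
        self.w = np.zeros(dim)
        self.lr = lr

    def predict(self, x):
        return float(self.w @ np.asarray(x, dtype=float))

    def train_on(self, x, y):
        x = np.asarray(x, dtype=float)
        self.w += self.lr * (y - self.predict(x)) * x

class DualMemory:
    """The episodic memory answers first and replays cases to train the other."""
    def __init__(self, dim):
        self.episodic = EpisodicStore()
        self.semantic = SemanticMemory(dim)

    def learn(self, x, y):
        self.episodic.store(x, y)          # learned by heart in a single trial

    def consolidate(self, n_replays=100, seed=0):
        # Offline replay: selected stored cases train the generalizing system.
        rng = np.random.default_rng(seed)
        for _ in range(n_replays):
            i = rng.integers(len(self.episodic.keys))
            self.semantic.train_on(self.episodic.keys[i], self.episodic.values[i])

    def answer(self, x):
        y = self.episodic.recall(x)        # specific case first...
        return y if y is not None else self.semantic.predict(x)  # ...else the rule

Before consolidation, the episodic store alone can answer (possibly only for near-identical cases); after replay, the semantic memory provides a lower-quality but general answer for novel situations, with no external supervision required.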
3 COPING WITH UNCERTAINTY
We learn the rules that govern the world, which is
uncertain for two main reasons: it can be
predictable only up to a certain level (stochastic rules) or
non-stationary (changing rules). Whereas standard
probabilistic models are rather good at tackling the
first kind of uncertainty, non-stationarity in a dynamic
world raises more difficult problems (Cohen et al.,
2007). Learning in autonomy in the real (and hence
uncertain) world consequently implies being able to
autonomously give the best explanation for the fact
that a previously valid rule has given an unsatisfac-
tory result: is it just noise, or has the rule changed?
It also implies, of course, triggering the correspond-
ing best answer (respectively, modifying the level of
stochasticity associated with the rule or, more critically,
selecting another rule from the set of previously designed
rules or elaborating a new one).
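
To make this diagnosis concrete, here is a toy Python sketch (our own illustrative formulation, not one of the cited models) that monitors the recent history of outcomes to decide whether a failure is compatible with the rule's known stochasticity or rather suggests that the rule has changed; the window size and threshold are illustrative choices.

import numpy as np
from collections import deque

def monitor_uncertainty(outcomes, short=10, z_threshold=2.0):
    """Distinguish expected noise from a suspected rule change.

    outcomes is a sequence of 0/1 flags (1 = the current rule handled the
    event correctly). A long-run estimate of the success rate captures the
    expected (stochastic) uncertainty; a recent window that falls below it
    by more than z_threshold standard errors signals unexpected uncertainty,
    i.e. a likely change of rule."""
    window = deque(maxlen=short)
    p_hat, n = 0.5, 1                      # running success-rate estimate
    for t, o in enumerate(outcomes):
        window.append(o)
        verdict = "within expected noise"
        if len(window) == short:
            recent = float(np.mean(window))
            se = np.sqrt(p_hat * (1.0 - p_hat) / short) + 1e-9
            if (p_hat - recent) / se > z_threshold:
                verdict = "rule change suspected"
        yield t, verdict
        n += 1
        p_hat += (o - p_hat) / n           # slow update of the baseline

# Example: a rule that had been reliable suddenly stops working.
outcomes = [1] * 45 + [0] * 15             # illustrative stream
for t, v in monitor_uncertainty(outcomes):
    if v != "within expected noise":
        print(t, v)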
Concerning the first point, we are studying how
regions of the medial prefrontal cortex detect and
evaluate the kind and level of uncertainty by
monitoring the recent history of performance at
correctly managing incoming events (Carrere and Alexan-
dre, 2016a). Concerning the second point, these pre-
frontal regions and other cerebral regions sensitive
to reward prediction errors are also reported to acti-
vate the release of neuromodulators like monoamines,
known to play a central role in adaptation to uncer-
tainties (Doya, 2002; Alexandre and Carrere, 2016).
Acetylcholine has been reported to be an impor-
tant factor in the case of stochasticity (Yu and Dayan,
2005) and has been modeled as increasing the
signal-to-noise ratio in the sensory cortex (Pauli and
O’Reilly, 2008) and promoting learning about the
context (Carrere and Alexandre, 2015). We have also
recently studied the role of noradrenaline in non-
stationary environments (Carrere and Alexandre, 2016b) and
have proposed a biologically inspired model in which
the balance between exploration and exploita-
tion of the sensory cues associated with a rule can be mod-
ified by the action of noradrenaline on critical cere-
bral regions. Similarly, the tonic level of dopamine
has been reported to increase when a non-stationary
environment is detected and to modify the bal-
ance between exploration and exploitation of the motor
aspects of the rule (Humphries et al., 2012). Alto-
gether, the brain can be presented as a system able to
autonomously detect the level of uncertainty and to
autonomously modify the way to exploit and update
previously acquired rules or to design new rules, with
the help of neuromodulators, seen as a meta-learning
system modifying hyperparameters of learning algo-
rithms (Doya, 2002).
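
Seen from the Machine Learning side, such a neuromodulatory system amounts to a meta-level controller acting on hyperparameters. The following sketch (an illustrative reduction, not a model of any specific circuit) uses the exploration temperature of a softmax policy as the hyperparameter that an uncertainty signal, like the one sketched above, can transiently boost; the boost factor is an assumption, not a value from the literature.

import numpy as np

def softmax_policy(q_values, temperature):
    """Boltzmann action selection; the temperature is the knob that a
    neuromodulator-like signal can turn."""
    z = np.asarray(q_values, dtype=float) / max(temperature, 1e-6)
    p = np.exp(z - z.max())                # numerically stable softmax
    return p / p.sum()

def modulated_temperature(base_temp, change_suspected, boost=5.0):
    """Meta-learning step in the spirit of Doya (2002): when a rule change
    is suspected, transiently raise the temperature to favor exploration of
    alternative rules; otherwise keep the low, exploitative value."""
    return base_temp * boost if change_suspected else base_temp

q = [1.0, 0.8, 0.2]
print(softmax_policy(q, modulated_temperature(0.1, False)))  # peaked: exploit
print(softmax_policy(q, modulated_temperature(0.1, True)))   # flatter: explore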
4 DEFINING GOALS BY
EMBODIMENT AND
EMOTIONAL LEARNING
One major difference between artificial and natu-
ral learning systems is that the latter can au-
tonomously detect and define their goals in the sur-
rounding environment. This is important for choosing
what to learn and for orienting attentional systems accord-
ingly. Instead of learning from a corpus prepared of-
fline, learning is done online and adapted to what has
happened during behavior. In addition, if learn-
ing is centred on cues that are important for the agent,
like mates, prey or predators, the agent will probably
be more efficient than if learning is done from a
randomly sampled corpus.
The ability to detect one's goals relies on several
ingredients. First, our body itself tells us by interocep-
tion (Craig, 2003) what is good or bad for us, what
must be sought or avoided. It is consequently im-
portant for a really autonomous learning agent not
only to have a model of the brain with classical cog-
nitive functions related to perception, learning, atten-
tion or deliberation but also to feed that model with
information coming from a substrate corresponding
to the body, including sensors for pain and pleasure.
In the cerebral system, the perceptual system
is pre-wired to automatically detect biologically-
significant aversive and appetitive (emotional) stimuli
and to trigger pavlovian reflexes (Kim et al., 2013).
Subsequently, pavlovian learning makes it possible to
anticipate these stimuli through the detection of predictive
stimuli that will in turn trigger preparatory behavior
(Cardinal et al., 2002). All these stimuli are key tar-
gets for attentional processing and correspond to the
main goals organizing the behavior. Learning to de-
tect them automatically is consequently important for
autonomy, since attentional and learning systems will
be fed by an over-representation of these meaningful
examples, in contrast to artificial systems that only
learn from a stereotyped and artificially prepared cor-
pus. It is consequently important to propose models
implementing these pavlovian mechanisms (Krasne
et al., 2011; Carrere and Alexandre, 2015) and also
the effects of pavlovian responses on the body and
the neuromodulatory system (Carrere and Alexandre,
2016b).
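
The classical formal counterpart of this anticipatory pavlovian learning is the Rescorla-Wagner rule, sketched below in Python as a stand-in for the richer amygdala models cited above; the cue vectors and learning rate are illustrative.

import numpy as np

def rescorla_wagner(trials, n_cues, alpha=0.1):
    """Textbook Rescorla-Wagner account of pavlovian anticipation.

    trials: list of (cues, us) pairs, where cues is a 0/1 vector of
    conditioned stimuli present on the trial and us is 1 if the
    biologically significant stimulus occurs. The associative strengths V
    come to predict the US from the cues; alpha is the learning rate."""
    V = np.zeros(n_cues)
    for cues, us in trials:
        cues = np.asarray(cues, dtype=float)
        prediction_error = us - V @ cues   # surprise drives learning
        V += alpha * prediction_error * cues
    return V

# Example: a light (cue 0) always precedes a shock; a tone (cue 1) never does.
trials = [([1, 0], 1), ([0, 1], 0)] * 50
print(rescorla_wagner(trials, n_cues=2))   # cue 0 nears 1, cue 1 stays near 0

Once the light reliably predicts the shock, it becomes a key target for attention and a goal organizing behavior, without any external labeling of the corpus.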
5 MOTIVATION AND
SELF-EVALUATION
Specifying relations between the brain and the body is
also an opportunity to introduce physiological needs,
fundamental for considering internal goals in addition to
the external goals evoked above. Indeed, for an agent
that learns autonomously, one of the major constraints
is to survive by keeping some internal variables within
vital bounds. One of its primary goals will be to
elaborate and select behaviors that help control these
variables, which can be done in autonomy rather than
by obeying a supervising system.
It is consequently required that the critical internal
variables be perceived, carefully processed and partici-
pate in decision making: they are undoubtedly key
cues for an organization of behavior decided in full
autonomy.
Such a consideration is also the basis for renewed
approaches regarding reinforcement learning. Indeed,
an important aspect of autonomous decision mak-
ing is to be able to adapt the behavior as a compro-
mise between external goals (what should be sought
and avoided) and internal goals (what are the current
needs), whereas classical reinforcement learning gen-
erally relies on optimizing a simple scalar represent-
ing an abstract reward, artificially given in some sit-
uations. To progress in that aim, it is fundamental
to better understand how internal and external goals
(motivational and emotional cues) are combined in
decision making (Zahm, 1999; Mannella et al., 2013;
Kolling et al., 2016).
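
A minimal way to express this compromise in code is to derive the reward signal itself from the internal variables, as in the following Python sketch; the setpoints and state vectors are hypothetical, and this is one simple formulation among the homeostatic approaches alluded to here, not the model of any cited work.

import numpy as np

SETPOINTS = np.array([0.7, 0.5])  # e.g., ideal energy and hydration (illustrative)

def drive(internal_state):
    """Homeostatic drive: distance of internal variables from their setpoints."""
    return np.linalg.norm(SETPOINTS - np.asarray(internal_state, dtype=float))

def homeostatic_reward(state_before, state_after):
    """Reward as drive reduction: an action is rewarding insofar as it brings
    vital variables back toward their setpoints, rather than because an
    external supervisor hands out an abstract scalar."""
    return drive(state_before) - drive(state_after)

# Eating when energy is low is rewarding; eating when sated is not.
print(homeostatic_reward([0.2, 0.5], [0.6, 0.5]))   # > 0
print(homeostatic_reward([0.7, 0.5], [0.9, 0.5]))   # < 0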
In humans, another important source of infor-
mation for learning autonomously is based on self-
evaluation of performance. Not only can this infor-
mation indicate what to learn but, more im-
portantly, it can orient behavior toward situ-
ations that are best adapted to current expertise,
giving rise to what are called intrinsic motivations (Oudeyer
et al., 2007). It is noticeable that both motivation
and self-evaluation processing are central in cognitive
control (Koechlin et al., 2003) and are reported to be lo-
cated in the anterior part of the prefrontal cortex, which
should consequently be a critical target for future re-
search, particularly to understand how behavioral
rules are elaborated and selected from self-evaluation
in the prefrontal cortex (Badre, 2008; Donoso et al.,
2014).
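
One concrete formulation of such intrinsic motivation, loosely following Oudeyer et al. (2007), measures learning progress as the recent decrease of prediction error for each candidate activity and directs exploration toward the activity where progress is currently highest. The Python sketch below uses illustrative window sizes and error traces; it is a simplification of the cited architecture, not a reimplementation of it.

import numpy as np

def learning_progress(error_history, window=5):
    """Intrinsic reward as learning progress: the recent decrease of
    prediction error on a given activity. Mastered activities (flat, low
    error) and unlearnable ones (flat or noisy error) yield low progress;
    activities at the frontier of current expertise yield high progress."""
    e = np.asarray(error_history, dtype=float)
    if len(e) < 2 * window:
        return 0.0
    return float(e[-2 * window:-window].mean() - e[-window:].mean())

def choose_activity(error_histories):
    """Self-evaluation in action: pick the activity currently learned fastest."""
    return int(np.argmax([learning_progress(h) for h in error_histories]))

rng = np.random.default_rng(0)
histories = [
    [0.05] * 20,                                # already mastered
    list(np.linspace(0.9, 0.3, 20)),            # steadily improving
    list(0.5 + 0.1 * rng.standard_normal(20)),  # unlearnable noise
]
print(choose_activity(histories))               # selects the improving activity: 1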
6 DISCUSSION
In this position paper, we have first noticed that nat-
ural and artificial learning systems differ because,
whereas the latter are high-level specialists in
a restricted and artificially sampled domain, the for-
mer are rather characterized by their intrinsic
adaptability to any situation and their capacity to up-
date their knowledge and skills accordingly. Basi-
cally, this refers to the function of learning in living
systems. Their primary goal is to survive and breed
in an unknown environment. As their environment is
generally too complex and changing to only consider
pre-wired behavioral rules, their knowledge and skills
must evolve and adapt to what is perceived internally
and externally.
Of course, even in living agents, a part of the adap-
tation can be dictated during epigenesis by external
resources, like genes or social and educational envi-
ronments, but, most of the time, this adaptation must
be done in autonomy and is the main goal of learn-
ing processes. In this paper, we have consequently
set the focus on autonomy, seen as a primary ingredi-
ent of learning, and we have explored more precisely
several characteristics that, we believe, are the way
autonomy can be expressed during learning. In short,
we propose that autonomous learning is made easier
because our different systems of memory can inter-
act and exchange information, because we are able to
estimate the kind of uncertainty we are facing and to
propose the suitable adaptation, because we can per-
ceive important noxious and positive events in the en-
vironment and inside our body, learn to predict them
and build the underlying behavioral rules to optimize
in some way their occurrence or avoidance.
We have also mentioned the hypothesized under-
lying cerebral circuitry for each of these mechanisms
and have reported modeling efforts to better under-
stand them. Importantly, we think that these mod-
els are important not only for computational neuro-
science but also for Machine Learning. Transferring
these principles to classical learning algorithms
would endow them with more autonomy, which is
critical in a context where the aim is increasingly
to integrate Artificial Intelligence into autonomous
agents and interfaces.
Up to now, we have only enumerated a list of char-
acteristics, whereas an essential goal for a real auton-
omy would be to integrate all of them in an agent,
corresponding to a physically identified and separated
entity, thrown into an unknown environment with the
instruction to survive and no subsequent help.
In this perspective, we have recently designed a
software platform (Denoyelle et al., 2014) where an
autonomous agent with an artificial body can explore
an unknown virtual world. The platform is designed
in such a way that the characteristics of the agent’s
body and of the environment can be easily specified
and long lasting experiments can be run to evaluate
its survival performances. The main challenge is now
to integrate more and more sophisticated versions of
the mechanisms evoked above to better understand
how they interact and how a viable Autonomous Ma-
chine Learning framework can be defined. We are
also presently finding that, beyond Machine
Learning, this numerical testbed is a valuable
simulation tool for our medical and neuroscience col-
leagues.
REFERENCES
Alexandre, F. (2000). Biological Inspiration for Multiple
Memories Implementation and Cooperation. In
Kvasnicka, V., Sincak, P., Vascak, J., and Mesiar, R., ed-
itors, International Conference on Computational In-
telligence.
Alexandre, F. and Carrere, M. (2016). Modeling neuromod-
ulation as a framework to integrate uncertainty in gen-
eral cognitive architectures. In Steunebrink, B., Wang,
P., and Goertzel, B., editors, Artificial General Intelli-
gence: 9th International Conference, AGI 2016, New
York, NY, USA, July 16-19, 2016, Proceedings, pages
324–333. Springer International Publishing.
Badre, D. (2008). Cognitive control, hierarchy, and the ros-
trocaudal organization of the frontal lobes. Trends in
Cognitive Sciences, 12(5):193–200.
Cardinal, R. N., Parkinson, J. A., Hall, J., and Everitt, B. J.
(2002). Emotion and motivation: the role of the amyg-
dala, ventral striatum, and prefrontal cortex. Neuro-
science & Biobehavioral Reviews, 26(3):321–352.
Carrere, M. and Alexandre, F. (2015). A pavlovian model of
the amygdala and its influence within the medial tem-
poral lobe. Frontiers in Systems Neuroscience, 9(41).
Carrere, M. and Alexandre, F. (2016a). A system-level
model of noradrenergic function. In The Sixth Joint
IEEE International Conference on Developmental
Learning and Epigenetic Robotics, September 19th-
22nd 2016, Cergy-Pontoise. IEEE.
Carrere, M. and Alexandre, F. (2016b). A system-level
model of noradrenergic function. In Villa, E. A., Ma-
sulli, P., and Pons Rivero, J. A., editors, Artificial Neu-
ral Networks and Machine Learning - ICANN 2016:
25th International Conference on Artificial Neural
Networks, Barcelona, Spain, September 6-9, 2016,
Proceedings, Part I, pages 214–221. Springer Inter-
national Publishing.
Clark, C. and Storkey, A. J. (2015). Training Deep Con-
volutional Neural Networks to Play Go. In Proceed-
ings of the 32nd International Conference on Machine
Learning, volume 37, pages 1766–1774.
Cohen, J. D., McClure, S. M., and Yu, A. J. (2007).
Should I stay or should I go? How the human brain
manages the trade-off between exploitation and ex-
ploration. Philosophical transactions of the Royal
Society of London. Series B, Biological sciences,
362(1481):933–942.
Craig, A. D. (2003). Interoception: the sense of the phys-
iological condition of the body. Current Opinion in
Neurobiology, 13(4):500–505.
Denoyelle, N., Pouget, F., Viéville, T., and Alexandre,
F. (2014). VirtualEnaction: A Platform for Sys-
temic Neuroscience Simulation. In International
Congress on Neurotechnology, Electronics and Infor-
matics, Rome, Italy.
Donoso, M., Collins, A. G., and Koechlin, E. (2014). Foun-
dations of human reasoning in the prefrontal cortex.
Science (New York, N.Y.), 344(6191):1481–1486.
Doya, K. (2002). Metalearning and neuromodulation. Neu-
ral Networks, 15(4-6):495–506.
Farabet, C., Couprie, C., Najman, L., and LeCun, Y.
(2013). Learning hierarchical features for scene la-
beling. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 35(8):1915–1929.
Hopfield, J. J. (1982). Neural networks and physical sys-
tems with emergent collective computational abilities.
In Proceedings of the National Academy of Sciences,
USA, pages 2554–2558.
Humphries, M. D., Khamassi, M., and Gurney, K. (2012).
Dopaminergic control of the exploration-exploitation
trade-off via the basal ganglia. Frontiers in Neuro-
science, 6(9).
Kassab, R. and Alexandre, F. (2015). Integration of extero-
ceptive and interoceptive information within the hip-
pocampus: a computational study. Frontiers in Sys-
tems Neuroscience, 9(87).
Kim, D., Paré, D., and Nair, S. S. (2013). Mechanisms
contributing to the induction and storage of Pavlovian
fear memories in the lateral amygdala. Learning &
memory (Cold Spring Harbor, N.Y.), 20(8):421–430.
Koechlin, E., Ody, C., and Kouneiher, F. (2003). The Archi-
tecture of Cognitive Control in the Human Prefrontal
Cortex. Science, 302(5648):1181–1185.
Kolling, N., Behrens, T. E. J., Wittmann, M. K., and Rush-
worth, M. F. S. (2016). Multiple signals in anterior
cingulate cortex. Current Opinion in Neurobiology,
37:36–43.
Krasne, F. B., Fanselow, M. S., and Zelikowsky, M. (2011).
Design of a Neurally Plausible Model of Fear Learn-
ing. Frontiers in Behavioral Neuroscience, 5:41+.
LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learn-
ing. Nature, 521(7553):436–444.
Mannella, F., Gurney, K., and Baldassarre, G. (2013). The
nucleus accumbens as a nexus between values and
goals in goal-directed behavior: a review and a new
hypothesis. Frontiers in behavioral neuroscience, 7.
McClelland, J. L., McNaughton, B. L., and O’Reilly, R. C.
(1995). Why there are complementary learning sys-
tems in the hippocampus and neocortex: insights
from the successes and failures of connectionist mod-
els of learning and memory. Psychological review,
102(3):419–457.
O’Reilly, R. C. and Rudy, J. W. (2001). Conjunctive Rep-
resentations in Learning and Memory: Principles of
Cortical and Hippocampal Function. Psychological
Review, 108(2):311–345.
Oudeyer, P.-Y., Kaplan, F., and Hafner, V. (2007). Intrinsic
motivation systems for autonomous mental develop-
ment. IEEE Transactions on Evolutionary Computa-
tion, 11(2):265–286.
Pauli, W. M. and O’Reilly, R. C. (2008). Attentional control
of associative learning–a possible role of the central
cholinergic system. Brain Research, 1202:43–53.
Piron, C., Kase, D., Topalidou, M., Goillandeau, M.,
Rougier, N. P., and Boraud, T. (2016). The globus pal-
lidus pars interna in goal-oriented and routine behav-
iors: Resolving a long-standing paradox. Movement
Disorders.
Rosenblatt, F. (1958). The perceptron: a probabilistic model
for information storage and organization in the brain.
In Anderson, J. A. and Rosenfeld, E., editors, Neu-
rocomputing: Foundations of Research (1989), pages
89–92. The MIT Press.
Squire, L. R. (1992). Declarative and nondeclarative mem-
ory: multiple brain systems supporting learning and
memory. J Cogn Neurosci, 4(3):232–243.
Yu, A. J. and Dayan, P. (2005). Uncertainty, Neuromodula-
tion, and Attention. Neuron, 46(4).
Zahm, D. S. (1999). Functional-anatomical implications
of the nucleus accumbens core and shell subterrito-
ries. Annals of the New York Academy of Sciences,
877:113–128.