Authors:
Asimina Mertzani and Jeremy Pitt
Affiliation:
Electrical and Electronic Engineering Dept., Imperial College London, London, U.K.
Keyword(s):
Cybernetics, Self-Regulated Systems, Requisite Influence, Reinforcement Learning, Social Psychology.
Abstract:
This paper specifies, implements and experiments with a new psychologically inspired 4voices algorithm to be used by the units of a self-regulated system, whereby each unit learns to identify which of several “voices” to pay attention to, depending on a collective desired outcome (e.g., establishing the ground truth, a community truth, or their own “truth”). In addition, a regulator uses a standard Q-learning algorithm to pay attention to the regulated units and respond accordingly. The algorithm is applied to a problem of continuous policy-based monitoring and control, and simulation experiments determine which initial conditions produce systemic stability and what kind of “truth” is expressed by the regulated units. We conclude that this synthesis of Q-learning in the regulator and 4voices in the regulated system establishes requisite social influence. This maintains quasi-stability (i.e., periodic stability) and points the way towards ethical regulators.
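For readers unfamiliar with the "standard Q-learning algorithm" mentioned for the regulator, the following is a minimal sketch of the textbook tabular update rule. It is not the authors' implementation: the state names ("unstable", "stable"), the action names ("tighten", "relax"), the reward value, and the parameters alpha and gamma are all hypothetical, chosen only for illustration. The 4voices algorithm is not sketched here, because the abstract does not specify it.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one textbook tabular Q-learning update and return the new Q-value.

    q       : mapping from (state, action) pairs to estimated values
    alpha   : learning rate (assumed value, not taken from the paper)
    gamma   : discount factor (assumed value, not taken from the paper)
    """
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha towards the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical regulator example: all Q-values start at 0.0.
q = defaultdict(float)
actions = ["tighten", "relax"]
value = q_update(q, "unstable", "tighten", 1.0, "stable", actions)
```

With every estimate initialised to zero, this single update moves the value of ("unstable", "tighten") one tenth of the way towards the observed reward of 1.0.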