COORDINATION OF A PROTOTYPED MANIPULATOR BASED

ON AN EXPERIMENTAL VISUO-MOTOR MODEL

Renato de Sousa Dâmaso, Mário Sarcinelli Filho, Teodiano Freire Bastos Filho

Department of electrical engineering, Federal University of Espírito Santo, Av. Fernando Ferrari, 514, Vitória, Brazil

Tarcisio Passos Ribeiro de Campos

Department of nuclear engineering, Federal University of Minas Gerais, Av. Antônio Carlos, 6627, Belo Horizonte, Brazil

Keywords: Visuo-motor coordination, Control of manipulators, Object grasping, Uncalibrated binocular vision.

Abstract: This paper presents a strategy to build an experimental visuo-motor model for a manipulator coupled to a

binocular vision system, which discards any previous algebraic model and any calibration of either the

manipulator or the vision system. The space spanned by a set of selected image features is divided in

regions, and the estimated visuo-motor model is represented by a matrix of constant elements associated to

each one of such regions. Such matrices are obtained in an incremental way, starting from commands of

movement and using the measurements of the variations they cause in the set of image features. Even when

partially filled in, the visuo-motor model can be used for coordinating the manipulator in order to get its

end-effector closer to an object and to grasp it. Preliminary results got from the implementation of the

proposed strategy in a prototyped manipulator coupled to a binocular vision system are also presented.

1 INTRODUCTION

Typical methods for the visual servo-control of

manipulators to grasp objects use analytical models

previously established, thus demanding the

knowledge of the geometry, of the mechanical and

of the optical system parameters, as it is exemplified

in Dâmaso et al. (2003). Transformations from the

articulate space of the manipulator to the global

inertial space, from this to the coordination system

associated to the cameras, and, finally, from the last

to the coordination systems of the image planes are

accomplished. These procedures result in a nonlinear

matrix transformation, called Jacobian matrix

(Hutchinson et al., 1996). However, in many real

situations, this model may be too difficult to obtain.

In situations not requiring critical system

performance, like to grasp static objects or objects

moving slowly, it becomes interesting to estimate

models starting from sending movement commands

to the joints of the manipulator and measuring the

variations of certain visual cues (Hollinghurst and

Cipolla, 1994; Graefe, 1995; Xie et al., 1997). A

control system with these features would have the

capability of learning through its own experiences,

becoming able to approach a near object and to

grasp it. It is also desirable that the estimation of a

visuo-motor model can be completed as quickly as

possible (what means with just a few motion

examples) in a non-supervised way. Other desirable

characteristics such a system should exhibit are the

capability to “remember” what was learnt in

previous experiments, like it is recommended in

Graefe (1995), and the capability to "re-learn", so

that it can adapt the model to eventual changes.

In previous works, Hollinghurst and Cipolla (1994)

applied an affine stereo formulation to estimate a

matrix relating, qualitatively, the articulation’s

positions of the manipulator to a fixed point position

in its claw onto both images. Such a matrix was used

in an object grasping task, matching position and

orientation. Hosoda and Asada (1997) proposed an

adaptive control strategy based on the on-line

estimation of the Jacobian matrix, with no a priori

knowledge of the kinematics and the parameters of

the camera-manipulator system as well. This

estimation was iteratively performed, and it was

assumed that the coefficients did not converge to the

true values of the Jacobian matrix. However, the

estimation was precise enough, in addition to the

closed loop control, to guide the manipulator when

304

de Sousa Dâmaso R., Sarcinelli Filho M., Freire Bastos Filho T. and Passos Ribeiro de Campos T. (2005).

COORDINATION OF A PROTOTYPED MANIPULATOR BASED ON AN EXPERIMENTAL VISUO-MOTOR MODEL.

In Proceedings of the Second International Conference on Informatics in Control, Automation and Robotics - Robotics and Automation, pages 304-309

DOI: 10.5220/0001191303040309

Copyright

c

SciTePress

following previously defined paths. By their turn,

Xie, Graefe and Vollmann (1997) apply a procedure

based on attempting and error, which is described in

(Graefe, 1995), to guide the end-effector to an

object. After each attempting, only the articulated

positions corresponding to the position of the object

in the image planes were stored. After several

attempts, the stored information was used to obtain

the articulate coordinates corresponding to a non-

recorded position of the object, through the

interpolation of neighbouring positions.

In this paper, a direct process to estimate the matrix

relating the motor actions to the corresponding

variations in a set of image features is investigated.

By analogy with the visual servoing procedure, the

estimated visuo-motor matrices (Ĥ) is associated to

the function carried out by the Jacobian matrix, or

imend

qH

.

ˆ

&

&

=

ξ

, (1)

where

⎥

⎥

⎥

⎦

⎤

⎢

⎢

⎢

⎣

⎡

=

333231

232221

131211

ˆ

hhh

hhh

hhh

H

, (2)

end

ξ

&

is a vector containing the variations in the

image features of the end-effector,

im

q

&

is a vector

containing the velocities of the motors and h

ij

are the

coefficients of Ĥ (i and j vary from 1 to 3,

hereinafter).

As it will become clearer ahead, this matrix

constitutes a linear approach for the visuo-motor

correlation, which is a nonlinear one. Thus, the

matrix Ĥ only represents an "acceptable" approach

in an area close to the one in which the coefficients

have been estimated. To overcome this limitation,

the space spanned by the set of selected image

features is divided in areas, and a data structure is

created to store the matrix Ĥ relative to each area

(coefficients of Ĥ locally distributed).

2 EXPERIMENTAL PLATFORM

The system used for testing the strategy under

analysis, based on an incremental visuo-motor

model, is shown in Figure 1. A sketch of the

kinematical chain of the manipulator prototype is

also shown. The vision system, also represented in

Figure 1, is composed of a pair of cameras attached

to the manipulator base. The positive direction

adopted for the movement of each joint is also

indicated in Figure 1. The corresponding motors are

driven to one of the eight speed levels indicated in

Table 1, in degrees per second.

Table 1: The set of values of the speed levels for each

motor in the manipulator

Speeds

1m

q

&

[°/s]

52

,,

mm

qq

&&

⋅⋅⋅

[°/s]

level 0 0 0

level 1 9,0 36,9

level 2 13,8 54,9

level 3 18,5 73,4

level 4 24,3 98,1

level 5 31,2 126,0

level 6 43,7 176,4

level 7 54,8 220,5

Both cameras are analogues, and are oriented such

that their optical axes are close to parallel. Two

frame grabbers are installed in a computer, called

control computer, to acquire the images delivered by

the cameras. The image-processing algorithms

allowing extracting the characteristics of interest,

which are executed in the control computer,

complete the experimental setup employed.

A small dark parallelepiped (26 mm x 24 mm x 8

mm) is used as object (considered as a punctual

object. The coordinates of the centroid of such

object in the image planes are measured by the

vision system, whose sampling period is 100 ms.

3 THE PROPOSED STRATEGY

The procedure presented in the sequence is based on

visual information, and corresponds to two modes: i)

Perception, involving the estimation of the

coefficients that relate the space of motion

commands to the space spanned by the selected

image features; and ii) Action (or Coordination),

including the transformation of the visuo-motor

information estimated in the previous step to signals

effectively acting on the system to coordinate the

Figure 1: A view of the manipulator prototype and the

two cameras of the vision system, and its structure

COORDINATION OF A PROTOTYPED MANIPULATOR BASED ON AN EXPERIMENTAL VISUO-MOTOR

MODEL

305

articulated structure of the manipulator for

accomplishing the task of interest.

Figure 2 shows a pair of binocular images, on which

the axes u and v of the coordinate systems associated

to the left and right image planes are indicated. The

image features measured by the vision system

(Damaso et al., 2004) are also pointed out. To

control the first three DOF of the manipulator (to

control the position of its end-effector related to the

object), the following variables were selected

T

rigendleftendleftendend

uvu ] , ,[

___

=

ξ

(3)

T

rigobjleftobjleftobjobj

uvu ] , ,[

___

=

ξ

. (4)

The position of the robot end-effector is defined by

the coordinates of a hypothetical point, marked as +

in the image planes. A signature was used to allow

finding such hypothetical point, as depicted in

Figure 2. It is a black rectangle with two white balls,

and is fixed to the robot end-effector (Dâmaso et al.,

2003). The abscissa of the hypothetical point is

equal to the abscissa of the centroid of the white ball

having the smallest ordinate. The ordinate of the

hypothetical point is equal to the ordinate of such

centroid less four times the Euclidian distance

among the centroids of both white balls.

The possible values of the

ξ

end

components,

generically given by the dimensions of the image

planes (640 by 480 pixels), define a three-

dimensional space of characteristics (domain). The

division of this domain in areas or cells is proposed,

as represented in the image planes shown in Figure

2. The same procedure is applied to

ξ

obj

, which

conveys information about the object, in relation to

the base movements, with the difference that the

intervals in u

left

and u

rig

were chosen as being twice

bigger (see in Figure 3). The smaller number of cells

associated to the variations of the base articulation

(J1) can be justified by the fact that the movements

of such articulation do not produce significant

variations in the object depth. Thus, the values of

ξ

end

and

ξ

obj

address the cells in which the end-

effector and the object are placed, respectively. In

each cell, the estimated coefficients are constant.

Starting from the selected image features, the

variables (

∆

error_u,

∆

v_end_left,

∆

disp_end) were

defined, as follows,

2/)( _

____ rigendrigobjleftendleftobj

uuuuuerror −

+

−

=

, (5)

leftend

vleftendv

_

__

=

, (6)

rigendleftend

uuenddisp

__

_ −=

. (7)

Such variables represent the variations generated

during a fixed time interval. They are suitable to

express the movement of the end-effector or the

object in the image planes and in depth, once u

end_left

and u

end_rig

present very close variations for the

camera configuration (parallel optical axes).

Then, Table 2 is generated, which is used to estimate

the coefficients h

ij

of Ĥ, regarding the time interval

∆

t during which the movement is performed.

Table 2: Coefficients of the transformation matrix

∆

error_u

∆

v_end_left

∆

disp_end

tq

m

∆

⋅

3

&

h

11

h

21

h

31

tq

m

∆

⋅

2

&

h

12

h

22

h

32

tq

m

∆

⋅

1

&

h

13

h

23

h

33

3.1 Visuo-Motor Model Estimation

After being moved to the initial position, similar to

the position illustrated in Figure 2, the manipulator

is commanded to move the joint J2 (shoulder). The

Figure 2: Example of a pair of images of the binocular arrangement, showing the image coordinate systems (in pixels), the

selected image features and the splitting of the space spanned by the image features for the end-effector

ICINCO 2005 - ROBOTICS AND AUTOMATION

306

image features at the beginning and at the end of the

movement are measured and the coefficients

⎟

⎟

⎠

⎞

⎜

⎜

⎝

⎛

∆⋅

∆

=

∆⋅

∆

=

∆⋅

∆

=

tq

enddisp

h

tq

leftendv

h

tq

uerror

h

mmm 2

32

2

22

2

12

_

,

__

,

_

&&&

(8)

are evaluated. The joint J2 is then pulled back to its

initial position, procedure that is repeated for J3 and

J1. Regarding the movement of the joint J1, as the

cameras are fixed to the manipulator base, it does

not result in any change in the position of the end-

effector in the image planes. Thus, it is necessary

that the object stays stopped in the space during an

experiment on the "Perception" mode, serving as a

reference to the base movements (landmark). It is

used the correspondence that the end-effector

movement, in this case, is equal to the inverse of the

object movement in the image planes, for the

calculation of the coefficients h

13

, h

23

and h

33

. It

should be observed that the position of the object is

not learnt, and the object can (actually it should) be

put in different positions on the running of various

experiments.

At the end of this initial training step, all the

coefficients of Ĥ would be estimated for the initial

positions of the manipulator and the object, thus

allowing knowing how to move the manipulator in a

rough way. Continuing in the "Perception" step, the

end-effector is commanded to get progressively

closer to the object, while allowing moving just one

of the first three joints in each iteration, in an

alternate way. For doing that, the control system

verifies if the coefficients h

11

, h

21

and h

31

of the

transformation matrix corresponding to the

addressed cell were not estimated, sending an

actuation command to the joint J3. Case this

estimation has been already performed, it verifies if

the parameters h

12

, h

22

and h

32

are missing, and sends

an actuation command to joint J2 if affirmative.

Finally, if the coefficients of both lines have already

been estimated, it alternates the commands of J3 and

J2. The articulation J1 is commanded between the

command of J3 and J2, since the movement of either

J3 or J2 does not change the position of the object.

Thus, new columns of coefficients of Ĥ are

generated and stored in the corresponding cells. If

there is a previous value for the estimated

coefficient, they are averaged with the new values,

before being updated. At the end of each

experiment, the columns of coefficients of the

transformation matrix in the data structures are

copied to files, thus allowing saving values which

are loaded to the data structures in the beginning of

another experiment, giving to the system the ability

of memorizing any estimated model.

3.2 Visuo-Motor Coordination

The values stored in the two data structures of the

incremental visuo-motor model can be used to

coordinate the manipulator. At the "coordination"

step the first three articulations of the manipulator

can be commanded simultaneously, together with J4,

which is moved in order to keep the claw

approaching the horizontal. In this step, however,

there will be no estimation of the coefficients of Ĥ.

The manipulator is initially commanded to its initial

position. The image features are measured, and it is

verified if all the coefficients of the cell addressed

by the end-effector ((h

11

, h

21

, h

31

) and (h

12

, h

22

, h

32

))

and the coefficients of the cell filled by the object

(h

13

, h

23

, h

33

), were estimated. If this has not

happened, the coefficients associated to the last

totally filled area that the manipulator and the object

passed by are adopted. This is a solution that

degrades the performance of the estimated model,

but it is expected to be very seldom when a great

number of experiments is run at the "Perception"

mode for various positions of the object placed on

the space of interest.

The desirable variations in the image features are

obtained starting from the expressions of

proportional control action. This means that

uerrorKuerror __

1

⋅

=

∆

, (10)

leftverrorKleftendv ____

2

⋅=∆

, (11)

disperrorKenddisp __

3

⋅

=

∆

, (12)

with

leftendleftobj

vvleftverror

__

__

−

=

, (13)

enddispobjdispdisperror _ _ _ −=

, (14)

rigobjendobj

uuobjdisp

__

_

−

=

. (15)

The proportional gains were experimentally

adjusted, resulting in K

1

= 0.20, K

2

= 0.22 and K

3

=

0.22. Then, the speeds of the motors corresponding

to the articulations are calculated through the

solution of the linear equations (

∆

t is 1 s)

⎪

⎩

⎪

⎨

⎧

∆=⋅+⋅+⋅

∆=⋅+⋅+⋅

∆=⋅+⋅+⋅

enddispqhqhqh

leftendvqhqhqh

uerrorqhqhqh

mmm

mmm

mmm

_

__

_

133232331

123222321

113212311

&&&

&&&

&&&

. (16)

Finally, each evaluated speed is match to one of the

speed levels shown in Table 1. The control system

recalculates the reference speeds in an interval of 0.5

s, until the characteristic errors are smaller than the

following thresholds (in pixels): (|error_u| = 10, -20

COORDINATION OF A PROTOTYPED MANIPULATOR BASED ON AN EXPERIMENTAL VISUO-MOTOR

MODEL

307

< error_v_left < 0, |error_disp| = 8). Thus, the end-

effector gets close to the object, and the grasping

step starts, with the end-effector moving towards the

object at the same time its claw starts closing.

4 EXPERIMENTAL RESULTS

Aiming at an initial evaluation of the proposed

strategy, it was programmed in the experimental

platform. How the image characteristics associated

to the end-effector and to the object vary in the

image planes, for an experiment for estimating the

visuo-motor model (“Perception” mode), is shown in

Figure 3. Figure 4, by its turn, presents these

movements for an experiment in the “Action” mode.

The initial positions of the interest pictures were

denoted by circles. The displacements of the end-

effector and of the object take place in an alternate

way for the "Perception" mode (Figure 3), and

simultaneously for the "Action" mode (Figure 4).

For the presented coordination experiment, the

curves showing how the characteristic errors very

are shown in Figure 5. The curves representing the

calculated speeds, by their turn, are shown in Figure

6, with the curves of the approximated speeds. It can

be observed, especially in the beginning of this

experiment, that the depth of the end-effector is

modified by the movements of the joints J2 and J3,

making the disparity error to increase. This variation

is corrected along the experiment.

5 CONCLUSIONS AND FUTURE

WORK

In this paper a strategy to incrementally build a

visuo-motor model for a manipulator with an

uncalibrated binocular vision system is proposed.

The main points of this proposition are the form the

coefficients of the visuo-motor transformation

matrix (Ĥ) are estimated and the segmentation of the

space spanned by a group of image features in

smaller areas. As a consequence of both such a

partition and the fact that the cameras are fixed to

the manipulator base, it resulted a data structure

related to the end-effector, intended to store the

coefficients ((h

11

, h

21

, h

31

) and (h

12

, h

22

, h

32

)) of Ĥ for

each cell, and other data structure, addressed by the

object position, to store (h

13

, h

23

, h

33

). In the

"Perception" mode, just an articulation is

commanded each time, and the estimated

coefficients are used to continuously update the

stored values. The two data bases are stored in files,

so that they can be used from an experiment to other.

In the "coordination" mode, by its turn, the

articulations are moved simultaneously, using the

visuo-motor model previously obtained.

The results so far obtained show that it is indeed

possible to coordinate the motion of the manipulator

joints, using such approach, in order to get closer to

an object and to grasp it.

Figure 3: The displacement of the end-effector and the object for an experiment in the "Perception" mode, and the

splitting of the spaces spanned by the image features associated to the end-effector and to the object

ICINCO 2005 - ROBOTICS AND AUTOMATION

308

REFERENCES

Dâmaso, R. S., Campos, T. P. R., Bastos-Filho, T. F. and

Sarcinelli-Filho, M., 2004. Coordenação Visuo-

Motora de um Manipulador Experimental: Uma

Abordagem Reativa, In XV Congresso Brasileiro de

Automatica, Gramado, Brazil, in CD (in Portuguese).

Dâmaso, R. S., Carelli, R., Bastos-Filho, T. F. and

Sarcinelli-Filho, M., 2003. Controle Servo Visual de

um Manipulador Industrial Auxiliado por Visão

Binocular. In VI Simpósio Brasileiro de Automação

Inteligente, Bauru, Brazil, pp.799-803 (in Portuguese).

Graefe, V., 1995. Object- and Behavior-oriented Stereo

Vision for Robust and Adaptive Robot Control,

International Symposium on Microsystems, Intelligent

Materials, and Robot, Sendai, Japan, pp. 560-563.

Hollinghurst, N., Cipolla, R., 1994. Uncalibrated Stereo

Hand-Eye Coordination, Image and Vision

Computing, vol.12(3), pp. 187-192.

Hosoda, K., Asada, M., 1997. Adaptive Visual Servoing

for Various Kinds of Robot Systems, V International

Symposium on Experimental Robotics (V ISER),

Barcelona, Spain, pp. 451-462.

Hutchinson, S., Hager, G., Corke, P., 1996. A Tutorial on

Visual Servo Control, IEEE Transactions on Robotics

and Automation, vol.12, pp. 651-670.

Xie, Q., Graefe, V., Vollmann, K., 1997. Using a

Knowledge Base in Manipulator Control by

Calibration-Free Stereo Vision, IEEE International

Conference on Intelligent Processing Systems, China.

Figure 6: Calculated (thin lines) and approximate speeds

(thick lines) for an example in “Coordination” mode

Figure 4: The displacements of the end-effector and the object for an experiment in the "Coordination" mode

Figure 5: Error evolution for an example i

n

“Coordination” mode

COORDINATION OF A PROTOTYPED MANIPULATOR BASED ON AN EXPERIMENTAL VISUO-MOTOR

MODEL

309