Motion Direction Detection from Segmentation by
LIAC, and Tracking by Centroid Trajectory Calculation
Antonio Fernández-Caballero
Escuela Politécnica Superior de Albacete, Departamento de Informática
Universidad de Castilla-La Mancha, 02071 – Albacete, Spain
Abstract. Motion information can form the basis of predictions about time-to-
impact and the trajectories of objects moving through a scene. Firstly, a model
that incorporates accumulative computation and lateral interaction is presented.
By means of the lateral interaction in accumulative computation (LIAC) of
each element with its neighbours, the model is able to segment moving objects
present in an indefinite sequence of images. In a further step, moving objects
are tracked using a centroid-based trajectory calculation.
1 Motion Direction Detection
Motion plays an important role in our visual understanding of the surrounding envi-
ronment [1]. Visual motion can aid in the detection of shape [2], provide information
as to the relative depth of moving objects [3], and give clues about the material prop-
erties of moving objects, such as rigidity and transparency [4]. Motion informa-
tion can also form the basis of predictions about time-to-impact and the trajectories of
objects moving through a scene [5]. This paper introduces a novel method for motion
direction detection based on segmentation by lateral interaction in accumulative com-
putation (LIAC) and tracking by centroid trajectory calculation.
1.1 Segmentation from LIAC
The aim of the segmentation step is first to determine in which grey level stripe a given
element (x, y) falls. Let GL(x, y, t) denote the grey level of image pixel (x, y) at time t,
GLS(x, y, t) its grey level stripe, and n the total number of grey level stripes.
\[
GLS(x, y, t) = \left\lfloor \frac{GL(x, y, t)}{256 / n} \right\rfloor + 1, \qquad n \in [1, 256]
\tag{1}
\]
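As an illustration, the banding of Eq. (1) can be sketched in a few lines of Python; the function name, the use of NumPy and the default number of stripes are assumptions of this sketch, not part of the original model.

import numpy as np

def grey_level_stripes(frame, n=8):
    # Quantise an 8-bit grey level image into n grey level stripes (Eq. (1)).
    # Returns, for each pixel, the index (1..n) of the stripe it falls into.
    stripe_width = 256.0 / n                                    # width of each stripe
    gls = np.floor(frame.astype(np.float64) / stripe_width).astype(int) + 1
    return np.clip(gls, 1, n)                                   # safety clip at the borders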
Lateral interaction in accumulative computation is capable of modelling the motion
on the image, starting from the pixel grey level stripe and the element state or perma-
nence value. There are as many permanence values for a given element as grey level
stripes. At each time instant t, the permanence value is obtained in two steps: (1) a
charge or discharge due to motion detection, that is to say, due to a change in the
grey level stripe, and (2) a re-charge due to the lateral interaction on the partially
charged elements that are directly or indirectly connected to maximally charged
elements. The charge or discharge behaviour of the permanence memory is explained
next. (a) All permanence values not associated with grey level stripe k are completely
discharged down to the value v_dis. (b) If the pixel associated with the element falls in
grey level stripe k, there are two possibilities: (b.1) if the pixel was not in grey level
stripe k at time t-1, the permanence memory is completely charged up to the maximum
value v_sat; (b.2) if the pixel was already in grey level stripe k at time t-1, the
permanence memory is decremented by a value v_dm (discharge value due to motion
detection), down to a minimum of v_dis.
\[
PM_k(x, y, t) =
\begin{cases}
v_{dis}, & \text{if } GLS(x, y, t) \neq k \\
v_{sat}, & \text{if } GLS(x, y, t) = k \text{ and } GLS(x, y, t-1) \neq k \\
\max\bigl(PM_k(x, y, t-1) - v_{dm},\, v_{dis}\bigr), & \text{if } GLS(x, y, t) = k \text{ and } GLS(x, y, t-1) = k
\end{cases}
\tag{2}
\]
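A minimal Python sketch of this charge/discharge step is given below; the array-based formulation, the function name and the default charge values are illustrative assumptions rather than the paper's implementation.

import numpy as np

def update_permanence(pm_k, gls_now, gls_prev, k, v_dis=0, v_sat=255, v_dm=32):
    # One step of the permanence memory update for stripe k, following Eq. (2).
    # pm_k     : PM_k(x, y, t-1), permanence values for stripe k
    # gls_now  : GLS(x, y, t), current grey level stripe image
    # gls_prev : GLS(x, y, t-1), previous grey level stripe image
    pm_next = np.full_like(pm_k, v_dis)                        # (a) complete discharge elsewhere
    arrived = (gls_now == k) & (gls_prev != k)                 # (b.1) pixel just entered stripe k
    stayed = (gls_now == k) & (gls_prev == k)                  # (b.2) pixel remained in stripe k
    pm_next[arrived] = v_sat                                   # complete charge
    pm_next[stayed] = np.maximum(pm_k[stayed] - v_dm, v_dis)   # partial discharge
    return pm_next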
If an element is charged to the maximum, it informs its neighbours through the
channels prepared for this purpose. This is how a re-charge of the permanence value
due to lateral interaction by a value v_rv (charge value due to vicinity) can now be
performed. This functionality is biologically inspired and can be seen as an adaptive
absolute refractory period mechanism. Obviously, the permanence memory cannot be
charged over the maximum value v_sat. Note that this is how the system is able to
maintain attention on an element, just because it is connected to a maximally charged
element up to l pixels away, while false background motion is eliminated.
\[
PM_k(x, y, t) = \min\bigl(PM_k(x, y, t) + \varepsilon \cdot v_{rv},\; v_{sat}\bigr),
\tag{3}
\]
where
\[
\varepsilon =
\begin{cases}
1, & \text{if } \exists\, i \in [1, l],\; j \in [1, i] \text{ such that} \\
   & \quad \bigl(PM_k(x+i, y, t) = v_{sat} \text{ and } PM_k(x+j, y, t) > v_{dis}\bigr) \text{ or} \\
   & \quad \bigl(PM_k(x-i, y, t) = v_{sat} \text{ and } PM_k(x-j, y, t) > v_{dis}\bigr) \text{ or} \\
   & \quad \bigl(PM_k(x, y+i, t) = v_{sat} \text{ and } PM_k(x, y+j, t) > v_{dis}\bigr) \text{ or} \\
   & \quad \bigl(PM_k(x, y-i, t) = v_{sat} \text{ and } PM_k(x, y-j, t) > v_{dis}\bigr) \\
0, & \text{otherwise}
\end{cases}
\tag{4}
\]
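The re-charge by vicinity can be sketched as follows; the explicit scan over the four axis directions is a simplified reading of the connectivity condition in Eq. (4), and the function name and default parameters are assumptions of this sketch.

def recharge_by_vicinity(pm_k, v_dis=0, v_sat=255, v_rv=64, l=4):
    # Lateral-interaction re-charge of the permanence memory, in the spirit of Eqs. (3)-(4):
    # a partially charged element is topped up by v_rv when a maximally charged element
    # can be reached within l pixels along one of the four axis directions, through a
    # chain of charged elements.
    h, w = pm_k.shape
    out = pm_k.copy()
    for y in range(h):
        for x in range(w):
            if not (v_dis < pm_k[y, x] < v_sat):
                continue                                  # only partially charged elements
            eps = 0
            for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)):
                for i in range(1, l + 1):
                    yy, xx = y + i * dy, x + i * dx
                    if not (0 <= yy < h and 0 <= xx < w):
                        break
                    if pm_k[yy, xx] <= v_dis:
                        break                             # chain of charged elements broken
                    if pm_k[yy, xx] == v_sat:
                        eps = 1                           # saturated element reached
                        break
                if eps:
                    break
            if eps:
                out[y, x] = min(pm_k[y, x] + v_rv, v_sat)  # Eq. (3)
    return out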
1.2 Centroid Trajectory Calculation
The last step consists in obtaining the trajectory of the objects by spatio-temporally
calculating the centroid (X_obj, Y_obj) of the maximally charged pixels of the moving
objects. Fig. 1 graphically shows the calculation of the centroid of an object.
Therefore, the size is defined starting from the lengths of two straight lines (or
chords) determined by four well-known pixels on the surface of the object [6]. The
pixels referenced this way are (x_1, y_1), (x_2, y_2), (x_3, y_3) and (x_4, y_4), such that:
\[
\forall (x, y) \in S(i, j, t): \quad x_1 \le x \le x_2, \quad y_3 \le y \le y_4
\tag{5}
\]
In other words, the four pixels are:
(x_1, y_1): leftmost pixel of the object in the image
(x_2, y_2): rightmost pixel of the object in the image
(x_3, y_3): uppermost pixel of the object in the image
(x_4, y_4): lowermost pixel of the object in the image
The two chords, denominated maximum line segments of the object, do not join the
pixels (x_1, y_1) and (x_2, y_2), (x_3, y_3) and (x_4, y_4) to each other, but rather their
projections (X_1, 0) and (X_2, 0), (0, Y_3) and (0, Y_4), respectively, as can be
appreciated in Fig. 1.
Fig. 1. Centroid of an object, showing the projections X_1, X_2, Y_3, Y_4 and the representative pixel (X_obj, Y_obj)
Now, the object's location will be determined by a unique characteristic pixel
(X_obj, Y_obj), that is to say, the intersection of the two segments (X_1, Y_3)-(X_2, Y_4)
and (X_2, Y_3)-(X_1, Y_4). This centroid pixel will be denominated the representative
pixel of the object.
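Putting the previous definitions together, the maximum line segments and the representative pixel can be extracted from the maximally charged pixels of a permanence map; the function below is an illustrative sketch (its name and the NumPy-based formulation are assumptions).

import numpy as np

def representative_pixel(pm_k, v_sat=255):
    # Maximum line segments and representative pixel of an object (Fig. 1).
    # The object is taken to be the set of maximally charged pixels of PM_k.
    ys, xs = np.nonzero(pm_k == v_sat)        # maximally charged pixels
    if xs.size == 0:
        return None
    x1, x2 = xs.min(), xs.max()               # projections X1 (leftmost), X2 (rightmost)
    y3, y4 = ys.min(), ys.max()               # projections Y3 (uppermost), Y4 (lowermost)
    x_obj = (x1 + x2) / 2.0                   # intersection of (X1,Y3)-(X2,Y4) and (X2,Y3)-(X1,Y4)
    y_obj = (y3 + y4) / 2.0
    return (x1, x2), (y3, y4), (x_obj, y_obj)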
Once the maximum line segments and the representative pixel of an object have
been obtained in a sequence of images, it is rather simple to detect many motion
cases [7], [8]. Considering the following possibilities: no motion (N), translation in the
X or Y-axis (T), dilation, i.e. translation in the Z-axis (D), and rotation (R), we may
obtain, by combining them, the following states:
N: no motion
T: pure translation
TD: translation plus dilation
TR: translation plus rotation
D: pure dilation
DR: dilation plus rotation
TDR: translation plus dilation plus rotation
R: pure rotation
It is considered that the previous states cover most cases (Fig. 2). Fig. 2 shows the
different possibilities when no change is detected in the representative pixel's
co-ordinates; these states are enclosed in square brackets. When there is a significant
change in the co-ordinates of the representative pixel, a T has been added, enclosed in
parentheses. In this graph:
(1) Comparison between the horizontal maximum line segments of the previous (k-1)
and current (k) image
\[
\begin{cases}
Larger, & \text{if } (X_2 - X_1)_k - (X_2 - X_1)_{k-1} > l \\
Equal,  & \text{if } \bigl|(X_2 - X_1)_k - (X_2 - X_1)_{k-1}\bigr| \le l \\
Smaller,& \text{if } (X_2 - X_1)_k - (X_2 - X_1)_{k-1} < -l
\end{cases}
\tag{6}
\]
where l is the maximum permitted difference.
(2) Comparison between the vertical maximum line segments of the previous (k-1) and
current (k) image
\[
\begin{cases}
Larger, & \text{if } (Y_4 - Y_3)_k - (Y_4 - Y_3)_{k-1} > l \\
Equal,  & \text{if } \bigl|(Y_4 - Y_3)_k - (Y_4 - Y_3)_{k-1}\bigr| \le l \\
Smaller,& \text{if } (Y_4 - Y_3)_k - (Y_4 - Y_3)_{k-1} < -l
\end{cases}
\tag{7}
\]
where l is the maximum permitted difference.
(3) Similitude degree between the scale changes of the maximum segments of the
images at k-1 and k
\[
\begin{cases}
Similar, & \text{if } 1 - \alpha \;\le\; \dfrac{(X_2 - X_1)_k \,/\, (X_2 - X_1)_{k-1}}{(Y_4 - Y_3)_k \,/\, (Y_4 - Y_3)_{k-1}} \;\le\; 1 + \alpha \\
Different, & \text{otherwise}
\end{cases}
\tag{8}
\]
where α is the permitted fluctuation in the similitude function.
(4) Result state if the representative pixel of the object has not changed substantially;
a non-substantial change is detected by means of the following condition
\[
\bigl|X_{obj,k} - X_{obj,k-1}\bigr| \le d \quad \text{and} \quad \bigl|Y_{obj,k} - Y_{obj,k-1}\bigr| \le d
\tag{9}
\]
where d is the maximum permitted displacement.
(5) Result state if the representative pixel of the object has changed substantially.
Of course, the possibility of offering some erroneous results with an unknown error
rate is assumed, especially for some rotation examples. Nevertheless, if the number of
images in a sequence is large enough, this error rate should be very small. A sketch of
the complete decision procedure is given after Fig. 2.
(1)      (2)      (3)        (4)   (5)
Larger   Larger   Similar    [D]   (TD)
Larger   Larger   Different  [DR]  (TDR)
Larger   Equal               [R]   (TR)
Larger   Smaller             [R]   (TR)
Equal    Larger              [R]   (TR)
Equal    Equal               [N]   (T)
Equal    Smaller             [R]   (TR)
Smaller  Larger              [R]   (TR)
Smaller  Equal               [R]   (TR)
Smaller  Smaller  Similar    [D]   (TD)
Smaller  Smaller  Different  [DR]  (TDR)
Fig. 2. Motion states graph
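The comparisons (6)-(9) and the graph of Fig. 2 can be condensed into a single decision routine; the sketch below assumes the segment and centroid representation returned by the centroid calculation above, and its thresholds are illustrative defaults.

def classify_motion(seg_prev, seg_now, centroid_prev, centroid_now,
                    l=2, alpha=0.2, d=2):
    # Motion state of an object between images k-1 and k, following Eqs. (6)-(9) and Fig. 2.
    # seg_prev, seg_now           : ((x1, x2), (y3, y4)) maximum line segments
    # centroid_prev, centroid_now : (X_obj, Y_obj) representative pixels
    # Assumes non-degenerate segments (x1 < x2, y3 < y4).
    (x1p, x2p), (y3p, y4p) = seg_prev
    (x1c, x2c), (y3c, y4c) = seg_now

    def compare(prev_len, cur_len):                       # Eqs. (6) and (7)
        diff = cur_len - prev_len
        if diff > l:
            return "Larger"
        if diff < -l:
            return "Smaller"
        return "Equal"

    horiz = compare(x2p - x1p, x2c - x1c)                 # horizontal maximum line segment
    vert = compare(y4p - y3p, y4c - y3c)                  # vertical maximum line segment

    hx = (x2c - x1c) / float(x2p - x1p)                   # horizontal scale change
    vy = (y4c - y3c) / float(y4p - y3p)                   # vertical scale change
    similar = (1.0 - alpha) <= hx / vy <= (1.0 + alpha)   # Eq. (8)

    moved = (abs(centroid_now[0] - centroid_prev[0]) > d or
             abs(centroid_now[1] - centroid_prev[1]) > d)  # Eq. (9) violated: substantial change

    # Decision graph of Fig. 2: bracketed state, with T added when the pixel has moved.
    if horiz == "Equal" and vert == "Equal":
        state = "N"
    elif horiz == vert:                                    # both Larger or both Smaller
        state = "D" if similar else "DR"
    else:
        state = "R"
    if moved:
        state = "T" if state == "N" else "T" + state
    return state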
2 Some Illustrative Examples
The algorithms exposed previously have been applied to a multitude of synthetic
sequences, as shown in Fig. 3.
In example 1, a pure translation along one of the axes, in particular the y-axis, is
shown. The algorithms work perfectly in this easy case (the output is always D), even
given the pointed shape of the treated object. Pure translations along the three axes
x, y, z have all been tested with this same and other synthetic objects, and they have
offered the same good behaviour. Evidently, it was expected that this simple case
would work that well.
The second example is representative of more complex translation movements.
Here there are simultaneous translations along several axes. All possible translation
combinations have been tested, obtaining for all the analysed objects an excellent
behaviour of the algorithms. This concrete example shows the translation of an
irregular form along the three axes simultaneously. That is the reason why the correct
result TD appears in all twenty steps of the approached synthetic sequence.
As was easy to foresee, the problems begin when incorporating rotational
movements. Example 3 is a sample of this. Indeed, we face the case of a cube
approaching along the z-axis and rotating simultaneously. Notice that the algorithm
does not produce the desired result DR, but a simple D, in the simulations. The
explanation has to be sought in the shape of the object: the algorithm works better for
rotations the more irregular the shape of the analysed object is. Unfortunately, the
horizontal and vertical segments always have the same value for this figure.
Example 4 analyses a motion similar to that of example 3. Here, nevertheless, we face
an irregular shape, so a better behaviour of the algorithms exposed in this work was
expected. And, indeed, good results are obtained from image 9 of the sequence
onwards. The explanation of why the first images do not produce the desired result
lies in the value chosen for the allowed fluctuation (α = 0.2) in the similitude function
of the motion evaluation graph. A smaller value improves the previous results.
Fig. 3. (a) Thumbtack sequence frames 1 and 20. (b) Irregular form sequence frames 1 and 20.
(c) Cube sequence frames 1 and 20. (d) Lamp sequence frames 1 and 20
3 Conclusions
In this paper, the LIAC model to spatio-temporally segment a moving object present
in a sequence of images has been introduced. In the first place, this method takes advan-
tage of the inherent motion present in image sequences. This object segmentation
method may be compared to background subtraction or frame difference algorithms
in the way motion is detected. Then, a region growing technique is performed to
define the moving object. In contrast to similar approaches no complex image pre-
processing must be performed and no reference image must be offered to this model.
The method facilitates any higher-level operation by taking advantage of the common
charge value of parts of the moving object.
That is the reason why it is easy to introduce a simple but effective method for object
tracking. To some extent, the line of research on tracking using interframe matching
and affine transformations has been followed. Similarly to [9], the method depends on
the assumption that the image structure constrains the motion estimation sufficiently
for it to be reliable. Firstly, the detection of an important parameter of an object in
movement (its size) has been presented in this context. The algorithm is based on
centroid tracking [10]. Lastly, tracking is performed by comparing the results obtained
in the previous stage against a general graph of motion cases. Compared to other
approaches based on geometric properties, the method proposed assumes that the
images in the sequence undergo only a small transformation between them. Small
changes over small regions are also assumed. In this approach, the number of tracking
features is kept to a minimum. This permits controlling one of the most important
issues in visual systems: time.
Acknowledgements
This work is supported in part by the Spanish CICYT TIN2004-07661-C02-02 grant.
References
1. D. Hogg, “Model-based vision: A program to see a walking person”, Image and Vision
Computing, vol. 1, no. 1, pp. 5-20, 1983.
2. J.L. Barron, D.J. Fleet, S.S. Beauchemin, “Performance of optical flow techniques”, Inter-
national Journal of Computer Vision, vol. 12, no. 1, pp. 43-77, 1994.
3. R. Jain, W.N. Martin, J.K. Aggarwal, “Segmentation through the detection of changes due
to motion”, Computer Graphics and Image Processing, vol. 11, pp. 13-34, 1979.
4. M.A. Fernández, J. Mira, “Permanence memory: A system for real time motion analysis in
image sequences”, in IAPR Workshop on Machine Vision Applications, MVA’92, 1992,
pp. 249-252.
5. T.S. Huang, A.N. Netravali, “Motion and structure from feature correspondences: A re-
view”, Proceedings of the IEEE, 82, pp. 252-269, 1994.
6. R. Deriche, O.D. Faugeras, “Tracking line segments”, Image and Vision Computing, vol. 8,
no. 4, pp. 261-270, 1990.
7. A. Fernández-Caballero, J. Mira, A.E. Delgado, M.A. Fernández, “Lateral interaction in
accumulative computation: A model for motion detection”, Neurocomputing, vol. 50C, pp.
341-364, 2003.
8. A. Fernández-Caballero, J. Mira, M.A. Fernández, A.E. Delgado, “On motion detection
through a multi-layer neural network architecture”, Neural Networks, vol. 16, no. 2, pp.
205-222, 2003.
9. M. Gelgon, P. Bouthemy, “A region-level motion-based graph representation and labeling
for tracking a spatial image partition”, Pattern Recognition, vol. 33, no. 4, pp. 725-740, 2000.
10. G. Liu, R.M. Haralick, “Using centroid covariance in target recognition”, in Proceedings of
the International Conference on Pattern Recognition, ICPR’98, pp. 1343-1346, 1998.