BIOLOGICALLY INSPIRED ATTENTIVE MOTION ANALYSIS FOR
VIDEO SURVEILLANCE
Florian Raudies and Heiko Neumann
Institute of Neural Information Processing, University of Ulm, 89069 Ulm, Germany
Keywords:
Attention, Video Surveillance, Motion Streaks, Image Flow, Recurrent Grouping.
Abstract:
Recently proposed algorithms in the field of vision-based video surveillance are built upon directionally consistent flow (Wixson and Hansen, 1999; Tian and Hampapur, 2005) or upon statistics of foreground and background (Ren et al., 2003; Zhang et al., 2007). Here, we present a novel approach which utilizes an attention mechanism to focus processing on (highly) suspicious image regions. The attention signal is generated through temporal integration of localized image features from monocular image sequences. This approach incorporates biologically inspired mechanisms for feature extraction and spatio-temporal grouping. We compare our approach with an existing method for the task of video surveillance (Tian and Hampapur, 2005) using a receiver operating characteristic (ROC) analysis. In conclusion, our model is shown to yield results comparable with existing approaches.
1 INTRODUCTION
Video surveillance is a recent field of research ad-
dressing the tasks of detection, localization, recog-
nition, and tracking of specific objects. Existing approaches are based upon the following assumptions to detect attentional regions in spatio-temporal image sequences: (i) directionally consistent image flow is used to separate coherent object movement from spatio-temporal fluctuations of scene events (Wixson and Hansen, 1999; Tian and Hampapur, 2005), or (ii) temporally non-deformable image features are matched between subsequent frames (Zhou and Aggarwal, 2001). In contrast, our model utilizes moving features which can be further updated over time as evidence for the presence of the spatio-temporal structure or event accumulates.
Several mechanisms in our model are motivated
by neurophysiological evidence. At first, simple fea-
tures are extracted, according to the early visual pro-
cessing in area V1 (Hubel and Wiesel, 1968). These
features are temporally differentiated to extract on-
and offset of temporal changes in the image struc-
ture (Marr and Ullman, 1981). The result of feature
extraction and differentiation are then integrated and
temporally smoothed which in turn leads to a lower
temporal signal resolution. This result is referred to
as streak image, representing traces from temporal
changing features. For the extraction of activity for
a specified orientation of those traces, we employ Ga-
bor filters (Daugman, 1988). Grouping of these activ-
ities is realized by long-range interaction filters uti-
lizing a biologically inspired scheme (Neumann and
Sepp, 1999). Motion information is processed along
a pathway parallel to the form features. This division
of segregated form and motion processing is reminis-
cent of the ventral and the dorsal stream in cortical
visual processing in primates.
The main contributions of our model architecture
are: (i) the formulation of a biologically inspired
model for the task of video surveillance, (ii) the con-
struction of a motion streak image, integrating only
salient temporally changing features, (iii) a general
grouping mechanism, and (iv) feature binding at vari-
ous stages within the model. To the best of our knowl-
edge, this is the first model which incorporates motion
streaks as an attentional signal and furthermore con-
structs salient streak representations, which contain
no temporally static features. Additionally, the com-
bination of three attentive signals is new in the field of
video surveillance. Processing in two parallel path-
ways is motivated by the representation of changes
on two temporal scales: motion streaks integrate over several time steps, while image flow contains information mostly from two temporally subsequent snapshots. These two temporal scales combine
long-term motion with current motion in an appropriate way to identify attentional regions, which is one task of surveillance.
Figure 1: Schematic overview of the model. Details about
the functionality of all parts are reported in Sec. 3.
2 OVERVIEW OF THE MODEL
This model computes an attention signal from an image sequence $I_t, \ldots, I_{t-n+1}$ of $n$ frames. The streak image is the result of a temporal integration of all frames
while the image flow is estimated on the basis of the
two most recent frames. These two mechanisms form
four signal channels: (i) the intensity of the motion
streaks, (ii) the specific texture of the motion streak
image, (iii) the direction of image flow, and (iv) the
magnitude of image flow. At the very beginning an
activity coding is constructed, for orientations in the
texture channel and for directions of the image flow.
Subsequently, a grouping mechanism is employed to
further process the activity code of the texture chan-
nel. This mechanism drives aligned structures toward a common orientation. A combination of both
activity codes is realized by a multiplication. The re-
sult contains information about the direction of fea-
ture displacements, which is then transformed into a
coherence signal. Finally, a multiplication of this coherence signal with the streak intensity signal and the motion magnitude signal yields the final
attention signal. Fig. 1 gives a schematic overview of the processing pathways of the signal channels.
3 SPECIFIC MECHANISMS
Motion Detection and Integration. For the estima-
tion of image flow we use the algorithm of Lucas and Kanade (Lucas and Kanade, 1981) with a window size of 21 pixels. Spatial derivatives are calculated using the discrete kernel $[-1, 8, 0, -8, 1]/12$ according to (Barron et al., 1994). Further processing uses a polar representation of the image flow with magnitude $R$ and direction $\Psi$. For the interaction with the streak texture signal, a population code for the direction $\Psi$ of motion is constructed by a rectified cosine tuning

$$A^{\text{motion,dir}}_{\psi} = \max(\cos(\Psi - \psi),\, 0), \quad \psi = 0, \tfrac{\pi}{8}, \ldots, \tfrac{15}{8}\pi \qquad (1)$$

for each sampling direction $\psi$. Here, the sampling is equidistant over sixteen directions; an example is shown in Fig. 2.
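To make these operations concrete, the following sketch (a minimal Python/NumPy illustration under our own assumptions, not the authors' implementation) estimates the flow with a windowed least-squares solve and builds the sixteen-direction population code of Eq. 1; the function names and the guard against ill-conditioned windows are hypothetical.

```python
import numpy as np
from scipy.ndimage import uniform_filter, convolve1d

def lucas_kanade_polar(frame0, frame1, window=21):
    """Estimate image flow (Lucas & Kanade) and return magnitude R and direction Psi."""
    k = np.array([-1.0, 8.0, 0.0, -8.0, 1.0]) / 12.0  # 5-tap derivative (Barron et al., 1994)
    Ix = convolve1d(frame0, k, axis=1)
    Iy = convolve1d(frame0, k, axis=0)
    It = frame1 - frame0
    # Window sums of the normal-equation entries via a box filter.
    Sxx = uniform_filter(Ix * Ix, window)
    Sxy = uniform_filter(Ix * Iy, window)
    Syy = uniform_filter(Iy * Iy, window)
    Sxt = uniform_filter(Ix * It, window)
    Syt = uniform_filter(Iy * It, window)
    det = Sxx * Syy - Sxy ** 2
    det[np.abs(det) < 1e-9] = np.inf            # suppress ill-conditioned pixels (our assumption)
    u = (-Syy * Sxt + Sxy * Syt) / det
    v = ( Sxy * Sxt - Sxx * Syt) / det
    return np.hypot(u, v), np.arctan2(v, u)     # R, Psi

def direction_population_code(Psi, n_dir=16):
    """Rectified cosine tuning (Eq. 1): one activity map per sampling direction."""
    dirs = np.arange(n_dir) * 2.0 * np.pi / n_dir
    return np.stack([np.maximum(np.cos(Psi - d), 0.0) for d in dirs])
```

A broadly tuned code of this kind deliberately spreads activity across neighboring direction maps, which is visible in Fig. 2.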
Temporal Change Detection and Motion Streaks.
The computation of salient motion streaks starts with
the extraction of corner features, resulting in a rep-
resentation of more distinctive and localized image
properties. These features are detected with the
Förstner corner detector (Förstner, 1986). In a preprocessing stage each image is smoothed with a Gaussian kernel ($\sigma_{\text{pre}} = 1.0$). The structure tensor is computed by calculating the outer product of the intensity gradient, averaged over a local neighborhood (smoothing with a Gaussian kernel, $\sigma_{\text{tensor}} = 0.5$). Let $\lambda_1$ and $\lambda_2$ denote the eigenvalues of this tensor; then a continuous-valued corner measure is characterized by

$$F = \frac{\lambda_1 \lambda_2}{\lambda_1 + \lambda_2}, \qquad \bar{F} = F * G^{\text{gauss}}_{\sigma_t}. \qquad (2)$$
These corner features are temporally smoothed with a Gaussian ($\sigma_t = 2.0$), which suppresses temporal noise in the input sequence. A logarithmic transformation is applied to enhance features with low response amplitudes, resulting in $\bar{F}_{\text{space}}$. For an integration of moving features, a temporal derivative

$$\bar{F}_{\text{time}}(x,y;t) = \bar{F}_{\text{space}}(x,y;t) * [1, -1](t) \qquad (3)$$

of each image is additionally calculated. Together with the previous temporal smoothing, this mechanism eliminates purely static image features. The conjunction of the feature representations $\bar{F}_{\text{space}}$ and $\bar{F}_{\text{time}}$, given by $\bar{F}_{\text{comb}} = \bar{F}_{\text{space}} \cdot \bar{F}_{\text{time}}$, results in a high spatial resolution and a sufficient temporal resolution of features. Temporal integration of the combined signals is realized by the weighted sum

$$S = \frac{\alpha}{1 - (1-\alpha)^n} \sum_{i=0}^{n-1} (1-\alpha)^i \, \bar{F}_{\text{comb}}(x,y;t-i), \qquad (4)$$
where S denotes the motion streak image. The
parameter α avoids an infinite temporal integration
($\alpha = 0.05$). In addition to the intensity image, the streak image contains a specific texture generated by moving corner features. An analysis of this texture is realized by oriented derivatives of the streak image $S$,

$$A^{\text{streak,ori}}_{\varphi} = S * G^{\text{gabor}}_{\varphi,\sigma,\lambda}, \quad \varphi \in [0,\pi), \qquad (5)$$

computed with a Gabor filter approach (DC-level free). The Gabor filters are parameterized by eight orientations $\varphi$ with $\Delta\varphi = \pi/8$, one scale $\sigma$, and wavelength $\lambda$ ($\sigma = 4$, $\lambda = 6\,$px).
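The corner measure of Eq. 2 and the weighted temporal integration of Eq. 4 can be sketched as follows (a hedged Python/NumPy reading of the text, not the authors' code; the helper names are hypothetical, and the logarithmic transformation and temporal differentiation of Eq. 3 are omitted for brevity):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def corner_measure(image, sigma_pre=1.0, sigma_tensor=0.5):
    """Foerstner corner measure F = l1*l2/(l1+l2) from the structure tensor (Eq. 2)."""
    img = gaussian_filter(image, sigma_pre)
    gy, gx = np.gradient(img)
    # Structure-tensor entries, averaged over a local neighborhood.
    Jxx = gaussian_filter(gx * gx, sigma_tensor)
    Jxy = gaussian_filter(gx * gy, sigma_tensor)
    Jyy = gaussian_filter(gy * gy, sigma_tensor)
    trace = Jxx + Jyy
    det = Jxx * Jyy - Jxy ** 2
    return det / (trace + 1e-9)   # l1*l2/(l1+l2) = det/trace

def streak_image(F_comb_stack, alpha=0.05):
    """Exponentially weighted temporal integration (Eq. 4).
    F_comb_stack[i] holds F_comb at time t-i, i = 0..n-1 (most recent first)."""
    n = len(F_comb_stack)
    weights = alpha / (1.0 - (1.0 - alpha) ** n) * (1.0 - alpha) ** np.arange(n)
    return np.tensordot(weights, np.asarray(F_comb_stack), axes=1)
```

The normalization factor $\alpha / (1-(1-\alpha)^n)$ makes the geometric weights sum to one, so the streak intensity stays comparable across different sequence lengths $n$.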
Grouping Mechanism. Grouping is performed for the motion streak activities $A^{\text{streak,ori}}_{\varphi}$ serving as input activity $A^{\text{in}}_{\varphi}$, and the result of grouping $A^{\text{out}}_{\varphi}$ is further referred to as $\hat{A}^{\text{streak,ori}}_{\varphi}$. The grouping method employed is a simplified version of a mechanism that has been proposed to account for the neural mechanisms of surface boundary formation and the extraction of invariant surface features (Weidenbacher et al., 2006; Neumann and Sepp, 1999). The method consists of four steps forming an iterative loop (for notational simplification we omit parameters denoting 2-D spatial location):

$$x^{\text{ori}}_{\varphi} = A^{\text{in}}_{\varphi} \, (1 + c\, A^{\text{out}}_{\varphi}) \qquad (6)$$
$$y^{\text{ori}}_{\varphi} = x^{\text{ori}}_{\varphi} \, / \, (\mu + \| x^{\text{ori}} \|) \qquad (7)$$
$$x^{\text{group}}_{\varphi} = y^{\text{ori}}_{\varphi} * G^{\text{bipol}}_{\varphi,\sigma_1,\sigma_2} \qquad (8)$$
$$A^{\text{out}}_{\varphi} = x^{\text{group}}_{\varphi} \, / \, (\mu + \| x^{\text{group}} \|). \qquad (9)$$
In Eq. 6 the initial input signal $A^{\text{in}}_{\varphi}$ is nonlinearly enhanced by the feedback term $(1 + c\,A^{\text{out}}_{\varphi})$, with the feedback constant $c$ ($c = 100$). This modulatory coupling of feedback guarantees stability: only existing feature activities at a given location and orientation are enhanced by feedback, since the term $1 + c\,A^{\text{out}}_{\varphi}$ is gated by the driving signal activation $A^{\text{in}}_{\varphi}$. Feedback alone thus cannot generate any feature activities. Substages at the levels of orientation and grouping computation (Eq. 7 and Eq. 9) consist of a normalization step to keep activities within bounds. The parameter $\mu$ controls the influence of the total activation in the normalization process ($\mu = 10^{-2}$). Eq. 8 realizes the propagation (grouping) of orientational responses along their corresponding orientation axis $\varphi$. In this equation, $G^{\text{bipol}}_{\varphi,\sigma_1,\sigma_2}$ denotes a bipolar filter consisting of two Gaussian kernels which are spatially offset along the orientation axis at the target location. Each kernel has an elongation $\sigma_1$ for the major axis, $\sigma_2$ for the minor axis, and orientation $\varphi$ ($\sigma_1 = 3.5$, $\sigma_2 = 1.5$). These kernels are combined to form a grouping filter along the orientation $\varphi$. In Eq. 9 a normalization similar to Eq. 7 is applied. To quantify the improvement caused by the iterative grouping mechanism we report the orientation error for a scene with existing ground-truth motion, shown in Fig. 4(B).
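A compact sketch of the iterative loop of Eqs. 6-9 could look as follows. This is an assumption-laden illustration, not the reference implementation: the bipolar-kernel lobe offset and support size are our own choices, and we read the normalization term $\|\cdot\|$ as the total activation over orientations.

```python
import numpy as np
from scipy.ndimage import correlate

def bipolar_kernel(phi, sigma1=3.5, sigma2=1.5, offset=4, size=15):
    """Two anisotropic Gaussian lobes offset along orientation axis phi.
    The lobe offset and kernel size are hypothetical parameters."""
    r = (size - 1) // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    u = x * np.cos(phi) + y * np.sin(phi)      # coordinate along the orientation axis
    v = -x * np.sin(phi) + y * np.cos(phi)     # coordinate across the orientation axis
    lobe = lambda d: np.exp(-((u - d) ** 2) / (2 * sigma1 ** 2)
                            - (v ** 2) / (2 * sigma2 ** 2))
    k = lobe(offset) + lobe(-offset)
    return k / k.sum()

def recurrent_grouping(A_in, n_iter=10, c=100.0, mu=1e-2):
    """Iterative grouping loop (Eqs. 6-9). A_in: array of shape (n_ori, H, W)."""
    n_ori = A_in.shape[0]
    kernels = [bipolar_kernel(i * np.pi / n_ori) for i in range(n_ori)]
    A_out = np.zeros_like(A_in)
    for _ in range(n_iter):
        x_ori = A_in * (1.0 + c * A_out)                     # modulatory feedback (Eq. 6)
        y_ori = x_ori / (mu + x_ori.sum(axis=0))             # normalization (Eq. 7)
        x_grp = np.stack([correlate(y_ori[i], kernels[i])    # oriented propagation (Eq. 8)
                          for i in range(n_ori)])
        A_out = x_grp / (mu + x_grp.sum(axis=0))             # normalization (Eq. 9)
    return A_out
```

Because the feedback in Eq. 6 is purely multiplicative on the input, the loop can sharpen and align existing responses but never hallucinate activity where the driving signal is zero, which matches the stability argument above.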
Combination of Motion Responses with Streaks
and Attentional Signal. A general motivation for
the need of an attentional signal in many applica-
tions of computer vision is given in (Rothenstein and
Tsotsos, 2007), suggesting that an attentional sig-
nal reduces the complexity for visual search tasks.
Here, the activity of motion direction is multiplica-
tively combined with the grouped activity of motion
streaks, each serving as an attentional signal. For
this combination, activities corresponding to orientations are replicated, i.e., the orientation activity $\hat{A}^{\text{streak,ori}}_{\varphi}$ is assigned to both directions $\psi = \varphi$ and $\psi = \varphi + \pi$, yielding $\hat{A}^{\text{streak,dir}}_{\psi} = \hat{A}^{\text{streak,ori}}_{\psi \bmod \pi}$; by this means one orientation votes for the two corresponding directions. The final multiplication $A^{\text{comb,dir}}_{\psi} = A^{\text{motion,dir}}_{\psi} \cdot \hat{A}^{\text{streak,dir}}_{\psi}$ is motivated for two reasons: (i) activities corresponding to a specific direction from streaks and motion are independent sources, and (ii) streak activity acts as a bias for motion activity, or vice versa.
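With eight orientations and sixteen directions, the replication and multiplicative gating amount to a few lines (a sketch under the indexing assumption that direction map $j$ corresponds to orientation map $j \bmod 8$; the function name is hypothetical):

```python
import numpy as np

def combine_streaks_and_motion(A_streak_ori, A_motion_dir):
    """Replicate orientation activities to both corresponding directions and
    gate the motion population code multiplicatively.
    A_streak_ori: (8, H, W); A_motion_dir: (16, H, W)."""
    A_streak_dir = np.concatenate([A_streak_ori, A_streak_ori])  # psi and psi+pi share phi
    return A_motion_dir * A_streak_dir                            # A_comb_dir, shape (16, H, W)
```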
Coherence and Attentional Signals. The combined activity is used for the computation of a coherence signal, based on large patches with similar movements. To this end, a maximum likelihood estimate is calculated from the ensemble (population) of activities (Deneve et al., 1999), given by

$$(x, y)^{T} = \sum_{\psi} A^{\text{comb,dir}}_{\psi} \, (\cos\psi, \sin\psi)^{T} \Big/ \sum_{\psi} A^{\text{comb,dir}}_{\psi}, \qquad (10)$$

with the direction $\Psi^{\text{comb,dir}} = \arctan2(y,x) \in (-\pi,\pi]$, where $\arctan2$ denotes the four-quadrant inverse tangent. Coherent motion

$$C = \left\| \big( \cos(\Psi^{\text{comb,dir}}) * G_{\sigma_{\text{coh}}},\; \sin(\Psi^{\text{comb,dir}}) * G_{\sigma_{\text{coh}}} \big) \right\|_{2} \qquad (11)$$

is detected by large integrating Gaussian filter kernels ($\sigma_{\text{coh}} = 11.25$). As shown in Fig. 1, all three attentive signals,
namely the flow magnitude R, the streak intensity S,
and the coherence of directions C, are normalized and
multiplied to provide the final attentional signal. This
multiplication is a simple mechanism that avoids additional complexity. The mechanism could be extended by incorporating weighting factors for the three attentional signals.
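Eqs. 10 and 11 and the final combination can be sketched as follows (again a hedged Python/NumPy illustration; the min-max normalization is our assumption, since the text does not specify how the three signals are scaled to $[0,1]$):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def coherence_signal(A_comb_dir, sigma_coh=11.25):
    """Population-vector direction (Eq. 10) and coherence via Gaussian pooling (Eq. 11)."""
    n_dir = A_comb_dir.shape[0]
    psi = np.arange(n_dir) * 2.0 * np.pi / n_dir
    total = A_comb_dir.sum(axis=0) + 1e-9
    x = np.tensordot(np.cos(psi), A_comb_dir, axes=1) / total
    y = np.tensordot(np.sin(psi), A_comb_dir, axes=1) / total
    Psi = np.arctan2(y, x)                       # combined direction per pixel
    return np.hypot(gaussian_filter(np.cos(Psi), sigma_coh),
                    gaussian_filter(np.sin(Psi), sigma_coh))

def attention_signal(R, S, C):
    """Normalize the three attentive signals to [0, 1] and multiply them."""
    norm = lambda a: (a - a.min()) / (a.max() - a.min() + 1e-9)
    return norm(R) * norm(S) * norm(C)
```

Smoothing the unit direction vector rather than the angle itself avoids the wrap-around discontinuity at $\pm\pi$: patches of uniform direction yield a pooled vector of length near one, while incoherent patches cancel out.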
4 PROCESSING RESULTS AND
EVALUATION
Results are presented for motion detection and for
motion streak integration. Subsequently, we present
an evaluation of our model for the task of video
surveillance.
Motion Detection. Here, results for the Hamburg
Taxi Sequence are shown (http://i21www.ira.uka.de,
03/2007). In this scene the background flickers, and
the main attentive motion signals which should be de-
tected are the movement of the cars and the pedes-
trian. The activity code of the flow field is shown in
Fig. 2. Activities representing the motion of the vehicles are present in several maps, due to the broad
tuning. For example, the car approaching from the
left side induces activity in four neighboring maps,
according to the directions ψ = 0,π/8, 14π/8, 15π/8
(compare with Fig. 2).
Figure 2: Motion activity for the Hamburg Taxi Sequence.
Activities are normalized and linearly coded from black
(zero activity) to white (unit activity).
Motion Streak Integration. Fig. 3(A) shows the last
frame of the Hamburg Taxi sequence and Fig. 3 (B)
shows the streak image. Additionally, lines are su-
perimposed on this streak image, where the length re-
flects the streak intensity and the orientation results
from the texture analysis. Therefore, the activities $\hat{A}^{\text{streak,ori}}_{\varphi}$ corresponding to orientations $\varphi$ are interpreted as described in Eq. 10, where directions $\psi$ are substituted by orientations $\varphi$. The resulting orientation $\Phi$ defines a vector $(\cos\Phi, \sin\Phi)$ which is weighted with the streak intensity $S$, forming the lines.
Figure 3: Motion streak image: (A) Last frame of the Ham-
burg Taxi sequence. (B) Temporal integration of fea-
tures produces the streak image, where the extracted and
grouped orientation of the streaks is superimposed by lines
(41 frames, 10 iterations, sampled 15 times).
Figure 4: Motion streaks for Yosemite sequence with
clouds: (A) Streak lines with superimposed extracted ori-
entations from texture and grouping (15 frames used for in-
tegration, 10 iterations, sampled 15 times). (B) Decreasing
error over iterations of grouping.
Figure 5: Masked last frame of sequences: (A) Durlacher
Tor. (B) Indoor. Attentional regions are plotted with full
intensity and the background with an intensity of at most
thirty percent.
In general the construction of a motion streak im-
age and the analysis of the streak texture for orien-
tations is not restricted to object motion. Thus, we
present results for an ego-motion sequence, a simu-
lated flight through the valley of the Yosemite park.
Temporal integration results in the motion streak im-
age in Fig. 4(A). From this characteristic streak texture the orientation is determined and further refined by grouping. The final result (lines) is superimposed
to the streak texture. To show the effect of iterative
grouping the mean and median orientation error are
reported in Fig. 4 (B). This orientation error is defined
as the deviation between the ground-truth orientation
and the estimated orientation. After only a few iterations (approx. five) of recurrent grouping, the error saturates at a level of 3.9 deg (median) and 6.3 deg (mean).
Evaluation for Video Surveillance. For video
surveillance an evaluation based on four different se-
quences has been conducted (results are shown only
for two of them). In this evaluation regions with atten-
tive motion should be correctly detected which refer
to suspicious activity. Accordingly, those regions are
masked in the last frame of the processed sequence,
shown in Fig. 5. The first scene is a characteristic
traffic scenario, the second a typical indoor sequence
(from (Brown et al., 2005)).
For the evaluation of robustness, receiver operating characteristic (ROC) curves with decisions at the pixel level are investigated. Within this analysis the false positive
ratio is defined as $\mathrm{FPR} = \| N^{\text{det}}_{\text{attentive}} \setminus N^{\text{gt}}_{\text{attentive}} \| \,/\, \| N^{\text{gt}}_{\text{inattentive}} \|$, where $N^{\text{det}}_{\text{attentive}}$ is the set of pixels detected as salient, $N^{\text{gt}}_{\text{attentive}}$ is the set of salient pixels according to the masked regions, and $N^{\text{gt}}_{\text{inattentive}}$ is the set of inattentive pixels. The numerator contains the set difference between $N^{\text{det}}_{\text{attentive}}$ and $N^{\text{gt}}_{\text{attentive}}$. An FPR of zero indicates that no inattentive pixel is detected as salient. The true positive ratio is defined as $\mathrm{TPR} = \| N^{\text{det}}_{\text{attentive}} \| \,/\, \| N^{\text{gt}}_{\text{attentive}} \|$. In our model the attentional
signal is finally thresholded, as the only decision pa-
rameter that needs to be adjusted in the algorithm.
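The pixel-level ROC sweep over this decision threshold can be sketched as below (our own illustration; note that the TPR follows the paper's literal definition, without intersecting the detected set with the ground truth, so values above one are possible for very low thresholds):

```python
import numpy as np

def pixel_roc(attention, gt_mask, thresholds):
    """Pixel-level ROC: sweep a threshold on the attention signal and compute
    FPR/TPR from boolean masks, following the set definitions above."""
    fpr, tpr = [], []
    n_gt = gt_mask.sum()          # |N_gt_attentive|
    n_inatt = (~gt_mask).sum()    # |N_gt_inattentive|
    for th in thresholds:
        det = attention >= th
        fpr.append((det & ~gt_mask).sum() / n_inatt)  # |N_det \ N_gt| / |N_gt_inattentive|
        tpr.append(det.sum() / n_gt)                  # |N_det| / |N_gt_attentive| (literal definition)
    return np.array(fpr), np.array(tpr)
```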
In the algorithm of (Tian and Hampapur, 2005) the parameter $T_d$ that is applied to the accumulated temporal difference image is critical and is therefore selected as the reference parameter in the ROC analysis. Additionally, the parameter $W_{\text{accum}}$ for the weighted summation of the accumulated temporal difference appears critical. For this reason we applied the analysis for two representative values, $W_{\text{accum}} = 0.5$ (as stated in (Tian and Hampapur, 2005)) and $W_{\text{accum}} = 0.125$.
In Fig. 6 results of the analysis are shown. Besides this, results for the attentive signals from streak saliency $\tilde{S}$, motion saliency $\tilde{R}$, and saliency from coherence $\tilde{C}$, as well as their multiplicative conjunction, are visualized, where the tilde indicates that the signals are scaled to the interval $[0,1]$. For the traffic scene (Fig. 6(A)) our model performs better than that of (Tian and Hampapur, 2005), and in the second sequence our model performs better for specific ratios of FPR/TPR. Another measure is the area under the ROC curve, where our model has the best performance value for the traffic sequence (0.982) and for the indoor sequence (0.970). For the method of (Tian and Hampapur, 2005) the calculation of the area under the ROC curve is not possible due to the two variable thresholds $W_{\text{accum}}$ and $T_d$.
5 DISCUSSION
The model proposed in this contribution follows the
main idea of (Tsotsos et al., 2005) for the decom-
position of an image sequence into single features
(here the channels intensity, texture, flow direction,
and magnitude) and their re-combination into an at-
tention signal. In contrast to the approach of (Tsotsos
et al., 2005), which suggests a feed-forward process-
ing pyramid, our model contains feedback signals in
the grouping stage.
For the construction of motion streaks a related
idea was reported in (Majchrzak et al., 2000). Unlike our approach, the authors simply add the images in
a specific time window. Then they extract edges in
the resulting motion streak image. On the basis of
the orientation of these edges they define a decision mechanism to differentiate between a rotational field, sidewards translation, or a translation near the line of sight (FOE/FOC). Compared to our method, the feature extraction and subsequent weighted integration result in more salient streaks. For this specific streak texture we then extracted and grouped orientations. A heading estimation on the basis of these orientations, as proposed by (Majchrzak et al., 2000), is also possible.

Figure 6: ROC analysis (true positive ratio vs. false positive ratio; the x-axis ranges from 0 to 0.25; values in brackets denote the area under the ROC curve): (A) Durlacher Tor (frames 11-21): neural model (0.982), streaks $\tilde{S}$ (0.934), magnitude $\tilde{R}$ (0.968), coherence $\tilde{C}$ (0.933). (B) Indoor (frames 45-55, parameter $\sigma_{\text{pre}} = 2$): neural model (0.970), streaks $\tilde{S}$ (0.913), magnitude $\tilde{R}$ (0.957), coherence $\tilde{C}$ (0.912). Curves for the method of (Tian and Hampapur, 2005) are plotted for $W_{\text{accum}} = 1/2$ and $W_{\text{accum}} = 1/8$.
Our model behaves robustly with respect to common assumptions in video surveillance. First, the assumption of directionally-consistent movements or coherent temporal motion (Wixson and Hansen, 1999; Tian and Hampapur, 2005) is here included within the temporal integration for the construction of the streak image; if this assumption is not fulfilled, the streak intensity and flow magnitude still support an attention signal. Second, non-stationary backgrounds (like wiggling trees, waves, fountains, rain, snow) are assumed. Generally, those backgrounds are modeled through spatial and temporal statistics (Ren et al., 2003; Zhang et al., 2007); within such a predictive mechanism only the most important variations are captured through a sub-space analysis of the input sequence. Our method suppresses distracting background movements by temporal smoothing and integration.
6 CONCLUSIONS
The proposed model combines information from mo-
tion streaks and image flow, forming the signals mag-
nitude of image flow, intensity of motion streaks, and
coherence of motion directions. These three signals
are combined for the final attention signal. An ROC
analysis for scenarios of video surveillance is con-
ducted and shows that the three attentive signals and the final combined signal are appropriate for separating attentive motion from background noise. Compared
with the method of (Tian and Hampapur, 2005) our
model has only one critical threshold and shows bet-
ter results for two analyzed scenes. Main challenges
solved by our model are the robust processing and analysis of noisy backgrounds and of locally incoherent motions of walking persons.
Within our model motion streaks are used, which
provide information about orientation and speed for
object and self-motion. Going beyond the approach of (Majchrzak et al., 2000), our model provides a dense field of orientations and speeds. Therefore, motion streaks could serve as a robust prior or bias for the
estimation of image flow and also for the estimation
of ego-motion. Future work will pursue incorpora-
tions of motion streaks into those estimation tasks.
ACKNOWLEDGEMENTS
This work is supported by the Graduate School
Mathematical Analysis of Evolution, Information and
Complexity.
REFERENCES
Barron, J., Fleet, D., and Beauchemin, S. (1994). Perfor-
mance of optical flow techniques. IJCV, pages 43–77.
Brown, L., Senior, A., Tian, Y.-L., Connell, J., Hampapur,
A., Shu, C.-F., Merkl, H., and Lu, M. (2005). Perfor-
mance evaluation of surveillance systems under vary-
ing conditions. IEEE Int’l Workshop on Performance
Evaluation of Tracking and Surveillance.
Daugman, J. (1988). Complete discrete 2-D Gabor transforms by neural networks for image analysis and compression. IEEE Trans. on Acoustics, Speech, and Signal Processing, 36(7):1169–1179.
Deneve, S., Latham, P., and Pouget, A. (1999). Reading
population codes: a neural implementation of ideal
observers. Nature Neuroscience, 2:740–745.
Förstner, W. (1986). A feature based correspondence algorithm for image matching. ISP Comm. III, Rovaniemi 1986, International Archives of Photogrammetry, pages 26–3/3.
Hubel, D. and Wiesel, T. (1968). Receptive fields and func-
tional architecture of monkey striate cortex. J. Phys-
iol., 195:215–243.
Lucas, B. and Kanade, T. (1981). An iterative image regis-
tration technique with an application to stereo vision.
Proc. DARPA Image Understanding Workshop, pages
121–130.
Majchrzak, D., Sarkar, S., Sheppard, B., and Murphy, R.
(2000). Motion detection from temporally integrated
images. In Proc. IEEE 15th ICPR, pages 836–839.
Marr, D. and Ullman, S. (1981). Direction selectivity and
its use in early visual processing. Proc. Royal Soc. of
London, B, 211:151–180.
Neumann, H. and Sepp, W. (1999). Recurrent V1-V2 in-
teraction in early visual boundary processing. Biol.
Cybernetics, 81:425–444.
Ren, Y., Chua, C.-S., and Ho, Y.-K. (2003). Motion detec-
tion with nonstationary background. Machine Vision
and Applications, 13:332–343.
Rothenstein, A. and Tsotsos, J. (2007). Attention links sens-
ing to recognition. Image and Vision Computing. (in
press).
Tian, Y.-L. and Hampapur, A. (2005). Robust salient mo-
tion detection with complex background for real-time
video surveillance. Proc. IEEE Workshop on Motion
and Video Computing, pages 30–35.
Tsotsos, J., Liu, Y., Martinez-Trujillo, J., Pomplun, M.,
Simine, E., and Zhou, K. (2005). Attending to visual
motion. Computer Vision and Image Understanding,
100:3–40.
Weidenbacher, U., Bayerl, P., Neumann, H., and Flemming,
R. (2006). Sketching shiny surfaces: 3D shape extrac-
tion and depicting of specular surfaces. ACM Trans.
on Applied Perception, 3:262–285.
Wixson, L. and Hansen, M. (1999). Detecting salient mo-
tion by accumulating directionally-consistent flow. In
Proc. of the Seventh IEEE ICCV, pages 797–804.
Zhang, W., Fang, X., Yang, X., and Wu, Q. (2007). Spa-
tiotemporal gaussian mixture model to detect moving
objects in dynamic scenes. J. of Electronic Imaging,
16.
Zhou, Q. and Aggarwal, J. (2001). Tracking and classifying
moving objects from video. In IEEE Int. Workshop on
PETS.