AN IMAGE PROCESSING ALGORITHM
Saving valuable time in a sequence of frames analysis
E. Karvelas, D. Doussis and K. Hrissagis
Zenon S.A., Kanari 5, 15354 Glyka Nera, Greece
Keywords: image processing, motion detection.
Abstract: This paper describes a new algorithm for detecting moving objects in a dynamic scene, based on statistical
analysis of the greyscale variations across a sequence of frames captured over a period of time. The
main goal of the algorithm is to identify changes (e.g. motion) while coping with varying environmental
conditions, without requiring a prior training procedure. To this end, we use a pixel-level comparison of
subsequent frames in order to deal with both temporal stability and fast changes. In addition, the method
computes the temporal changes in the video sequence by incorporating statistical results, which makes it
less sensitive to noise. The algorithm’s goal is not to detect motion as such but rather to filter out similar
frames in a sequence of frames, making it a valuable tool for those who would like to evaluate and analyze
visual information obtained from captured video frames. Finally, experimental results and a performance
measure establishing the confidence of the method are presented.
1 DESCRIPTION OF THE ALGORITHM
The developed algorithm identifies those grayscale frames whose content differs from that of the immediately preceding ones in a sequence of frames. The algorithm has been tested with video frames captured at a rate of 1 frame/second.
The algorithm marks every frame as hidden or shown. After $M$ frames there are $M_{shown}$ frames marked as shown and $M_{hidden}$ frames marked as hidden, where $M = M_{shown} + M_{hidden}$. The $M_{shown}$ frames comprise the $M^{True}_{shown}$ correctly marked frames and the $M^{False}_{shown}$ mistakenly marked ones. Thus, we have the following equation: $M_{shown} = M^{True}_{shown} + M^{False}_{shown}$. The same applies to the hidden frames, i.e. $M_{hidden} = M^{True}_{hidden} + M^{False}_{hidden}$.
The algorithm assigns different significance to the $M^{False}_{shown}$ and $M^{False}_{hidden}$ frames. In particular, it is allowable for the algorithm to show frames which should be hidden, while it is not acceptable to hide frames which should be shown. This algorithm is therefore useful in projects in which it is desirable to eliminate the $M^{False}_{hidden}$ frames while it is acceptable to keep the $M^{False}_{shown}$ frames. The $M^{False}_{shown}$ frames, although not critically important, affect the efficiency of the image processing.
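To make this bookkeeping concrete, the following minimal Python sketch tallies the four counts for a labelled test sequence; the function and label names are ours and purely illustrative, not part of the published method.

from collections import Counter

def tally_marks(decisions, ground_truth):
    # Count M_True/False_shown/hidden over a frame sequence, given the
    # algorithm's decision and the ground-truth label per frame
    # ("shown" or "hidden"). Hypothetical helper, for illustration only.
    counts = Counter()
    for decided, truth in zip(decisions, ground_truth):
        kind = "True" if decided == truth else "False"
        counts[f"M_{kind}_{decided}"] += 1  # e.g. "M_False_hidden"
    return counts

marks = tally_marks(["shown", "hidden", "shown"],
                    ["shown", "shown", "shown"])
# marks["M_False_hidden"] == 1: a frame that should have been shown was
# hidden -- the kind of error the algorithm is designed to avoid.
assert sum(marks.values()) == 3  # M = M_shown + M_hidden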
In the implementation of the algorithm we used videos with a frame rate of 1 frame/second. In order to test the algorithm we used the following different sets of captured frames:
Set A: It included frames which exhibit no motion and no difference in the lighting conditions.
Set B: It included frames which exhibit differences in less than 10% of their content and no difference in the lighting conditions.
Set C: It included frames which exhibit differences in more than 10% of their content and no difference in the lighting conditions.
Set D: It included frames which exhibit no motion but differ due to illumination variance.
The developed algorithm has two stages of image analysis. During the first stage the algorithm searches for differences between the two frames at the pixel level and marks blocks of $9\times9$ pixels that exhibit significant changes.
During the second stage the algorithm uses the patterns of differences found in the first stage in order to decide whether it will show or hide the frame.
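Before detailing the two stages, the following outline shows how the decision chains together. It is our structural sketch only, not the authors' code; to_pixel_3x3, d_map and cluster_size_distribution are hypothetical helpers sketched later in this section, and the two thresholds are free parameters that the paper does not publish.

def keep_frame(previous_frame, current_frame, threshold, min_cluster):
    # Decide "shown"/"hidden" for current_frame (illustrative outline).
    # Stage 1: pixel-level comparison; mark 9x9-pixel areas whose
    # difference measure D_ij is large.
    d = d_map(to_pixel_3x3(previous_frame), to_pixel_3x3(current_frame))
    marked = abs(d) > threshold
    # Stage 2: cluster analysis of the marked areas; show the frame only
    # if a sufficiently large cluster of changed areas exists.
    sizes = cluster_size_distribution(marked)
    return "shown" if any(s >= min_cluster for s in sizes) else "hidden"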
1.1 First Stage Analysis
In order to eliminate noise, the algorithm converts the frame from the initial $h \times w$ array of $pixel_{1\times1}$ values into an $(h/3) \times (w/3)$ array of $pixel_{3\times3}$ values. The grayscale value of the $3\times3$ pixel at the $(i, j)$ position (where $i$ is the row and $j$ is the column) is given by:

$$pixel^{3\times3}_{ij} = \sum_{n=i-1}^{i+1} \sum_{m=j-1}^{j+1} pixel^{1\times1}_{nm},$$

where $pixel^{1\times1}_{nm}$ is the grayscale value of the original pixel at the $(n, m)$ position.
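As a minimal sketch (ours, not the authors' implementation), this aggregation can be written with NumPy; it assumes the frame dimensions are multiples of 3, since the paper does not state how border pixels are treated.

import numpy as np

def to_pixel_3x3(frame):
    # Sum each non-overlapping 3x3 block of an (h, w) 8-bit grayscale
    # frame, yielding the (h/3, w/3) array of pixel_3x3 values.
    h, w = frame.shape
    assert h % 3 == 0 and w % 3 == 0, "assumes dimensions divisible by 3"
    blocks = frame.astype(np.uint32).reshape(h // 3, 3, w // 3, 3)
    return blocks.sum(axis=(1, 3))  # each value lies in [0, 3*3*255]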
Thus, during the first stage of the image analysis the algorithm divides the two compared frames to form a $(h/9) \times (w/9)$ grid, which consists of areas of $3\times3$ $pixel_{3\times3}$ values. The corresponding areas of the two consecutive frames are compared as shown in the following equations, so that the differences between them which are not attributable to illumination variations can be quantified. For each pair of compared frames we have the following variables:
$$D^{normal}_{ij} = \frac{\sum_{n=i-1}^{i+1} \sum_{m=j-1}^{j+1} \left( pixel^{3\times3,\,FirstFrame}_{nm} - pixel^{3\times3,\,SecondFrame}_{nm} \right)}{\sum_{n=i-1}^{i+1} \sum_{m=j-1}^{j+1} \left( pixel^{3\times3,\,FirstFrame}_{nm} + pixel^{3\times3,\,SecondFrame}_{nm} \right)}$$
$$D^{negative}_{ij} = \frac{\sum_{n=i-1}^{i+1} \sum_{m=j-1}^{j+1} \left[ \left( S_{max} - pixel^{3\times3,\,FirstFrame}_{nm} \right) - \left( S_{max} - pixel^{3\times3,\,SecondFrame}_{nm} \right) \right]}{\sum_{n=i-1}^{i+1} \sum_{m=j-1}^{j+1} \left[ \left( S_{max} - pixel^{3\times3,\,FirstFrame}_{nm} \right) + \left( S_{max} - pixel^{3\times3,\,SecondFrame}_{nm} \right) \right]}$$
The $S_{max}$ value is equal to the maximum value which the variable $pixel_{3\times3}$ can take. Taking into account that an 8-bit grayscale video is used, this value is equal to $S_{max} = 3 \times 3 \times 255 = 2295$.
The values of the variables $D^{normal}_{ij}$ and $D^{negative}_{ij}$ versus the value of

$$\overline{pixel}^{\,3\times3}_{ij} = \left( pixel^{3\times3,\,FirstFrame}_{ij} + pixel^{3\times3,\,SecondFrame}_{ij} \right) / 2$$

are shown in Figure 1 for the set A frames and in Figure 2 for the set B frames. The variable $D_{ij}$ is defined as

$$D_{ij} = \begin{cases} D^{normal}_{ij}, & \text{if } \overline{pixel}^{\,3\times3}_{ij} \ge S_{max}/2 \\ D^{negative}_{ij}, & \text{if } \overline{pixel}^{\,3\times3}_{ij} < S_{max}/2 \end{cases}$$
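A sketch of the per-area computation follows; it reflects our reading of the reconstructed equations above rather than the authors' code, and the small denominator guard, the per-area brightness test and the detection threshold are assumptions introduced here.

import numpy as np

S_MAX = 3 * 3 * 255  # maximum pixel_3x3 value for 8-bit video

def area_sum(p):
    # Sum each non-overlapping 3x3 block of pixel_3x3 values, i.e. each
    # 9x9-original-pixel area of the (h/9, w/9) grid.
    h, w = p.shape
    return p.reshape(h // 3, 3, w // 3, 3).sum(axis=(1, 3))

def d_map(first, second):
    # D_ij over the grid, given the two frames' pixel_3x3 arrays.
    a, b = first.astype(np.float64), second.astype(np.float64)
    eps = 1e-9  # guard against all-black areas (our addition)
    d_normal = (area_sum(a) - area_sum(b)) / (area_sum(a) + area_sum(b) + eps)
    d_negative = (area_sum(S_MAX - a) - area_sum(S_MAX - b)) / (
        area_sum(S_MAX - a) + area_sum(S_MAX - b) + eps)
    # The mean pixel_3x3 brightness of each area picks the measure;
    # an area holds 9 pixel_3x3 values, hence the division by 2 * 9.
    mean_pixel = (area_sum(a) + area_sum(b)) / (2 * 9)
    return np.where(mean_pixel >= S_MAX / 2, d_normal, d_negative)

Areas where $|D_{ij}|$ exceeds a threshold would then be marked as "different"; the paper does not publish the threshold value, so any figure used with this sketch is illustrative.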
Figure 1 and Figure 2 reveal that we can differentiate between the set A and set B frames when we compare the “bright” areas ($\overline{pixel}^{\,3\times3}_{ij} \ge S_{max}/2$) and the negatives of the “dark” areas ($\overline{pixel}^{\,3\times3}_{ij} < S_{max}/2$).
Figure 1: The values of the variables $D^{normal}_{ij}$ and $D^{negative}_{ij}$ versus the value of $\overline{pixel}^{\,3\times3}_{ij}$ for the set A frames.
Figure 2: The values of the variables $D^{normal}_{ij}$ and $D^{negative}_{ij}$ versus the value of $\overline{pixel}^{\,3\times3}_{ij}$ for the set B frames.
1.2 Second Stage Analysis
Next, a further analysis of the areas identified as “different” is necessary in order to eliminate the noise factor and identify human motion. In particular, every “different” area is analyzed by taking into consideration the behaviour of its surrounding areas. Thus, the notion of “clusters” is introduced. The concept of the clusters derives from the fact that human motion produces a relatively sizable region of change. During the second stage of the visual analysis, not only the differences between the compared frames are examined but also the relation of these differences to the neighbouring differences. In particular, we statistically analyze the clusters of differences which are formed. The size of these clusters, along with their distribution, varies between the four cases which we consider in this paper. Analysis of the distribution results in the graph shown in Figure 3.
As shown in Figure 3, we can distinguish between the three cases of motion and ambient light. These different distributions help us to establish additional criteria for identifying human “motion” in the sequence of frames, which was our main target in the case study analysis of the developed method.
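One plausible realization of this cluster analysis (ours; the paper does not specify a connectivity rule) groups the marked areas into connected components and histograms their sizes, mirroring Figure 3.

import numpy as np
from scipy import ndimage

def cluster_size_distribution(marked):
    # marked: boolean (h/9, w/9) grid of the areas flagged in stage one.
    # Returns the fraction of clusters at each size (cf. Figure 3).
    eight_connected = np.ones((3, 3), dtype=bool)  # assumed connectivity
    labels, n_clusters = ndimage.label(marked, structure=eight_connected)
    if n_clusters == 0:
        return {}
    sizes = np.bincount(labels.ravel())[1:]  # drop background (label 0)
    return {int(s): np.count_nonzero(sizes == s) / n_clusters
            for s in np.unique(sizes)}

Since human motion yields sizable connected regions of change, a frame would be shown when sufficiently large clusters appear and hidden otherwise.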
1.3 Results
In order to test the algorithm we have used two
different sets of data. In the first one the data was
captured from 6 different cameras which were
placed inside a collapsing building. In the second set
the data was captured during building evacuation.
The results are shown in the Table 1.
2 CONCLUSION
Image processing has typically focused on only
accuracy or only speed. This algorithm represents a
good compromise between speed and accuracy. The
method is also very robust in presence of noise. It
yielded reasonable results for fairly low signal to
noise levels. In addition it does not require any
training procedure. The developed algorithm is a
valuable tool to the hands of those who want to
process a vast number of frames which have been
captured with a time difference of one second or
more and would like to focus only on those frames
which provide useful information
.
ACKNOWLEDGMENT
This study was partially funded by the European Commission under contract No. IST-2000-29401 (Project LOCCATEC).
Figure 3: The distribution of the “motion” detection clusters: percentage of clusters versus number of neighborhoods (1–9) for the set A, set B and set D frames.
Table 1: The results of the implementation of the developed algorithm.

LCD set              $M^{False}_{hidden}$   $M^{True}_{hidden}$   $M^{False}_{shown}$   $M^{True}_{shown}$   $M$
Collapsed building   0                      509                   3                     120                  632
Evacuated building   0                      67                    2                     261                  330
REFERENCES
LOCCATEC: Low Cost Catastrophic Event Capturing. Contract No. IST-2000-29401.
Y. Amit and A. Kong. Graphical templates for model
registration. IEEE Transactions on Pattern Analysis
and Machine Intelligence, 18:225–236, 1996.
A. Blake and M. Isard. 3d position, attitude and shape
input using video tracking of hands and lips. In Proc.
ACM Siggraph, pages 185–192, 1994.
C. Bregler and J. Malik. Tracking people with twists and
exponential maps. In Proc. IEEE CVPR, pages 8–15,
1998.
C. Chow and C. Liu. Approximating discrete probability
distributions with dependence trees. IEEE
Transactions on Information Theory, 14:462–467,
1968.
T. Cover and J. Thomas. Elements of Information Theory.
John Wiley and Sons, 1991.
N. Friedman and M. Goldszmidt. Learning Bayesian
networks from data. Technical report, AAAI 1998
Tutorial,
http://robotics.stanford.edu/people/nir/tutorial/, 1998.
D. Gavrila. The visual analysis of human movement: A
survey. Computer Vision and Image Understanding,
73:82–98, 1999.
L. Goncalves, E. D. Bernardo, E. Ursella, and P. Perona.
Monocular tracking of the human arm in 3d. In Proc.
5th Int. Conf. Computer Vision, pages 764–770,
Cambridge, Mass, June 1995.
I. Haritaoglu, D. Harwood, and L. Davis. Who, when,
where, what: A real time system for detecting and
tracking people. In Proceedings of the Third Face and
Gesture Recognition Conference, pages 222–227,
1998.
S. Ioffe and D. Forsyth. Human tracking with mixtures of
trees. In International Conference on Computer
Vision, pages 690–695, July 2001.
M. Jordan, editor. Learning in Graphical Models. MIT
Press, 1999.
M. Meila and M. Jordan. Learning with mixtures of trees. Journal of Machine Learning Research, 1:1–48, 2000.
R. Polana and R. Nelson. Detecting activities. In
DARPA93, pages 569–574, 1993.
J. Rehg and T. Kanade. Digiteyes: Vision-based hand
tracking for human-computer interaction. In
Proceedings of the workshop on Motion of Non-Rigid
and Articulated Bodies, pages 16–24, November 1994.
K. Rohr. Incremental recognition of pedestrians from
image sequences. In Proc. IEEE Conf. Computer
Vision and Pattern Recognition, pages 8–13, New
York City, June, 1993.
Y. Song, X. Feng, and P. Perona. Towards detection of
human motion. In Proc. IEEE CVPR 2000, volume 1,
pages 810–817, June 2000.
Y. Song, L. Goncalves, E. D. Bernardo, and P. Perona.
Monocular perception of biological motion in
johansson displays. Computer Vision and Image
Understanding, 81:303–327, 2001.
C. Tomasi and T. Kanade. Detection and tracking of point features. Tech. Rep. CMU-CS-91-132, Carnegie Mellon University, 1991.
S. Wachter and H.-H. Nagel. Tracking persons in monocular image sequences. Computer Vision and Image Understanding, 74:174–192, 1999.
M. Weber, M. Welling, and P. Perona. Unsupervised
learning of models for recognition. In Proc. ECCV,
volume 1, pages 18–32, June/July 2000.
Y. Yacoob and M. Black. Parameterized modeling and recognition of activities. Computer Vision and Image Understanding, 73:232–247, 1999.
P. Bock. The Emergence of Artificial Cognition: An Introduction to Collective Learning. World Scientific, Singapore, 1993.
P. Bock, R. Klinnert, R. Kober, R. Rovner, and H. Schmidt. Gray scale ALIAS. IEEE Transactions on Knowledge and Data Engineering, 4(2), April 1992.
S. M. Haynes and R. Jain. Detection of moving edges. Computer Vision, Graphics, and Image Processing, 21(3), March 1982.
B. K. P. Horn and B. G. Schunck. Determining optical flow. Artificial Intelligence, 17(1-3), August 1981.
C. G. Howard and R. Kober. Anomaly detection in video images. In Proceedings of the Fifth Neuro-Nimes Conference: Neural Networks and their Applications, Nimes, France, November 1992.
J. Hubshman and M. Achikian. Detection of targets in terrain images with ALIAS. In Proc. Twenty-Third Annual Pittsburgh Conf. on Modeling and Simulation, April 1992.
R. Kober, P. Bock, C. Howard, R. Klinnert, and H. Schmidt. A parallel approach to signal analysis. Neural Network World, 2(6), December 1992.
H. Nagel. Displacement vectors derived from second-order intensity variations in image sequences. Computer Vision, Graphics, and Image Processing, 21(1), January 1983.
R. J. Schalkoff and E. S. McVey. A model and tracking algorithm for a class of video targets. IEEE Transactions on Pattern Analysis and Machine Intelligence, 4(1):2–10, January 1982.
H. Schmidt and P. Bock. Traffic jam detection using ALISA. In Proc. of the IX Int. Symposium on Artificial Intelligence, November 1996.