A NOVEL BLOCK MOTION ESTIMATION MODEL FOR VIDEO STABILIZATION APPLICATIONS

Harish Bhaskar and Helmut Bez

Research School of Informatics, Loughborough University

Keywords:

Video stabilization, motion compensation, motion estimation, genetic algorithms, Kalman filtering.

Abstract:

Video stabilization algorithms aim to generate stabilized image sequences by removing the unwanted shake caused by small camera movements. Stabilization is an important precursor to effective high-level video analysis. In this paper, we propose novel motion correction schemes based on probabilistic filters in the context of block matching motion estimation for efficient video stabilization. We present a detailed overview of the model and compare it against other block matching schemes on several real-time and synthetic data sets.

1 INTRODUCTION

Video data obtained from compact motion capture devices, such as hand-held and head-mounted cameras, has gained significant attention in recent years. Video stabilization, as the name suggests, deals with generating stabilized video sequences by removing unwanted shake and camera motion. Several methods have been proposed in the literature for accomplishing video stabilization; however, the accuracy of motion estimation is key to their performance. (Matsushita et al., 2005) propose a combination of motion inpainting and deblurring techniques to accomplish robust video stabilization. Several other research contributions have been made to video stabilization, including probabilistic methods (Litvin et al., 2003), model-based methods, etc. Methods such as (Hansen et al., 1994), (Yao et al., 1995), (Pochec, 1995), (Tucker and Lazaro, 1993) and (Uomori et al., 1990) propose to combine global motion estimation with filtering to remove motion artifacts from video sequences. These schemes perform efficiently only under restricted conditions and are further limited by the efficiency of the underlying global motion estimation. (Ratakonda, 1998) uses an integral matching mechanism for compensating movement between frames. (Chen, 2000) proposes a three-stage video stabilization algorithm based on motion estimation: motion estimation computes local and global motion parameters, motion smoothing removes abrupt motion changes between successive frame pairs, and motion correction produces the stabilized output. In this paper we extend the work presented in (Chen, 2000) to accommodate a novel motion correction mechanism based on moving average filters and Kalman filtering, alongside a motion estimation strategy that combines vector quantization based block partitioning with a genetic algorithm based block search.

2 PROPOSED MODEL

The video stabilization model proposed in this paper extends the parametric motion model of (Chen, 2000). A detailed overview of the proposed model, in the form of pseudo code, follows.

• Input, at a time instant $t$, two successive frames of a video sequence, $f_t$ and $f_{t+1}$, where $1 \le t \le N$ and $N$ is the total number of frames in the video.

• Image frame $f_t$ is initially partitioned into 4 blocks using the vector quantization algorithm


described in the subsection below (note that every block represents an image region).

• For every block $b$:
  – Compute the centroid $(x_c, y_c)$ of the block.
  – Use the genetic algorithm described below to match the block in the successive frame $f_{t+1}$.
  – If the genetic algorithm matches the block from frame $f_t$ to frame $f_{t+1}$ exactly (matching error $= 0$), the motion vector is evaluated as $(x^* - x_c, y^* - y_c)$, where $(x^*, y^*)$ is the estimated transformed centroid of the block in frame $f_{t+1}$.
  – If the genetic algorithm returns a non-zero matching error, the process is repeated by further subdividing the block.
• The process is terminated either when no further splitting is needed or when a predefined minimum block size is reached.

• If the processed frame pair is $(f_t, f_{t+1})$ with $t = 1$, proceed to the next frame pair; otherwise, if $t > 1$, run motion correction using either of the proposed filter mechanisms (Section 2.2) to generate smoothed motion vectors $MV^{\aleph}$.
• Compute the difference between the original motion vectors $MV$ and the smoothed motion vectors $MV^{\aleph}$, and adjust the original motion vectors by this difference: $MV_{comp} = MV \pm (MV - MV^{\aleph})$.

• Generate stabilized frames using the original motion vectors $MV$ and the compensated motion vectors $MV_{comp}$, and denote them $f^*_{t+1}$ and $f^{*comp}_{t+1}$ respectively.

• Deduce the PSNR of the two versions of stabilized frames. For a gray scale image, the PSNR is defined as:

  $$PSNR = 10 \log_{10} \left[ \frac{255^2}{\frac{1}{HW} \sum_{i=1}^{H} \sum_{j=1}^{W} \| f_{t+1}(i,j) - f^{comp}(i,j) \|^2} \right] \qquad (1)$$

  where $(H, W)$ is the dimensionality of the frames, and $f_{t+1}$ and $f^{comp}$ are the intensity components of the original target and the motion compensated images, which equal $f^*_{t+1}$ and $f^{*comp}_{t+1}$ respectively. PSNR values generally range between 20 dB and 40 dB; higher values of PSNR indicate better quality of motion estimation.

• If $PSNR_{comp} \ge PSNR$, use $f^{*comp}_{t+1}$ as the stabilized frame for subsequent analysis; otherwise use $f^*_{t+1}$. (A code sketch of this correction and selection step follows the list.)
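A minimal sketch of the motion-correction and frame-selection step, in Python with NumPy, is given below. The helper warp_frame, which applies a motion vector to a frame, is a hypothetical placeholder: the paper does not prescribe a particular warping routine, and the sign choice in the $MV_{comp}$ formula is one of the two options stated above.

import numpy as np

def psnr(reference, compensated):
    # Equation (1): PSNR between the original target frame and a
    # motion-compensated frame, for 8-bit grayscale images.
    diff = reference.astype(np.float64) - compensated.astype(np.float64)
    mse = np.mean(diff ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)

def select_stabilized_frame(frame_next, mv, mv_smoothed, warp_frame):
    # Adjust the original motion vector by its deviation from the smoothed
    # vector; with the minus sign, MV_comp reduces to the smoothed vector.
    mv_comp = mv - (mv - mv_smoothed)
    f_star = warp_frame(frame_next, mv)            # stabilized with MV
    f_star_comp = warp_frame(frame_next, mv_comp)  # stabilized with MV_comp
    # Keep whichever version scores the higher PSNR against the target.
    if psnr(frame_next, f_star_comp) >= psnr(frame_next, f_star):
        return f_star_comp
    return f_star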

2.1 Motion Estimation

A brief description of each algorithm follows.

2.1.1 Block Partitioning Based on Vector Quantization

For the block partitioning phase, we use vector quantization to provide the block matching scheme with the position at which to partition.

• Set the number of codewords, i.e. the size of the codebook, to 4. This assumes that 4 regions should emerge from the image frame during the quantization process.

• Initialize the positions of the codewords to $(\frac{w}{4}, \frac{h}{4})$, $(\frac{w}{4}, \frac{3h}{4})$, $(\frac{3w}{4}, \frac{h}{4})$ and $(\frac{3w}{4}, \frac{3h}{4})$, where $w$ and $h$ are the width and height of the block respectively. By this we assume that the worst-case partition is the quad-tree partition.

• Determine the distance of every pixel from the codewords using a specific criterion: the distance measure is the sum of the differences in gray intensity and in pixel location.
• Assign each pixel to the codeword to which it has the least distance.

• Iterate the process by recomputing each codeword as the average of its group (class). If $m$ is the number of vectors in a class, then

  $$CW = \frac{1}{m} \sum_{j=1}^{m} x_j \qquad (2)$$

• Repeat until either the codewords do not change or the change in the codewords is small.
• Associated with these 4 codewords, there are 4 possible configurations for partitioning the image frame into blocks; the configurations arise if we assume one square block per configuration. It is logical thereafter to take the best configuration as the center of mass of these 4 possible configurations. The center of mass then gives the partition that splits the image frame into blocks. (A sketch of this partitioning step is given below.)
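The following sketch illustrates the partitioning step under stated assumptions: each pixel is described by the feature vector (x, y, gray intensity), and the distance is taken as the L1 sum of the position and intensity differences, since the paper names the terms of the distance but not the exact norm.

import numpy as np

def vq_partition(frame, tol=1e-3, max_iter=50):
    h, w = frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # One feature vector per pixel: (x, y, intensity).
    feats = np.stack([xs.ravel(), ys.ravel(),
                      frame.ravel().astype(np.float64)], axis=1)
    # Four codewords initialised at the quad-tree quadrant centres.
    cw = np.array([[w/4, h/4, 0], [w/4, 3*h/4, 0],
                   [3*w/4, h/4, 0], [3*w/4, 3*h/4, 0]], dtype=np.float64)
    cw[:, 2] = [frame[int(c[1]), int(c[0])] for c in cw]  # seed intensities
    for _ in range(max_iter):
        # Distance = sum of differences in location and gray intensity.
        dist = np.abs(feats[:, None, :] - cw[None, :, :]).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Equation (2): each codeword becomes the mean of its class.
        new_cw = np.array([feats[labels == k].mean(axis=0)
                           if np.any(labels == k) else cw[k] for k in range(4)])
        if np.abs(new_cw - cw).max() < tol:   # codewords stopped moving
            cw = new_cw
            break
        cw = new_cw
    # Centre of mass of the codeword positions gives the split point.
    return int(round(cw[:, 0].mean())), int(round(cw[:, 1].mean()))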

2.1.2 Genetic Algorithm Search

The inputs to the genetic algorithm are the block $b_t$ and its centroid $(x_c, y_c)$.

• Population initialization: A population $P$ of $n$ chromosomes representing $(T_x, T_y, \theta)$ is generated from uniformly distributed random numbers, where $1 \le n \le limit$ and $limit$ (here 100) is the user-defined maximum size of the population.
• To evaluate the fitness $E(n)$ of every chromosome $n$:


  – Extract the pixel locations corresponding to the block from frame $f_t$ using the centroid $(x_c, y_c)$ and the block size information.
  – Affine-transform these pixels using the translation parameters $(T_x, T_y)$ and rotation angle $\theta$:

  $$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & T_x \\ 0 & 1 & T_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

  – If $b_t$ represents the original block under consideration, $b^*_{t+1}$ the block identified in the destination frame after transformation, and $(h, w)$ the dimensions of the block, then the fitness $E$ can be measured as the mean absolute difference (MAD):

  $$MAD = \frac{1}{hw} \sum_{i=1}^{h} \sum_{j=1}^{w} \left| b_t(i,j) - b^*_{t+1}(i,j) \right| \qquad (3)$$

• Optimization: Determine the chromosome with minimum error, $n_{emin}$, i.e. the $n$ for which $E$ is minimum. As this chromosome maps the block to a pixel position, determine all the neighbours $NH_k$ of that pixel, where $1 \le k \le 8$.
  – For every $k$, determine the matching error as in the fitness evaluation.
  – If $E(NH_k) < E(n_{emin})$, then set $n_{emin} = NH_k$.

• Selection: Define selection probabilities to select chromosomes for mutation or cloning. Perform cross-over and mutation operations by swapping random genes and using uniform random values.
• Termination: Three termination criteria are used: zero error, maximum generations and stall generations. If any criterion is satisfied the search stops; otherwise it iterates until termination. (A condensed sketch of the search is given below.)
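A condensed sketch of the genetic block search follows. The population limit of 100 is taken from the text; the generation limit, the search ranges, the mutation scale and the nearest-neighbour block sampling are illustrative choices, and the local neighbour-refinement ("Optimization") step and stall-generation test are omitted for brevity.

import numpy as np

def extract_block(frame, cx, cy, h, w, tx=0.0, ty=0.0, theta=0.0):
    # Sample the block's pixels after rotating by theta about the centroid
    # and translating by (tx, ty), using nearest-neighbour lookup.
    ys, xs = np.mgrid[0:h, 0:w]
    x0, y0 = xs - w / 2.0, ys - h / 2.0
    xr = np.cos(theta) * x0 - np.sin(theta) * y0 + cx + tx
    yr = np.sin(theta) * x0 + np.cos(theta) * y0 + cy + ty
    xr = np.clip(np.rint(xr).astype(int), 0, frame.shape[1] - 1)
    yr = np.clip(np.rint(yr).astype(int), 0, frame.shape[0] - 1)
    return frame[yr, xr].astype(np.float64)

def mad(a, b):
    # Equation (3): mean absolute difference between two blocks.
    return np.abs(a - b).mean()

def ga_block_search(frame_t, frame_t1, cx, cy, h, w,
                    pop_size=100, generations=50, scale=(8.0, 8.0, 0.1)):
    rng = np.random.default_rng()
    scale = np.asarray(scale)
    target = extract_block(frame_t, cx, cy, h, w)
    fitness = lambda c: mad(target, extract_block(frame_t1, cx, cy, h, w, *c))
    # Chromosomes (Tx, Ty, theta), initialised uniformly at random.
    pop = rng.uniform(-1.0, 1.0, (pop_size, 3)) * scale
    best = min(pop, key=fitness)
    for _ in range(generations):
        if fitness(best) == 0.0:                    # zero-error termination
            break
        scores = np.array([fitness(c) for c in pop])
        parents = pop[np.argsort(scores)[:pop_size // 2]]   # selection
        n_child = pop_size - len(parents)
        ia = rng.integers(0, len(parents), n_child)
        ib = rng.integers(0, len(parents), n_child)
        mask = rng.random((n_child, 3)) < 0.5
        children = np.where(mask, parents[ia], parents[ib])  # swap random genes
        children += rng.normal(0.0, 0.05, children.shape) * scale  # mutation
        pop = np.vstack([parents, children])
        cand = pop[np.argmin([fitness(c) for c in pop])]
        if fitness(cand) < fitness(best):
            best = cand
    return best    # estimated (Tx, Ty, theta) for the block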

2.2 Motion Smoothing

The work of (Chen, 2000) suggested the use of a moving average low-pass filter for this process. In this paper, we extend the moving average filter to an exponentially weighted moving average filter.

2.2.1 Exponentially Weighted Moving Average Filter

A detailed pseudo code describing the process is as follows.

• Set the number of frame pairs across which the moving average filter operates to a scalar $J$.
• Compute the parameter $\alpha = 1 / J$.
• Compute the weighting factors for every frame pair between 1 and $J$ as $w_i = \alpha^{i-1}(1 - \alpha)$, where $1 \le i \le J$ (use these weighting factors as a kernel for the convolution process).
• Collect the motion vectors and rotation parameters across all frames into the vectors $MV$ and $\theta$.
• Perform convolution to generate the smoothed motion vectors: $MV^{\aleph} = MV \otimes w$ and $\theta^{\aleph} = \theta \otimes w$ (a sketch follows this list).
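The following short sketch assumes the motion-vector components and rotation angles have been collected into 1-D NumPy arrays indexed by frame pair; normalising the kernel to unit sum is our own choice, made so the smoothing preserves scale, and is not specified in the paper.

import numpy as np

def ewma_smooth(series, J=10):
    # Weighting factors w_i = alpha^(i-1) * (1 - alpha), alpha = 1/J,
    # as defined above, used as a convolution kernel.
    alpha = 1.0 / J
    i = np.arange(1, J + 1)
    w = alpha ** (i - 1) * (1.0 - alpha)
    w = w / w.sum()    # normalisation (our choice, not from the paper)
    return np.convolve(series, w, mode='same')

# Usage: mv_x_smooth = ewma_smooth(mv_x); theta_smooth = ewma_smooth(theta)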

2.2.2 Kalman Filter

A 2D Kalman filter can be used to predict the motion vector of successive frames given the observations, i.e. the motion vectors of the previous frames. An algorithm describing the smoothing process is listed below.

• Initialize the state of the system as $(x, y, dx, dy)$, where $(x, y)$ is the observation (the centroid of the block) and $(dx, dy)$ is the displacement of the centroids. The state can be initialized using the motion estimates between the first successive frame pair.

• The state of the system $S$ at time instant $t + 1$ and the observation $M$ at time $t$ are modeled as

  $$S(t + 1) = A S(t) + Noise(Q) \qquad (4)$$
  $$M(t) = B S(t) + Noise(R) \qquad (5)$$

• Initialize the state transition matrix $A$ and the observation matrix $B$, and model the noises $Q$ and $R$ as Gaussian.

• Perform the predict and update steps of the standard Kalman filter:
  – Initialize the state at time instant $t_0$ using $S_0 = B^{-1} M_0$ and the error covariance $U_0 = \begin{bmatrix} \epsilon & 0 \\ 0 & \epsilon \end{bmatrix}$.
  – Iterate between the predict and update steps.
  – Predict: Estimate the state at time instant $t + 1$ using $S^-_k = A S_{k-1}$, and measure the predicted error covariance as $U^-_k = A U_{k-1} A^T + Q$.
  – Update: Correct the state of the system, $S_k = S^-_k + K(M_k - B S^-_k)$, and the error covariance, $U_k = (I - KB) U^-_k$.
  – Here $K$ is the Kalman gain, computed as $K = U^-_k B^T (B U^-_k B^T + R)^{-1}$.

• Smooth the estimates of the Kalman filter and present the smoothed outcomes as $MV^{\aleph}$ (sketched below).
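The sketch below assumes a constant-velocity state $(x, y, dx, dy)$ observed through its position; the transition matrix A and the noise covariances Q and R shown here are illustrative choices consistent with the description above, not values fixed by the paper.

import numpy as np

def kalman_smooth(observations, q=1e-2, r=1.0, eps=1.0):
    # State: (x, y, dx, dy); observation: (x, y), the block centroid.
    A = np.array([[1., 0., 1., 0.],     # constant-velocity transition
                  [0., 1., 0., 1.],
                  [0., 0., 1., 0.],
                  [0., 0., 0., 1.]])
    B = np.array([[1., 0., 0., 0.],     # observation matrix: position only
                  [0., 1., 0., 0.]])
    Q, R = q * np.eye(4), r * np.eye(2)
    # Initialise from the first observation with zero displacement.
    S = np.array([observations[0][0], observations[0][1], 0.0, 0.0])
    U = eps * np.eye(4)                 # U_0 = diag(eps, eps, ...)
    smoothed = [S[:2].copy()]
    for m in observations[1:]:
        S, U = A @ S, A @ U @ A.T + Q                         # predict
        K = U @ B.T @ np.linalg.inv(B @ U @ B.T + R)          # Kalman gain
        S = S + K @ (np.asarray(m, dtype=np.float64) - B @ S) # update
        U = (np.eye(4) - K @ B) @ U
        smoothed.append(S[:2].copy())
    return np.array(smoothed)           # smoothed track, used as MV^aleph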

3 RESULTS AND DISCUSSION

In this section, we present some sample results of the stabilization task on wildlife videos taken at a zoological park. The performance of a video stabilization scheme is best evaluated visually. We


provide some sample frames illustrating the quality of video stabilization. Figure 1 compares the video stabilization quality of the baseline model and the proposed model. As can be clearly seen, the stabilized output of the proposed model is of markedly higher quality than that of the baseline model. Here the motion correction scheme using the Kalman filter was sufficient to smooth the motion vectors correctly, because the camera motion in this capture was linear. Similarly, in Figure 2 we compare the quality of video stabilization using another sample clip from the same wildlife video. The movement of the camera in this sequence was more abrupt and random in direction. We observed that the proposed model using the Kalman filter could not handle these changes well enough to generate a good quality stabilized output; however, the motion correction mechanism using the exponentially weighted moving average filter produced much better results.

Figure 1: Model performances on video sample clip 3 (baseline and proposed models; unstabilized and stabilized frames).

4 CONCLUSION

In this paper, we have presented a novel motion correction mechanism and a block based motion estimation strategy, combining vector quantization based block partitioning with a genetic algorithm based block search, applied to video stabilization. The model was tested on several real-time datasets, and the results reveal a considerable performance improvement over an existing video stabilization model based on motion estimation and filtering.

Figure 2: Model performance on video sample clip 6 (baseline and proposed models; unstabilized and stabilized frames).

REFERENCES

Chen, T. (2000). Video stabilization algorithm using a block based parametric motion model. Master's thesis.

Hansen, M., Anandan, P., and Dana, K. (1994). Real-time scene stabilization and mosaic construction. In Image Understanding Workshop Proceedings.

Litvin, A., Konrad, J., and Karl, W. C. (2003). Probabilistic video stabilization using Kalman filtering and mosaicking. In Image and Video Communications and Processing.

Matsushita, Y., Ofek, E., Tang, X., and Shum, H.-Y. (2005). Full-frame video stabilization. In IEEE International Conference on Computer Vision and Pattern Recognition.

Pochec, P. (1995). Moire based stereo matching technique. In Proceedings of ICIP, pages 370-373.

Ratakonda, K. (1998). Real-time digital video stabilization for multi-media applications. In Proceedings of the 1998 IEEE International Symposium on Circuits and Systems, pages 69-72.

Tucker, J. and Lazaro, A. S. (1993). Image stabilization for a camera on a moving platform. In IEEE Conference on Communications, Computers and Signal Processing, pages 734-737.

Uomori, K., Morimura, A., Ishii, H., Sakaguchi, T., and Kitamura, Y. (1990). Automatic image stabilizing system by full-digital signal processing. IEEE Transactions on Consumer Electronics, pages 510-519.

Yao, Y. S., Burlina, P., and Chellappa, R. (1995). Electronic image stabilization using multiple image cues. In Proceedings of ICIP, pages 191-194.
