MODELING OF REAL TIME VIDEO COMPRESSION SYSTEM

Three-dimensional Discrete Cosine Transform

Tomas Fryza

Department of Radio Electronics, Brno University of Technology, Purkynova 118, Brno, Czech Republic

Keywords:

Video signal compression, 3-D DCT, complexity, implementation, DSP TMS320C6711, C language, linear assembler.

Abstract:

One of the methods used for video signal compression is the Three-Dimensional Discrete Cosine Transform (3-D DCT). This block-based method combines intraframe and interframe coding into a single transform coding, so neither motion compensation nor motion prediction has to be implemented. The paper deals with practical ways of computing the 3-D DCT and shows that this transform coding can be used to encode video sequences in real time.

1 INTRODUCTION

A video compression method has two purposes: a) to reduce spatial redundancy between adjacent picture elements by intraframe coding, and b) to reduce temporal redundancy between adjacent frames by interframe coding. Individual compression algorithms use different mechanisms to achieve these goals. The 3-D DCT (Three-Dimensional Discrete Cosine Transform) consolidates both kinds of coding into a single transform coding in which several frames are encoded simultaneously. The principle of the 3-D DCT is similar to the JPEG standard (Wallace, 1992). Each group of N frames is divided into small segments (so-called video cubes). From these video cubes, the frequency coefficients and the output binary stream are formed.

This contribution presents the fundamental properties of the 3-D DCT and, mainly, the conditions for using such a method in real-time processing. The paper is divided into three parts. Section 2 presents the mathematical background of different 3-D DCT variants. Section 3 discusses the minimal hardware demands for real-time operation, and the final results are given in Section 4.

2 TRANSFORM CODING

As mentioned above, in the 3-D DCT compression scheme the succession of input frames is divided into groups of N frames. The value of N controls not only the amount of memory allocated for captured samples and the number of repetitions per second, but also the mathematical complexity of the encoder itself.

The entire encoding scheme can be split into several phases. It starts with a preprocessing phase, in which the input samples can be resampled or converted into a different color space. The next part of the encoder is the transform coding itself, followed by a postprocessing phase with quantization and thresholding. The last part consists of entropy encoding and forming of the output bit stream.

An analysis of the share of complexity of the individual encoder parts shows that the 3-D DCT transform coding is the most demanding process of the entire encoding stage. Thus, in the following text only the transform coding is analyzed in order to estimate the possibilities of real-time processing.

For video cube dimensions of N × N × N, the forward 3-D DCT for a grey-scale video sequence is defined in the following way (Rao and Yip, 1990)

$$D_{u,v,w} = \gamma_u \gamma_v \gamma_w \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} \sum_{z=0}^{N-1} f_{x,y,z} \cos\frac{\pi u(2x+1)}{2N} \cos\frac{\pi v(2y+1)}{2N} \cos\frac{\pi w(2z+1)}{2N} \tag{1}$$

where $D_{u,v,w}$ represents the DCT (frequency) coefficient of a picture element intensity $f_{x,y,z}$, while u, v, w = 0, 1, ..., N − 1. The constants γ can be expressed as follows

$$\gamma_{u,v,w} = \begin{cases} 1/\sqrt{2} & : u,v,w = 0 \\ 1 & : u,v,w \neq 0. \end{cases}$$


Fryza T. (2008). MODELING OF REAL TIME VIDEO COMPRESSION SYSTEM - Three-dimensional Discrete Cosine Transform. In Proceedings of the International Conference on Signal Processing and Multimedia Applications, pages 208-211. DOI: 10.5220/0001938102080211. Copyright © SciTePress

The 3-D DCT definition in (1) implies a high number of arithmetical operations. Therefore, the 3-D transform can be rewritten as three successive 1-D transforms, which noticeably reduces the number of mathematical operations. In the following text, two fast algorithms for evaluating the 1-D transform with different values of N are outlined.
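The equivalence between the triple sum in (1) and three successive 1-D transforms can be illustrated with a small sketch. The Python below is not the paper's implementation (which targets a DSP in C and linear assembler); the function names and the nested-list layout `cube[x][y][z]` are assumptions made for illustration only.

```python
import math

def gamma(k):
    """Scaling constant gamma from the definition of the 3-D DCT."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

def dct_1d(samples):
    """Plain (non-fast) N-point 1-D DCT including the gamma factor."""
    n = len(samples)
    return [
        gamma(u) * sum(f * math.cos(math.pi * u * (2 * x + 1) / (2 * n))
                       for x, f in enumerate(samples))
        for u in range(n)
    ]

def dct_3d_direct(cube):
    """3-D DCT evaluated literally as the triple sum of the definition."""
    n = len(cube)
    r = range(n)
    c = [[math.cos(math.pi * u * (2 * x + 1) / (2 * n)) for x in r] for u in r]
    out = [[[0.0] * n for _ in r] for _ in r]
    for u in r:
        for v in r:
            for w in r:
                acc = sum(cube[x][y][z] * c[u][x] * c[v][y] * c[w][z]
                          for x in r for y in r for z in r)
                out[u][v][w] = gamma(u) * gamma(v) * gamma(w) * acc
    return out

def dct_3d_separable(cube):
    """Same result from three passes of 1-D transforms (z, then y, then x)."""
    n = len(cube)
    r = range(n)
    a = [[dct_1d(cube[x][y]) for y in r] for x in r]               # a[x][y][w]
    b = [[dct_1d([a[x][y][w] for y in r]) for w in r] for x in r]  # b[x][w][v]
    c = [[dct_1d([b[x][w][v] for x in r]) for w in r] for v in r]  # c[v][w][u]
    return [[[c[v][w][u] for w in r] for v in r] for u in r]       # out[u][v][w]
```

Each pass performs N² one-dimensional transforms, so a cube needs 3N² of them in total; for N = 8 this gives the 192 transforms per cube quoted in Subsection 2.1.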

2.1 8-point Discrete Cosine Transform

Let N = 8; then the one-dimensional discrete cosine transform for a single color component is defined by (2), where u = 0, 1, ..., 7.

$$D_u = \gamma_u \sum_{x=0}^{7} f_x \cos\frac{\pi u(2x+1)}{16} \tag{2}$$

To reduce the number of mathematical operations, the multiplication by the constant $\gamma_u$ can be merged into the quantizer block, i.e. into the last part of the video encoder (the postprocessing part). In practice, the 1-D DCT is evaluated with the help of the Discrete Fourier Transform through the set of modified equations (Gonzalez and Wintz, 1987):

$$\begin{aligned}
\Re\{F_0\} &= 2 \cdot (f_0 + f_1 + f_2 + f_3 + f_4 + f_5 + f_6 + f_7) \\
\Re\{F_1\} &= f_0 - f_7 + C_0 \cdot I_0 + C_1 \cdot I_1 - C_2 \cdot (I_1 - I_2) \\
\Re\{F_2\} &= f_0 - f_3 - f_4 + f_7 + C_0 \cdot I_3 \\
\Re\{F_3\} &= f_0 - f_7 - C_0 \cdot I_0 - C_3 \cdot I_2 - C_2 \cdot (I_1 - I_2) \\
\Re\{F_4\} &= f_0 - f_1 - f_2 + f_3 + f_4 - f_5 - f_6 + f_7 \\
\Re\{F_5\} &= f_0 - f_7 - C_0 \cdot I_0 + C_3 \cdot I_2 + C_2 \cdot (I_1 - I_2) \\
\Re\{F_6\} &= f_0 - f_3 - f_4 + f_7 - C_0 \cdot I_3 \\
\Re\{F_7\} &= f_0 - f_7 + C_0 \cdot I_0 - C_1 \cdot I_1 + C_2 \cdot (I_1 - I_2)
\end{aligned} \tag{3}$$

where $\Re\{F_u\}$ represents the real part of the Fourier coefficients and the constants $C_i$ are defined as follows

$$\begin{aligned}
C_0 &= \cos(4\pi/16) \\
C_1 &= \cos(2\pi/16) + \cos(6\pi/16) \\
C_2 &= \cos(2\pi/16) \\
C_3 &= \cos(2\pi/16) - \cos(6\pi/16)
\end{aligned} \tag{4}$$

and the combinations of input samples $I_i$ are given by

$$\begin{aligned}
I_0 &= f_1 + f_2 - f_5 - f_6 \\
I_1 &= f_2 + f_3 - f_4 - f_5 \\
I_2 &= f_0 + f_1 - f_6 - f_7 \\
I_3 &= f_0 + f_1 - f_2 - f_3 - f_4 - f_5 + f_6 + f_7.
\end{aligned} \tag{5}$$
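Equations (3)-(5) can be checked numerically. The Python sketch below is only an illustration with hypothetical function names; the paper's own C and linear assembler sources are not shown. As a reference it assumes the 8-point counterpart of relation (8), i.e. $\Re\{F_u\} = 2\cos(\pi u/16)\sum_x f_x \cos(\pi u(2x+1)/16)$ for the input mirrored to 16 samples.

```python
import math

# Constants C0..C3 as defined in (4).
C0 = math.cos(4 * math.pi / 16)
C1 = math.cos(2 * math.pi / 16) + math.cos(6 * math.pi / 16)
C2 = math.cos(2 * math.pi / 16)
C3 = math.cos(2 * math.pi / 16) - math.cos(6 * math.pi / 16)

def fast_real_dft8(f):
    """Real parts of F_0..F_7 following (3)-(5), using five multiplications."""
    f0, f1, f2, f3, f4, f5, f6, f7 = f
    # Input combinations from (5).
    i0 = f1 + f2 - f5 - f6
    i1 = f2 + f3 - f4 - f5
    i2 = f0 + f1 - f6 - f7
    # I3 ends with +f7, which the mirrored-input DFT requires for F_2 and F_6.
    i3 = f0 + f1 - f2 - f3 - f4 - f5 + f6 + f7
    # The five products shared by the odd and even outputs.
    p0 = C0 * i0
    p1 = C1 * i1
    p2 = C2 * (i1 - i2)
    p3 = C3 * i2
    p4 = C0 * i3
    odd = f0 - f7
    even = f0 - f3 - f4 + f7
    return [
        2 * (f0 + f1 + f2 + f3 + f4 + f5 + f6 + f7),
        odd + p0 + p1 - p2,
        even + p4,
        odd - p0 - p3 - p2,
        f0 - f1 - f2 + f3 + f4 - f5 - f6 + f7,
        odd - p0 + p3 + p2,
        even - p4,
        odd + p0 - p1 + p2,
    ]

def reference_real_dft8(f):
    """Reference values: 2*cos(pi*u/16) times the unscaled 8-point DCT sum."""
    return [
        2 * math.cos(math.pi * u / 16)
        * sum(fx * math.cos(math.pi * u * (2 * x + 1) / 16) for x, fx in enumerate(f))
        for u in range(8)
    ]
```

Dividing each output by $2\cos(\pi u/16)$ and applying $\gamma_u$ would recover $D_u$ from (2); in the scheme described in the text both factors are left to the quantizer.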

According to (3)-(5), 5 multiplications and 29 additions have to be performed in order to evaluate eight 1-D coefficients. In every video cube, this 1-D transform has to be repeated 192 times to obtain 512 frequency coefficients. Consider a test grayscale video sequence with dimensions of 720 × 576 picture elements and a length of 24 frames. The minimal number of arithmetical operations for encoding such a sequence is then 18,662,400 multiplications and 108,241,920 additions (Fryza and Hanus, 2003).

2.2 4-point Discrete Cosine Transform

According to (1), the one-dimensional 4-point forward discrete cosine transform can be expressed as follows

$$D_u = \sum_{x=0}^{3} f_x \cos\frac{\pi u(2x+1)}{8} \tag{6}$$

where u = 0, 1, ..., 3. Applying procedures similar to those in (Gonzalez and Wintz, 1987), a set of modified equations for fast calculation of the 4-point 1-D DCT can be derived. Using the extension of the 4 input samples in the sense of $f_x = f_{7-x}$ and the Discrete Fourier Transform defined by (7), the 1-D DCT can be expressed by (8).

$$F_u = \sum_{x=0}^{7} f_x \exp(-2\pi j u x/8) \tag{7}$$

$$\sum_{x=0}^{3} f_x \cos\frac{\pi u(2x+1)}{8} = \frac{\Re\{F_u\}}{2 \cos\left(\frac{\pi u}{8}\right)} \tag{8}$$
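The step from (7) to (8) is not spelled out in the text. A short derivation, assuming only the mirror extension $f_x = f_{7-x}$ and real-valued samples, could run as follows:

```latex
\begin{aligned}
F_u &= \sum_{x=0}^{3} f_x \left[ e^{-2\pi j u x/8} + e^{-2\pi j u (7-x)/8} \right] \\
    &= e^{j\pi u/8} \sum_{x=0}^{3} f_x \left[ e^{-j\pi u(2x+1)/8} + e^{j\pi u(2x+1)/8} \right] \\
    &= 2\, e^{j\pi u/8} \sum_{x=0}^{3} f_x \cos\frac{\pi u(2x+1)}{8}, \\
\Re\{F_u\} &= 2 \cos\left(\frac{\pi u}{8}\right) \sum_{x=0}^{3} f_x \cos\frac{\pi u(2x+1)}{8}.
\end{aligned}
```

The second line uses $e^{-2\pi j u \cdot 8/8} = 1$, and the last line holds because the remaining sum is real. Since $\cos(\pi u/8) \neq 0$ for u = 0, 1, ..., 3, the division in (8) is always defined.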

The left-hand side of equation (8) is equal to the 1-D DCT. Therefore, the 1-D transform can be evaluated as the real parts of the Fourier coefficients divided by the real constant $2\cos(\pi u/8)$. Like the γ values, this constant can also be incorporated into the quantizer block of the encoder. Hence, the only task is to evaluate the real parts of $F_u$. This can be done with the set of equations defined below

$$\begin{aligned}
\Re\{F_0\} &= 2 \cdot (f_0 + f_1 + f_2 + f_3) \\
\Re\{F_1\} &= f_0 - f_3 + \cos\frac{2\pi}{8} \cdot (f_1 - f_3 - f_2 + f_0) \\
\Re\{F_2\} &= f_0 - f_2 + f_3 - f_1 \\
\Re\{F_3\} &= f_0 - f_3 - \cos\frac{2\pi}{8} \cdot (f_1 - f_3 - f_2 + f_0)
\end{aligned} \tag{9}$$
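As a numerical check of (8) and (9), the following Python sketch evaluates the four real parts with a single multiplication and compares the rescaled result with the direct sum in (6). The function names are hypothetical and the code is an illustration, not the DSP implementation discussed in the paper.

```python
import math

COS_2PI_8 = math.cos(2 * math.pi / 8)

def fast_real_dft4(f):
    """Real parts of F_0..F_3 following (9); one multiplication is needed."""
    f0, f1, f2, f3 = f
    diff = f0 - f3
    prod = COS_2PI_8 * (f1 - f3 - f2 + f0)
    return [
        2 * (f0 + f1 + f2 + f3),
        diff + prod,
        f0 - f2 + f3 - f1,
        diff - prod,
    ]

def dct4_via_dft(f):
    """4-point DCT obtained from (8): Re{F_u} divided by 2*cos(pi*u/8)."""
    re = fast_real_dft4(f)
    return [re[u] / (2 * math.cos(math.pi * u / 8)) for u in range(4)]

def dct4_direct(f):
    """4-point DCT evaluated directly from (6)."""
    return [
        sum(fx * math.cos(math.pi * u * (2 * x + 1) / 8) for x, fx in enumerate(f))
        for u in range(4)
    ]
```

In an encoder organized as the text describes, `dct4_via_dft` would not perform the division itself; it is included here only so that the result can be compared with `dct4_direct`.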

It can be seen that the number of arithmetical operations necessary for the 4-point 1-D DCT is 1 multiplication and 9 additions. Although smaller blocks of input samples are encoded at a time, and more of them are therefore needed, the total number of operations for transforming the test video sequence from Subsection 2.1 (720 × 576 × 24) is 7,464,960 multiplications and 67,184,640 additions, i.e. lower than for the 8-point version.
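The operation totals quoted for both block sizes follow from simple counting. The Python sketch below reproduces them under the assumption that the sequence dimensions are exact multiples of N and that every cube needs 3N² one-dimensional transforms; the function name is hypothetical.

```python
def operation_counts(width, height, frames, n, products_per_dct, sums_per_dct):
    """Total products and sums for transforming a sequence with N-point cubes."""
    # Number of N x N x N video cubes in the sequence.
    cubes = (width // n) * (height // n) * (frames // n)
    # N*N one-dimensional transforms along each of the three dimensions.
    transforms_per_cube = 3 * n * n
    total_transforms = cubes * transforms_per_cube
    return total_transforms * products_per_dct, total_transforms * sums_per_dct

# 8-point version: 5 products and 29 sums per 1-D transform.
print(operation_counts(720, 576, 24, 8, 5, 29))  # (18662400, 108241920)
# 4-point version: 1 product and 9 sums per 1-D transform.
print(operation_counts(720, 576, 24, 4, 1, 9))   # (7464960, 67184640)
```

The 4-point version needs eight times as many cubes, but each cube needs a quarter of the transforms and each transform needs far fewer operations, which is why the totals come out lower.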


3 COMPLEXITY VERIFICATION

In this section, the models of the evaluated transform coding are examined in order to verify the possibilities of real-time processing. Following the simplification applied in Section 2, the only criterion for examining real-time processing is the calculation of the 3-D DCT.

The algorithms described in Subsections 2.1 and 2.2 were programmed for the TMS320C6711 digital signal processor from Texas Instruments, both in the C language and in so-called linear assembler. Linear assembler is an intermediate stage between the high-level C language and low-level assembler code. The floating-point processor TMS320C6711 contains eight functional units (such as a hardware multiplier unit or a unit for memory access) and 32 32-bit registers, and it is driven by a clock signal with a relatively low frequency of 150 MHz. The basic measure for evaluating the speed of an algorithm is the total number of CPU cycles. Every DSP instruction requires a specific number of cycles, which depends on the type of instruction. In general, the most expensive instructions are memory accesses and multiplications in double or even extended floating-point precision.

The total number of CPU cycles needed for the 8-point and 4-point 1-D DCT is shown in Table 1 and Table 2, respectively. The estimated times for encoding the grayscale video sequence with dimensions of 720 × 576 × 24 are shown as well. The parameters -o0, -o1, -o2 and -o3 correspond to the level of source code optimization. Parameter -o0 enables register-level optimization, -o1 and -o2 start function-level optimization, and parameter -o3 corresponds to file-level optimization. The optimization is performed by the Code Composer Studio development software from Texas Instruments.

An example of using the 4-point version of the DCT encoder to process a real video sequence is shown in Fig. 1, where one frame from the original sequence and three details with different quality levels are presented.

Table 1: CPU cycles for the 8-point 1-D DCT and duration of transforming a grayscale video sequence (720 × 576 × 24, $f_{CPU}$ = 150 MHz).

| Param. | C code: Cycles | C code: Time [s] | ASM code: Cycles | ASM code: Time [s] |
|---|---|---|---|---|
| no opt. | 57,655 | 22.42 | 10,284 | 4.00 |
| -o0 | 52,759 | 20.51 | 10,284 | 4.00 |
| -o1 | 26,226 | 10.20 | 5,206 | 2.02 |
| -o2 | 15,527 | 6.04 | 2,144 | 0.83 |
| -o3 | 15,527 | 6.04 | 2,144 | 0.83 |

Table 2: CPU cycles for the 4-point 1-D DCT and duration of transforming a grayscale video sequence (720 × 576 × 24, $f_{CPU}$ = 150 MHz).

| Param. | C code: Cycles | C code: Time [s] | ASM code: Cycles | ASM code: Time [s] |
|---|---|---|---|---|
| no opt. | 5,470 | 17.01 | 1,054 | 3.28 |
| -o0 | 4,615 | 14.35 | 1,054 | 3.28 |
| -o1 | 2,975 | 9.25 | 702 | 2.18 |
| -o2 | 1,583 | 4.92 | 417 | 1.30 |
| -o3 | 1,583 | 4.92 | 417 | 1.30 |

Figure 1: A frame of the tested sequence "high jump" encoded by the 4-point encoder version with different quality levels.

It can be seen that the combination of lower-level programming languages and optimizing tools is unavoidable for achieving efficient code. It can also be seen that the only possibility for encoding a grayscale sequence with PAL resolution on the DSP TMS320C6711 (clocked at $f_{CPU}$ = 150 MHz) in real time is to use the 8-point fast algorithm version with the maximal level of optimization.
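The relation between the cycle counts and the times in Tables 1 and 2 can be made explicit. The tables do not state which unit of work a cycle count covers; the listed times are reproduced if each count is read as the cost of one pass of 1-D transforms along one dimension of one video cube. That reading, the 25 frames per second PAL rate used for the real-time budget, and the function name are assumptions of this Python sketch rather than statements from the text.

```python
F_CPU_HZ = 150e6  # clock frequency of the TMS320C6711 given in the text

def transform_time(cycles_per_pass, width, height, frames, n, f_cpu_hz=F_CPU_HZ):
    """Estimated time in seconds to transform the whole sequence.

    Assumes cycles_per_pass covers one pass of 1-D transforms along one
    dimension of one N x N x N cube, so every cube costs three passes.
    """
    cubes = (width // n) * (height // n) * (frames // n)
    return cycles_per_pass * 3 * cubes / f_cpu_hz

# Best linear assembler results (-o2 / -o3) from Tables 1 and 2.
t8 = transform_time(2144, 720, 576, 24, 8)
t4 = transform_time(417, 720, 576, 24, 4)

# Assumed real-time budget: 24 frames at the PAL rate of 25 frames per second.
budget = 24 / 25

print(round(t8, 2), round(t4, 2), budget)  # 0.83 1.3 0.96
```

Under these assumptions only the optimized 8-point version stays below the 0.96 s budget, which agrees with the conclusion drawn from the tables.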

4 CONCLUSIONS

This contribution focused on the video compression domain, and mainly on the modeling of a real-time 3-D DCT encoding system. The 3-D system replaces two separate ways of video compression, i.e. intraframe and interframe coding. Two versions of fast 1-D algorithms were outlined in the paper, for the 8-point and the 4-point DCT. It was shown that the total number of mathematical operations and the memory demands decrease as the selected number of points N decreases. Better picture quality of encoded video sequences can also be reached with the low-point DCT version (Fryza and Hanus, 2004). The practical verification of the 3-D DCT calculation on the floating-point digital signal processor TMS320C6711 was also described. The criterion was the total number of CPU cycles. Only thanks to the high capability of the Texas Instruments optimizing tools can the 3-D DCT based on the 8-point 1-D DCT be used for real-time processing.

ACKNOWLEDGEMENTS

The author would like to acknowledge the financial support of the Czech Ministry of Education, Youth and Sports under grants no. MSM 002 163 0513, 1880250/2008 and of the Czech grant agency of Science Academy under grant no. KJB 208 130 704.

REFERENCES

Fryza, T. and Hanus, S. (2003). Algorithms for fast computing of the 3D DCT transform. Radioengineering, 12(1):23–26.

Fryza, T. and Hanus, S. (2004). Dissimilarity detection in video sequences using variation in time domain. In Radioelektronika 2004 Conference Proceedings. FEI STUBA.

Gonzalez, R. and Wintz, P. (1987). Digital Image Processing. Addison Wesley Publishing Company, Boston, 2nd edition.

Rao, K. and Yip, P. (1990). Discrete Cosine Transform. Algorithms, Advantages, Applications. Academic Press, Inc., San Diego, 1st edition.

Wallace, G. (1992). The JPEG still picture compression standard. IEEE Transactions on Consumer Electronics, 38(1):xviii–xxxiv.
