DESIGN AND IMPLEMENTATION OF MULTI-STANDARD
AUDIO DECODER
Kong Ji, Liu Peilin
Department of Electronic Engineering, Shanghai Jiaotong University, Dongchuan Road 800#, Shanghai, China
Deng Ning, Fu Xuan, Zhang Guocheng, He Bin, Liu Qianru
Fujitsu R&D Center Co.,LTD, Shanghai, China
Department of Electronic Engineering, Shanghai Jiaotong University, Shanghai, China
Keywords: Multi-Standard, Algorithm Analysis, IMDCT, Software/Hardware Co-Design, FILTERBANK.
Abstract: This paper presents the design and implementation of a Multi-Standard Audio Decoder. The architecture
of the decoder is designed to support MPEG-2/MPEG-4 AAC LC Profile (ISO/IEC 13818-7 2006)
(ISO/IEC 14496-3 2006), Dolby AC-3 (ATSC 1995), Ogg Vorbis (Xiph.org Foundation 2004), Windows
Media Audio (WMA) (Microsoft 2006) and MPEG-1 Layer 3 (MP3) (ISO-IEC/JTC1 SC29 1991). Based on
an analysis of the algorithms of these standards, a software/hardware co-design method is used to
implement the audio decoder, in which a module called FILTERBANK is designed as a hardware engine.
The FILTERBANK, which supports the IMDCT (Inverse Modified Discrete Cosine Transform) process of
the different standards, is configured by the CPU according to the decoded information. Compared with
DSP/RISC or ASIC multi-standard decoders, our Multi-Standard decoder achieves a
balance between software’s flexibility and hardware’s high efficiency. It also meets the requirements of low
cost, low power and high audio quality. The implementation results on FPGA are given and the
performance of the decoder is evaluated.
1 INTRODUCTION
Nowadays there are various audio compression
standards, such as AAC, MP3, AC-3, Vorbis and
WMA, and new audio standards will be issued in
the future. The question is whether today’s audio
decoders can process so many existing standards
and are ready to handle the newcomers.
Audio decoders are usually implemented on DSP,
microprocessor or ASIC. While a solution based on
DSP or microprocessor possesses certain flexibility,
it entails somewhat higher frequency and higher
power, which can hardly meet the increasing
demand of low power applications.
On the other hand, an ASIC solution is often
characterized by lower frequency and lower power,
but with less flexibility. Once new standards are
issued or an upgrade is needed, the whole system
has to be redesigned.
In order to solve these problems, a
software/hardware co-design method is used in our
Multi-Standard decoder, which reduces power
consumption, area and cost while retaining
re-configurability and excellent performance.
Main features of the proposed Multi-Standard
Audio Decoder are as follows:
- Low frequency for real-time decoding (required clock frequency < 20MHz for AAC decoding)
- Low power consumption
- Compatibility for AAC, MP3, AC-3, Vorbis and WMA
- Expansibility for new standards or program updating
- Pipeline architecture between software/hardware
- Multi-Standard FILTERBANK (straightforward interface for SOC applications as an IP macro)
- An architecture considering the shared arithmetic modules and memories
- Broad applications: personal audio player, set-top box, audio system on motorcar, PDA/portable terminals and Internet multimedia.
Ji K., Peilin L., Ning D., Xuan F., Guocheng Z., Bin H. and Qianru L. (2007).
DESIGN AND IMPLEMENTATION OF MULTI-STANDARD AUDIO DECODER.
In Proceedings of the Second International Conference on Signal Processing and Multimedia Applications, pages 301-304
DOI: 10.5220/0002137803010304
Copyright © SciTePress
This paper first presents an analysis of the
algorithms of various audio standards, and then
introduces the software/hardware co-design method
used to implement multiple audio standards in a
universal architecture. After that, the architecture of
the Multi-Standard FILTERBANK and its operation
flow are presented. Finally, the implementation
results on FPGA, the performance evaluation and
the conclusion are given.
2 ALGORITHM ANALYSIS &
MULTI-STANDARD
IMPLEMENTATION
2.1 Algorithm Analysis of Multiple
Audio Standards
Most of the existing audio standards are based on a
psycho-acoustic model to achieve a high compression
ratio while ensuring high audio quality. The
algorithms of mainstream decoders are mainly
composed of Lossless Decoding, IQ (inverse
quantization) and IT (inverse transform).
Lossless Decoding, usually realized by VLD
(variable length decoding), decompresses the
information that was originally compressed without
any distortion; the most popular VLD algorithm is
Huffman Decoding.
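As a minimal sketch of variable length decoding, the following Python fragment walks a bit string and emits a symbol whenever the accumulated bits match a codeword. The codebook `TOY_CODEBOOK` and the function name `vld_decode` are hypothetical and chosen only for illustration; the real Huffman tables are defined by each audio standard and are much larger.

```python
# Hypothetical toy codebook (NOT taken from any audio standard):
# prefix-free bit strings mapped to decoded symbols.
TOY_CODEBOOK = {"0": 0, "10": 1, "110": -1, "111": 2}


def vld_decode(bits, codebook=TOY_CODEBOOK):
    """Decode a string of '0'/'1' characters with a prefix-free codebook."""
    symbols = []
    current = ""
    for bit in bits:
        current += bit
        # Because the code is prefix-free, the first match is the codeword.
        if current in codebook:
            symbols.append(codebook[current])
            current = ""
    if current:
        raise ValueError("bit stream ended inside a codeword")
    return symbols


decoded = vld_decode("0101100111")  # codewords: 0 | 10 | 110 | 0 | 111
```

The bit-by-bit, data-dependent branching visible here is the kind of control flow that tends to suit a CPU better than a fixed hardware datapath.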
The IQ algorithms in the various audio standards
are mostly nonlinear. For example, in the AAC
standard, the IQ algorithm is realized by a
nonlinear 4/3-power function.
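The 4/3-power law can be sketched in a few lines of Python. This is only an illustration of the nonlinear expansion named above: the function name `inverse_quantize` is an assumption, and the scale-factor gain that a complete AAC decoder also applies is deliberately left out.

```python
import math


def inverse_quantize(q):
    """Expand a quantized value with the 4/3-power law, keeping its sign.

    Illustrative only: a complete decoder would additionally multiply the
    result by a gain derived from the scale factor.
    """
    return math.copysign(abs(q) ** (4.0 / 3.0), q)


expanded = [inverse_quantize(q) for q in (-8, -1, 0, 1, 8)]
```

Because the exponent is not an integer, practical decoders commonly use a lookup table or an approximation for this step instead of a general power routine.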
The IT algorithm transforms the spectral data to
the time domain; in most audio standards it is
realized by the IMDCT.
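For reference, a direct O(N^2) evaluation of the IMDCT can be sketched as follows. This is the textbook definition, not the FFT- or CORDIC-based hardware method of this paper. The function name `imdct_direct` is an assumption, and the scaling factor and the window are omitted because they differ among standards.

```python
import math


def imdct_direct(spectrum):
    """Direct IMDCT: N spectral values in, 2N time samples out.

    Textbook definition, unscaled and unwindowed:
        y[n] = sum_k X[k] * cos(pi/N * (n + 1/2 + N/2) * (k + 1/2))
    """
    n_coef = len(spectrum)            # N
    n0 = n_coef / 2 + 0.5             # phase offset 1/2 + N/2
    out = []
    for n in range(2 * n_coef):
        acc = 0.0
        for k, x in enumerate(spectrum):
            acc += x * math.cos(math.pi / n_coef * (n + n0) * (k + 0.5))
        out.append(acc)
    return out


samples = imdct_direct([1.0, 0.5, -0.25, 2.0])  # 4 coefficients -> 8 samples
```

The output carries time-domain aliasing: its first half is odd-symmetric and its second half is even-symmetric, which is why windowing and overlap-add with the neighbouring block are needed to reconstruct the signal.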
2.2 Complexity Analysis
For complexity analysis on various audio standards
(include AAC, MP3, AC-3, Vorbis and WMA),
some typical standards such as AAC, MP3 and
Vorbis are profiled to find out the key operation
modules.
The AAC LC decoder for the complexity analysis
is mainly composed of IMDCT, Huffman Decoding,
Inverse Quantization and some other modules. The
profile is based on MetaWare IDE for ARC
Version7.4.3, which simulates the ARC600 core.
The result of AAC profile is given in Figure 1,
which shows that the IMDCT module consumes
47.98% of decoding time.
Besides, Vorbis Standard is profiled on Sun Sparc
RISC CPU with a typical Vorbis Decoder referenced
by Vorbis Library. The largest time consumption
part of Vorbis decoder is IMDCT transform with
34% of total time.
Figure 1: Profile of AAC Decoder.
The conclusions from the algorithm and complexity
analysis of multiple audio standards are:
a) The filter-bank module in each standard
consumes the largest portion of decoding
time.
b) Most of the filter-banks are realized by an
IMDCT with a radix-2 (power-of-two) number
of points; MP3 instead uses a 36-point or
12-point IMDCT.
c) Other arithmetic tools are usually nonlinear
or composed of conditional branches, and
differ significantly among the audio
standards.
2.3 Software/Hardware Co-Design
Method for the Implementation of
Multi-Standard Decoder
The filter-bank modules of the various standards are
realized by IMDCT, while the other modules, which
may differ significantly among standards, usually
involve conditional branches or de-multiplexing
processes. These other modules are therefore much
better suited to implementation on the CPU.
The filter-bank modules of the different standards,
on the other hand, are quite similar except for the
number of IMDCT points and the window shape, and
the filter-bank is also the key module in each
standard. A common hardware engine is therefore
much more efficient and convenient. As shown in
Figure 2, the decoding flows of the AAC and MP3
standards are divided into a software part and a
hardware part. In the hardware part, the
Multi-Standard FILTERBANK is configured to support
the different standards (AAC or MP3) by parameter
settings. More details about the structure of the
Multi-Standard FILTERBANK are given in section 3.
In addition, the software on the CPU and the hardware
engine execute as a pipeline. Figure 3 shows the
decoding process of MP3. After the software on the
CPU decodes the first granule of a frame (including
the left and right channels), the CPU starts the
FILTERBANK and continues to decode the second
granule of this frame. After the software finishes the
second granule of this frame, it waits for a
completion signal from the FILTERBANK before
starting a new frame.
[Figure 1 pie chart. Legend: IMDCT, Huffman Decoding, Inverse Quantization, others. Slice values: 47.98%, 37.54%, 1.72%, 12.76%.]
SIGMAP 2007 - International Conference on Signal Processing and Multimedia Applications
Figure 2: Hardware/Software Co-Design for AAC & MP3.
Figure 3: Pipeline of Software/Hardware for MP3.
Figure 4: Multi-Standard FILTERBANK Structure.
3 MULTI-STANDARD
FILTERBANK
3.1 The Architecture of
Multi-Standard FILTERBANK
The architecture of the Multi-Standard FILTERBANK
is shown in Figure 4. In this architecture, two
resource-sharing strategies are used: coarse-granule
sharing and fine-granule sharing. For IMDCTs with a
radix-2 (power-of-two) number of points, an FFT
module (R. Gluth 1991) is shared as a coarse granule
to implement the IMDCT of the different standards.
Fine-granule sharing, which realizes the IMDCT with
CORDIC (Coordinate Rotation Digital Computer)
(Despain 1974), is used for the 36-point or 12-point
IMDCT of the MP3 standard.
The CORDIC module calculates the FFT coefficients
and Sine Window coefficients online, without any
ROM for cosine function tables. It is also free of
multipliers, which are major consumers of
resources.
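How CORDIC obtains sine and cosine values with shift-and-add steps can be sketched as below. This is a floating-point model of the generic rotation-mode algorithm, not the paper's hardware design: the names `ITERATIONS`, `ANGLES`, `GAIN` and `cordic_sin_cos` are assumptions, and the multiplications by `2.0 ** -i` stand in for the right shifts that fixed-point hardware would use.

```python
import math

ITERATIONS = 16  # assumed iteration count; more iterations give more precision

# Elementary rotation angles atan(2^-i): a handful of constants in hardware.
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]

# Each micro-rotation stretches the vector; pre-compute the total gain once.
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))


def cordic_sin_cos(theta):
    """Rotation-mode CORDIC for |theta| <= pi/2; returns (sin, cos)."""
    x, y, z = 1.0 / GAIN, 0.0, theta   # start vector pre-scaled by 1/GAIN
    for i in range(ITERATIONS):
        d = 1.0 if z >= 0 else -1.0    # rotate toward zero residual angle
        shift = 2.0 ** -i              # a right shift by i bits in hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * ANGLES[i]
    return y, x


sin_30, cos_30 = cordic_sin_cos(math.pi / 6)
```

Each iteration needs only additions, subtractions and shifts, which is why a CORDIC unit can generate twiddle factors and sine-window coefficients on the fly instead of reading them from a cosine table.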
3.2 Operation Flow of Multi-Standard
FILTERBANK
The Multi-Standard FILTERBANK works in different
modes to decode bit streams of different standards.
The operation mode depends on decoded information
such as the block length, which can differ among the
standards. Table 1 gives the modes to be configured
by the CPU. Once started by the CPU, the
Multi-Standard FILTERBANK receives the decoding
information and is re-configured to the corresponding
mode.
In an AAC bit stream, the block length within each
frame is constant (1024 or 128), so the FILTERBANK
is configured only at the beginning of decoding each
frame (including both the left and right channels)
and sends the ‘IMDCT_Complete’ signal to the CPU
after the IMDCT process of the whole frame is done.
In AC-3 mode, the FILTERBANK works similarly to
AAC mode. In WMA or Vorbis mode, however, the
current block length may differ from that of the
previous block, so the FILTERBANK must be
configured at the beginning of each block.
Table 1: Operation Modes and Decoded Information of
Multi-Standard FILTERBANK.
Standard   Block Length   Configure FILTERBANK   Window Shape
AAC        1024/128       once/frame             Sine/KBD
MP3        36/12          once/frame             Sine
AC-3       256/128        once/frame             KBD
Vorbis     64-8192        once/block             Sine/KBD
WMA        64-2048        once/block             Sine
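The contents of Table 1 can be captured as a small lookup structure that driver software might consult before starting the FILTERBANK. This is a sketch only: the names `FilterbankMode`, `MODES`, `needs_reconfigure` and `block_length_supported` are hypothetical and do not describe the actual register interface of the hardware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterbankMode:
    block_lengths: tuple   # allowed lengths, or (min, max) if is_range is True
    is_range: bool
    configure: str         # "frame" or "block", as in Table 1
    windows: tuple


# Values copied from Table 1; the structure itself is an assumption.
MODES = {
    "AAC":    FilterbankMode((1024, 128), False, "frame", ("Sine", "KBD")),
    "MP3":    FilterbankMode((36, 12),    False, "frame", ("Sine",)),
    "AC-3":   FilterbankMode((256, 128),  False, "frame", ("KBD",)),
    "Vorbis": FilterbankMode((64, 8192),  True,  "block", ("Sine", "KBD")),
    "WMA":    FilterbankMode((64, 2048),  True,  "block", ("Sine",)),
}


def needs_reconfigure(standard, at_frame_start):
    """Called at the start of every block: must the CPU reconfigure now?"""
    if MODES[standard].configure == "block":
        return True
    return at_frame_start


def block_length_supported(standard, length):
    """Check a block length against the Table 1 entry (range check only)."""
    mode = MODES[standard]
    if mode.is_range:
        low, high = mode.block_lengths
        return low <= length <= high
    return length in mode.block_lengths
```

For example, `needs_reconfigure("AAC", at_frame_start=False)` is `False`, while the same call for `"WMA"` is `True`, mirroring the once/frame and once/block columns of the table.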
4 IMPLEMENTATION RESULTS
The Multi-Standard Audio Decoder is implemented
on an Altera Stratix FPGA to evaluate its performance.
The size of each module, in units of LE (logic
element), is given in Table 2. These data are
the synthesis results of Quartus II Version 5.1, and
the whole audio decoder uses about 7645 LEs.
Table 2: Size of Each Block.
Module name                       Size/LE
CPU & BUS                         2156
Base-2 FILTERBANK Control Units   1669
MP3 FILTERBANK Control Units      1878
Shared ALU                        144
Shared CORDIC                     1362
Shared FFT                        436
To evaluate the performance of our Multi-Standard
Audio Decoder, it is set to the AAC operation mode
and to the MP3 operation mode in turn, and the
minimum frequency for real-time decoding is obtained
for several sequences. Table 3 shows the average
decoding cycles per AAC frame and the frequency
requirement for several sequences, and Table 4
shows the same for MP3 frames. The frequency needed
for real-time decoding is calculated by equation (1).
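\[
\mathit{frequency\_needed} = \frac{1}{N}\,(\mathit{cycles\_per\_frame} \times \mathit{sample\_frequency}),
\qquad
N = \begin{cases} 1024 & \text{for AAC} \\ 1152 & \text{for MP3} \end{cases}
\tag{1}
\]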
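Equation (1) can be checked against individual table rows with a few lines of Python. The function name `frequency_needed_hz` and the dictionary `SAMPLES_PER_FRAME` are assumptions introduced for this sketch; the input values are taken from the "Love" row of Table 3 and the "Test case" row of Table 4.

```python
# N in equation (1): samples per frame for each standard.
SAMPLES_PER_FRAME = {"AAC": 1024, "MP3": 1152}


def frequency_needed_hz(cycles_per_frame, sample_rate_hz, standard):
    """Minimum clock frequency (Hz) for real-time decoding, per equation (1)."""
    return cycles_per_frame * sample_rate_hz / SAMPLES_PER_FRAME[standard]


# "Love" (AAC, 44.1 kHz, 246046 cycles per frame)
love_mhz = frequency_needed_hz(246046, 44100, "AAC") / 1e6
# "Test case" (MP3, 44.1 kHz, 296205 cycles per frame)
test_case_mhz = frequency_needed_hz(296205, 44100, "MP3") / 1e6
```

Rounded to one decimal place, these evaluate to 10.6 MHz and 11.3 MHz, which agree with the values listed for those two sequences. The reasoning behind the formula is that a frame of N samples lasts N / sample_frequency seconds, so all of its decoding cycles must fit into that interval.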
5 CONCLUSIONS
The design and implementation of a Multi-Standard
Audio Decoder is presented in this paper. Based on
an analysis of the algorithms of these audio standards,
a software/hardware co-design method is used to
implement the audio decoder, in which a module
called FILTERBANK is designed as a hardware
engine. The FILTERBANK, which supports the IMDCT
process of the different standards, is configured by
the CPU according to the decoded information.
Compared with pure DSP/RISC or ASIC multi-standard
decoders, our multi-standard decoder achieves
a balance between software’s flexibility
and hardware’s high efficiency. It also meets the
requirements of low cost, low power and high audio
quality. The presented decoder is capable of
real-time decoding for the MP3/AAC/AC-3/Vorbis/WMA
standards, and can easily be extended to new
standards or updated to new program versions.
Table 3: AAC frequency test results.
Test AAC sequences   Bit rate /kbps   Sample rate /kHz   Cycle per_frame   Frequency Needed /MHz
Hot                  96               48                 169513            8.0
Diamond              288              48                 211769            9.3
Love                 128              44.1               246046            10.6
Once you             224              44.1               329723            14.2
Mayday               160              44.1               290996            12.6
Liang Zhu            192              44.1               324050            14.0
Believe              320              44.1               372695            16.9
Table 4: MP3 frequency test results.
Test MP3 sequences   Bit rate /kbps   Sample rate /kHz   Cycle per_frame   Frequency Needed /MHz
Test case            128              44.1               296205            11.3
Air                  192              44.1               310800            11.9
David                192              44.1               335832            12.8
Happy                250              44.1               379191            14.5
Drum                 320              44.1               421035            16.1
ACKNOWLEDGEMENTS
This work is supported by Fujitsu R&D Center
Co.,LTD. and Fujitsu Laboratories LTD.
REFERENCES
ISO/IEC 13818-7, 2006, Generic Coding of Moving
Picture and Associated Audio Information - Part 7:
Advanced Audio Coding.
ISO/IEC 14496-3, 2006, Information technology - Coding
of audio-visual objects - Subpart 4: General Audio
Coding (GA) - AAC, TwinVQ, BSAC.
ATSC, 1995, Digital Audio Compression (AC-3) Standard.
Xiph.org Foundation, 2004, Vorbis I specification.
Microsoft, 2006, http://www.microsoft.com/windows.
ISO-IEC/JTC1 SC29, 1991, CD 11172-3 Coding of
Moving Pictures and Associated Audio for Digital
Storage Media at up to about 1.5 MBIT/s Part 3 Audio
Contents.
R. Gluth, 1991, Regular FFT-related transform kernels for
DCT/DST-based polyphase filter banks. In ICASSP.
Despain, A.M., 1974, Fourier Transform Computations
Using CORDIC Iterations, IEEE Transactions on
Computers, Vol. 23, pp. 993-1001.