I. Abdeljaouad, T. Rachidi and H. Bouzekri
Al Akhawayn University in Ifrane, Morocco
Keywords: Cross layer Optimization, HSDPA, MPEG4, HARQ.
Abstract: A novel cross layer optimization technique for efficient streaming of MPEG4 video over a High Speed
Downlink Packet Access (HSDPA) network is proposed in this paper. The proposed technique uses the
types of frames produced by the MPEG encoder to optimize the performance of the Hybrid Automatic
Repeat reQuest (H-ARQ) protocol at the MAC layer. Our aim is to reduce the total power at the NodeB, and
to increase the overall system throughput, while maintaining satisfactory user-perceived Quality of Service
(QoS). The proposed technique consists of applying ARQ retransmission to MPEG4 I-frames (the most
critical frames of an MPEG4 stream) upon the reception of a negative acknowledgment (NACK) message
from the receiver (UE). Packet combining is then performed with the aid of the available I-frames at the
receiver side. Different packet combining strategies have been investigated to assess the performance of the
proposed cross-layer technique. We show that compared to the blind HARQ Chase Combining scheme
applied indiscriminately to all MPEG4 frames, our scheme allows for saving up to 11% of the power at the
NodeB, and up to 10% of the system bandwidth, while ensuring satisfactory video quality to users.
HSDPA (also known as 3.5G and as WCDMA
Release 5) is a new release of UMTS networks. Its
new downlink transport channel HS-DSCH (High
Speed Downlink Shared Channel) provides greater
capacity (up to several Mbps) and increases
wireless system performance by supporting new
features. Among these features are fast link
adaptation, Adaptive Modulation and Coding (AMC
- changing modulation and coding format according
to channel conditions), fast scheduling, and Hybrid
Automatic Repeat request (HARQ). The new
transport channel (HS-DSCH) uses available radio
frequency resources efficiently by sharing multiple
access codes, transmission power, and
infrastructure hardware among users.
AMC makes it possible to match the
modulation and coding scheme to the channel conditions
of each user. The power of the transmitted signal
is kept constant over a frame interval while the
modulation and coding format changes to match the
current received signal quality.
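As an illustration of the link adaptation principle, the selection of a modulation and coding scheme from the reported channel quality can be sketched as follows. This is a minimal sketch only: the table `MCS_TABLE`, its SNR thresholds and the function names are hypothetical and are not taken from the HSDPA specification or from our simulations.

```python
# Hypothetical MCS table: (minimum SNR in dB, modulation, code rate).
# The thresholds are illustrative assumptions, not 3GPP values.
MCS_TABLE = [
    (-3.0, "QPSK", 0.25),
    (2.0, "QPSK", 0.50),
    (6.0, "QPSK", 0.75),
    (10.0, "16QAM", 0.50),
    (14.0, "16QAM", 0.75),
]

BITS_PER_SYMBOL = {"QPSK": 2, "16QAM": 4}


def select_mcs(snr_db):
    """Return (modulation, code rate) of the highest entry whose threshold is met.

    Falls back to the most robust entry when the SNR is below every threshold.
    Transmit power is not adjusted; only the format changes.
    """
    chosen = MCS_TABLE[0]
    for entry in MCS_TABLE:  # table is sorted by increasing threshold
        if snr_db >= entry[0]:
            chosen = entry
    return chosen[1], chosen[2]


def info_bits_per_symbol(modulation, code_rate):
    """Information bits carried by one modulation symbol for a given MCS."""
    return BITS_PER_SYMBOL[modulation] * code_rate
```

With these assumed thresholds, a user reporting 7.5 dB would be served with QPSK at rate 0.75, while a user at 20 dB would be served with 16QAM at rate 0.75.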
Contrary to UMTS Rel’99, in HSDPA the
scheduler has been moved from the RNC (Radio
Network Controller) to the NodeB. Scheduling is
done based on information about the channel quality,
terminal capability, QoS class and power/code
availability. It is fast because it is close to the air
interface and a shorter frame length is used. The
most widely used scheduling algorithms
are Round Robin (RR), Maximum Carrier-to-Interference
ratio (MAX-CI) and Proportional
Fairness (PF).
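The three scheduling policies differ only in the metric used to pick the next user to serve. The sketch below illustrates this under simplifying assumptions: each user is a dictionary with a hypothetical instantaneous supportable rate `cqi_rate` and a smoothed past throughput `avg_rate`, and the smoothing constant `window` is an arbitrary choice, not a value from our setup.

```python
def round_robin(users, last_served):
    """RR: serve users cyclically, ignoring channel quality."""
    return (last_served + 1) % len(users)


def max_ci(users):
    """MAX-CI: serve the user with the best instantaneous channel."""
    return max(range(len(users)), key=lambda i: users[i]["cqi_rate"])


def proportional_fair(users):
    """PF: serve the user with the highest instantaneous-to-average rate ratio."""
    return max(
        range(len(users)),
        key=lambda i: users[i]["cqi_rate"] / users[i]["avg_rate"],
    )


def update_average(avg_rate, served_rate, window=100.0):
    """Exponentially smoothed throughput used by PF (window is an assumption)."""
    return (1.0 - 1.0 / window) * avg_rate + served_rate / window


users = [
    {"cqi_rate": 2.0, "avg_rate": 2.0},
    {"cqi_rate": 1.0, "avg_rate": 0.25},
    {"cqi_rate": 3.0, "avg_rate": 3.0},
]
```

For this example, MAX-CI selects the third user (best channel), whereas PF selects the second one, whose current rate is high relative to what it has received so far.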
Moreover, HARQ handles retransmissions
requested by UEs (User Equipments) due to errors in
the radio packets. These requests are processed in
the current WCDMA networks by the RNC while in
HSDPA, they are processed at the NodeB to provide
the fastest response possible. HARQ has two
schemes: Chase Combining (CC) and Incremental
Redundancy (IR). CC keeps the erroneous packet
and requests that the exact same packet be
retransmitted. Upon receiving the retransmission, the
receiver soft-combines the erroneous and
retransmitted packets to increase the probability of
successful decoding. IR, on the other hand,
retransmits the same packet but differently coded.
The receiver selects correctly transmitted bits from
the original transmission and the retransmission to
minimize the need for further repeat requests when
multiple errors occur in transmitted signals.
Abdeljaouad I., Rachidi T. and Bouzekri H. (2008).
In Proceedings of the International Conference on Wireless Information Networks and Systems, pages 74-77
DOI: 10.5220/0002022800740077
CC is simple to implement, while IR is more
powerful but more complex, possibly adding extra delay
in the decoding.
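The soft-combining step of CC can be illustrated with a small example. This is a sketch under stated assumptions: bits are mapped to antipodal symbols (bit 0 to a positive value, bit 1 to a negative value), both copies are assumed to see the same noise level so that plain addition is an appropriate combining rule, and the received soft values are hypothetical numbers chosen for illustration.

```python
def chase_combine(soft_copies):
    """Add the soft values of identical (re)transmissions, bit by bit."""
    return [sum(values) for values in zip(*soft_copies)]


def hard_decision(soft_values):
    """Map soft values to bits: non-negative -> 0, negative -> 1."""
    return [0 if v >= 0 else 1 for v in soft_values]


sent_bits = [0, 1, 1, 0]

# Hypothetical received soft values: sign = bit estimate, magnitude = reliability.
first_tx = [0.9, -0.2, 0.3, 1.1]          # third bit is wrong on its own
retransmission = [0.7, -1.0, -0.8, -0.1]  # fourth bit is wrong on its own

combined = chase_combine([first_tx, retransmission])
decoded = hard_decision(combined)
```

Neither copy decodes correctly in isolation, but the combined soft values do, because the unreliable (small magnitude) errors are outweighed by the reliable values of the other copy.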
Yet, despite the high data rates and the
previously cited improvements offered by HSDPA
over UMTS Rel’99, its shared medium still
represents a challenge for the provisioning of QoS
(guaranteed bandwidth, delay and jitter) for delay
and/or error-sensitive applications such as MPEG4
video streaming. Moreover, although the
radio protocol stack at the NodeB is designed to
operate under worst-case conditions, it remains
generic, and does not factor in specific application
requirements (such as the differentiation in the
transmission/retransmission schemes to be used for
various application data/frames), yielding ineffective
use of spectrum.
To reach optimal decisions, and therefore
an efficient transmission subsystem, the
different layers of the end-to-end protocol stack need
to cooperate and exchange information. Sharing knowledge of data
types among the different protocol layers (which is
the main idea behind Cross Layer Optimization,
CLO) helps achieve higher adaptability to
changing network conditions, although it
violates strict layering design rules.
Recognising the importance of CLO when streaming
MPEG4 video over wireless networks (and best
effort networks in general), many researchers have
looked into how the availability of application layer
information across the layers up until the MAC layer
can help achieve better performance. For instance,
the main idea explored in (Ahmed et al., 2005) is to
add a cognitive layer able to change transport
parameters, bit rates and QoS mechanisms based on
the network conditions. Therein the proposed
architecture takes into consideration the
characteristics of MPEG4 and IP Diffserv to propose
techniques for media content analysis and network
control mechanisms for adaptive video streaming
over IP networks. In (Zheng, 2003) Zheng studies
the effect of the scheduler (MAX and PF) as well as
the error detection/protection techniques (HARQ
fitted mapping and 1% FER based mapping) on QoS
parameters for the case of streaming MPEG4 over
HSDPA. He also compares the performance of UDP
and UDP-Lite for streaming MPEG4 over
UMTS-like systems. In (Chen et al., 2003), three
techniques are presented to tackle the changing
conditions of the wireless medium for multimedia
delivery: swift-OFDM, a low-latency packet-awareness
coder and adaptive noise filtering. Last, (Yufeng et
al., 2002) proposes a set of end-to-end application
layer techniques for adaptive video streaming over
wireless networks: an application layer
packetization scheme, class-based unequal error
protection and a priority-based ARQ scheme.
In wireless networks, video streaming
applications suffer the most from delay and jitter that
are introduced mainly by retransmitting erroneous or
lost packets. As for errors, such applications use
error concealment techniques to compensate for any
erroneous video frame. HSDPA, for its part,
is known for delivering better QoS in terms of delay
and jitter values, as well as for its strong
retransmission strategy, namely HARQ, thanks to
the new added. Still, using HARQ will result in more
delay and jitter but better quality.
In this paper, we propose a new scheme based on
cross layer optimization for streaming MPEG4 video
over HSDPA. Our adaptive scheme uses interaction
between both the link and application layers (the link
layer being the one that knows about the changing
network conditions and the application layer being
the one that knows about the type of video frame) to
select a retransmission strategy based on the type of
video frame being retransmitted (I, P or B). To the
best of our knowledge, there has not been any published
research that combines HARQ retransmission
strategy with the importance of the video frame
being retransmitted over HSDPA.
The remainder of the paper is organized as
follows: section 3 presents the technique and the
underlying assumptions. Section 4 describes the
simulation setup, while the results are presented in
section 5.
In this paper, we make information normally
available only to the application layer (namely the type
of video frame) accessible to the MAC layer so that
the latter can make a retransmission decision based on the
type of the video frame. When the MAC-HS entity
receives an erroneous frame, it checks its type before
requesting a retransmission. If the frame is of type I,
it requests the retransmission. However, if the frame
is of any other type (P or B), it just discards it and
does not request retransmission. I-frames carry the
most information, and P (and B) frames depend on the
previous (and following) I-frames for successful
decoding. I-frames also achieve the lowest
compression ratio, while the other frame types
achieve the highest. Thus, if an I-frame is lost or
erroneously received and we choose to discard it, all
subsequent P frames, and all subsequent and preceding
B frames, that depend on this I-frame will fail to
decode, and the bandwidth and power spent
transmitting them are wasted because
they will be discarded at the receiver anyway. This is
why our scheme favors I-frames over the other types
of frames.
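The retransmission rule described above can be sketched as follows. This is an illustrative sketch, not the code used in our simulations: the names `FrameType`, `should_request_retransmission` and `handle_frames` are hypothetical, and the frame type is assumed to have already been exposed to the MAC-HS entity by the cross-layer interface.

```python
from enum import Enum


class FrameType(Enum):
    I = "I"
    P = "P"
    B = "B"


def should_request_retransmission(frame_type, received_ok):
    """Adaptive rule: request retransmission only for erroneous I-frames."""
    if received_ok:
        return False
    return frame_type is FrameType.I


def handle_frames(frames):
    """Map each (frame_type, received_ok) pair to the action taken.

    Actions: 'deliver' (correct frame), 'retransmit' (erroneous I-frame),
    'discard' (erroneous P- or B-frame, no retransmission requested).
    """
    actions = []
    for frame_type, received_ok in frames:
        if received_ok:
            actions.append("deliver")
        elif should_request_retransmission(frame_type, received_ok):
            actions.append("retransmit")
        else:
            actions.append("discard")
    return actions


example = [
    (FrameType.I, False),
    (FrameType.P, False),
    (FrameType.B, True),
    (FrameType.B, False),
]
```

For the example sequence, only the erroneous I-frame triggers a retransmission; the erroneous P- and B-frames are dropped and left to error concealment at the application.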
Standard video streaming applications use UDP,
RTP and RTCP as transport protocols. RTP runs on
top of UDP, packetizes and provides in-order
delivery of video frames. RTCP, when used,
operates as closed loop control mechanism for
informing the video source of the received video
quality. During the simulations, we assume no
interaction between the video client and server.
Thus, RTCP is not modeled. However, the required
functions, such as packetization, packet
sequence numbering and in-order delivery, are
supported by the different tools in EvalVid (Klaue et
al., 2007). For example, packetization is
implemented by the Video Sender in EvalVid.
Without loss of generality, we use MAX-CI as the
scheduler, because it serves users with good channel
quality, increasing system throughput and providing
better QoS. We also use CC as our HARQ scheme to
minimize delays.
Also, since the primary goal of the simulation is
to investigate the impact of ARQ/HARQ schemes on
the quality of the MPEG4 video, we assume no
packet losses, errors or congestion occurring in
either the Internet or the UMTS core network. This
is a fair assumption when compared to air interface
generated errors. Moreover, we assume that ACKs
and NACKs coming from UEs to the BS do not
undergo any losses or errors. The delay introduced
by the Internet and UMTS core network is kept
constant and low throughout the simulation time.
Each link capacity was chosen so that the radio
channel is the connection bottleneck. Moreover, the
functionality of GGSN and SGSN was abstracted out
and modeled as traditional ns2 nodes since in
general, they are wired nodes and mimic the
behavior of IP routers. Last, we assume no header
compression at the PDCP (Packet Data Convergence
Protocol) layer.
The network consists of one or several MPEG4
streaming servers that stream videos to one or more
user equipments. The packets sent by the servers
flow through the Core Network and the UTRAN to
arrive at the Node B. This latter uses the MAC-HS
entity to send the packets to the intended user
equipment. For each case of number of user
equipments/streaming servers (1, 5 and 10), we
change the ARQ scheme and collect and analyze the
data. We use three schemes: no ARQ, blind HARQ
with CC, and our adaptive scheme. The streaming
server streams a 10-minute MPEG4-encoded video to the
UE. The core network and the UTRAN links have a
high data rate and a very low delay so that this part
of the network does not cause any delay or loss. We
would like to concentrate on the link between the
NodeB and the UEs. The simulated application is an
H.320 videophone (Halsall, 1996) with a 48 MB
play-out buffer at the receiver.
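The set of simulation runs described above (three network loads, three ARQ schemes) can be enumerated as in the sketch below. The identifiers `no_arq`, `harq_cc` and `adaptive` are hypothetical labels introduced here for illustration; they are not names from the simulator.

```python
from itertools import product

USER_COUNTS = (1, 5, 10)                          # number of UEs / streaming servers
ARQ_SCHEMES = ("no_arq", "harq_cc", "adaptive")   # hypothetical labels


def simulation_runs():
    """Return one configuration per (network load, ARQ scheme) combination."""
    return [
        {"users": users, "scheme": scheme}
        for users, scheme in product(USER_COUNTS, ARQ_SCHEMES)
    ]


runs = simulation_runs()
```

This yields nine configurations, each of which is simulated and analyzed separately.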
The simulations were performed in a Rayleigh
fading environment that conforms to the ITU-R
recommendations (Recommendation ITU-R M.1225).
Figure 1 shows the overall frame loss percentage and
the bandwidth delay product for the three techniques
as a function of increasing load/users. One can see
that CC gives the lowest frame loss percentage while
no ARQ gives the highest. Our adaptive scheme
comes in between but is closer in performance to no
ARQ because of the high number of P and B frames
that are discarded. As for the bandwidth delay
product, we clearly see that our adaptive scheme
incurs little deterioration compared to no ARQ
(which is the one that would give the best results
since no retransmissions take place). We also see
that the gap between CC and the two other schemes
gets bigger as we load the network with more users.
This means that our adaptive scheme lowers the
buffering requirements compared to CC, which needs
larger buffers due to delays introduced by
retransmissions, especially under high network load.
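The link between the bandwidth delay product and buffering can be made concrete with a short calculation. The sketch below uses the usual definition (throughput multiplied by delay); the exact way the metric was extracted from the traces is not detailed here, and the function name and the input values in the usage note are hypothetical.

```python
def bandwidth_delay_product_mb(throughput_bps, delay_s):
    """Bandwidth delay product in megabytes (1 MB taken as 10**6 bytes).

    throughput_bps: throughput in bits per second
    delay_s: end-to-end delay in seconds
    """
    return throughput_bps * delay_s / 8 / 1e6
```

For instance, a 2 Mbps stream experiencing a 4 s delay corresponds to 1 MB of data in flight that must be buffered, so any scheme that increases delay through retransmissions increases the buffering requirement proportionally.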
Figure 1: Performance of various HARQ techniques. [Chart: frame loss (%) and bandwidth delay product (MB) versus network load (1, 5 and 10 users) for No ARQ, CC and the Adaptive Scheme.]
WINSYS 2008 - International Conference on Wireless Information Networks and Systems
Figure 2: Performance of various HARQ techniques. [Chart: minimum and maximum cumulative jitter (s) versus network load (1, 5 and 10 users) for No ARQ, CC and the Adaptive Scheme.]
Figure 2 shows the cumulative jitter values for the
three techniques and for different number of users on
the network. It is clear from this graph that our
adaptive scheme performs much better than CC, in
that its maximum cumulative jitter is close to that of the
no ARQ scheme. We also notice that overloading the
network does not increase the gap between our
adaptive scheme and no ARQ as opposed to CC
where the gap grows bigger. As for the minimum
cumulative jitter, we notice that our adaptive scheme
outperforms CC and gives results that are close to no
ARQ. This means less variation and hence better
quality of service.
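For readers unfamiliar with the metric, one common way of computing cumulative jitter from packet timestamps is sketched below. The definition used here (running sum of the difference between receive spacing and send spacing) is an assumption made for illustration; the function name and the timestamps are hypothetical.

```python
def cumulative_jitter(send_times, recv_times):
    """Running sum of inter-packet jitter.

    Inter-packet jitter is taken as the receive spacing minus the send
    spacing of consecutive packets; the running sum is returned per packet.
    """
    total = 0.0
    series = []
    for i in range(1, len(send_times)):
        send_gap = send_times[i] - send_times[i - 1]
        recv_gap = recv_times[i] - recv_times[i - 1]
        total += recv_gap - send_gap
        series.append(total)
    return series


# Hypothetical timestamps in seconds: the third packet is delayed by 0.5 s.
send = [0.0, 1.0, 2.0, 3.0]
recv = [0.5, 1.5, 3.0, 3.5]
series = cumulative_jitter(send, recv)
```

The minimum and maximum of the resulting series correspond to the kind of values plotted in Figure 2; a retransmitted packet arrives late and pushes the maximum up.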
We have also conducted a MOS (Mean Opinion
Score) evaluation to assess the user-perceived quality
of received media under the three schemes. MOS
provides a numerical indication of the perceived
quality of received media and is expressed as a
single number in the range 1 to 5, 1 being the lowest
perceived quality, and 5 being the highest perceived
quality. As shown in Table 1, which summarizes the
MOS values for our simulations, the proposed
adaptive scheme clearly improves the
perceived quality over no ARQ with little additional use of
network resources.
Table 1: MOS analysis (5 independent viewers).
ARQ technique      Network Load
                   1 user   5 users   10 users
No ARQ             3.2      2.5       0.8
Adaptive Scheme    4.2      3.2       1.5
CC                 4.9      4.3       2.6
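The MOS values of Table 1 can be compared scheme by scheme with a few lines of code. The sketch below only re-expresses the table as differences; the dictionary `MOS` and the helper `mos_difference` are illustrative names.

```python
# MOS values copied from Table 1, indexed by scheme and number of users.
MOS = {
    "No ARQ": {1: 3.2, 5: 2.5, 10: 0.8},
    "Adaptive Scheme": {1: 4.2, 5: 3.2, 10: 1.5},
    "CC": {1: 4.9, 5: 4.3, 10: 2.6},
}


def mos_difference(scheme_a, scheme_b):
    """MOS of scheme_a minus MOS of scheme_b for each load, rounded to 0.1."""
    return {
        users: round(MOS[scheme_a][users] - MOS[scheme_b][users], 1)
        for users in (1, 5, 10)
    }


gain_over_no_arq = mos_difference("Adaptive Scheme", "No ARQ")
gap_to_cc = mos_difference("CC", "Adaptive Scheme")
```

The adaptive scheme gains between 0.7 and 1.0 MOS points over no ARQ, and remains between 0.7 and 1.1 points below CC.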
In this paper, cross-layer optimization has been used
to improve the QoS of streaming MPEG4 video over
the HSDPA network. We made information on the
type of video frame (normally known to the
application layer) available to the MAC-HS layer so
that the latter retransmits only erroneous I-frames in
order to minimize delay and jitter. This scheme is
simple to implement since it only requires breaking
the layered architecture and having the MAC-HS layer
access the application payload to get the type of
frame. Overall, the proposed adaptive scheme is able
to provide better QoS and a bandwidth gain at the
expense of a slight degradation in video quality.
Finally, bandwidth gain simply means that the
system is able to support more users; the bandwidth
gain results show that we can gain up to 10% using
our adaptive scheme, meaning that this 10% can be
used by other applications and to support more users.
T. Ahmed et al., “Adaptive Packet Video Streaming Over
IP Networks: A Cross Layer
Approach”, IEEE Journal on Selected Areas in
Communications, vol. 23, no. 2, Feb. 2005.
H. Zheng, “Optimizing Wireless Multimedia Transmission
Through Cross Layer Design”, ICME 2003, 185-187.
J. Chen et al., “Joint Cross Layer Design for Wireless QoS
Video Delivery”, ICME 2003, 197-200.
S. Yufeng et al., “Cross Layer Techniques for Adaptive
Video Streaming Over Wireless Networks”, IEEE
C2002 277-280.
J. Klaue et al., “EvalVid - A Framework for Video
Transmission and Quality Evaluation”, http://,
last accessed: Sept. 10, 2007
F. Halsall, “Data Communications, Computer Networks
and Open Systems”, 4th edition, Addison-Wesley 1996.
“Recommendation ITU-R M.1225: Guidelines for
Evaluation of Radio Transmission Technologies for
IMT-2000”, available via:
3GPP specification, 3GPP TS 25.308, “3rd Generation
Partnership Project; Technical Specification Group
Radio Access Network; High Speed Downlink Packet
Access (HSDPA); Overall description; Stage 2”
(Release 5)
3GPP specification, 3GPP TS 125 213, “Universal Mobile
Telecommunications System (UMTS); Spreading and
modulation (FDD)”