
 
in the enhancement layer the ROI is the only useful 
image area. Therefore spatial and quality scalability is 
only achieved for the ROI, which should contain the 
image area of interest for target applications. In the 
following sections, the Rate-Distortion and Complexity 
performance of two methods, compliant with 
H.264/SVC, is evaluated and compared with 
straightforward encoding without ROI. 
2  H.264/SVC ROI WITH SPATIAL SCALABILITY 
The underlying idea to achieve efficient encoding of 
the ROI in the higher resolution layer is to minimise 
the number of bits spent in the background region of 
the higher resolution images. In the base layer there is 
no distinction between ROI and background. One of 
the methods proposed in this work is based on coarse 
quantisation of the background region and finer 
quantisation of the ROI in the high resolution layer. In 
this method, the macroblocks (MBs) of the background 
region, i.e., outside the ROI, are encoded with the 
maximum quantisation scale allowed by H.264/SVC 
(Qp=51) in order to maximise the number of null 
coefficients. The other method sets the transform 
coefficients of the MBs outside the ROI to zero, 
regardless of their value. Note that in this case 
quantisation is skipped for these MBs. In both 
methods, the ROI is defined by a mask that provides a 
ROI map (ROImap), which the encoder uses to identify 
the ROI MBs; the map itself is not encoded into the 
video stream. 
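The derivation of a per-MB ROI map from a pixel mask can be sketched as follows. This is only an illustrative sketch, not the implementation used in the reference software: the function name `roi_map_from_mask`, the list-of-lists mask layout, and the rule that an MB belongs to the ROI as soon as one of its pixels is marked are all assumptions made for the example.

```python
def roi_map_from_mask(mask, mb_size=16):
    """Build a per-macroblock ROI map from a binary pixel mask.

    mask: list of rows, each a list of 0/1 pixel flags (hypothetical layout).
    Returns a 2D list of booleans, one entry per macroblock.
    Assumption: an MB is part of the ROI if any of its pixels is marked.
    """
    height = len(mask)
    width = len(mask[0])
    # Round up so that partially covered MBs at the borders are included.
    mb_rows = (height + mb_size - 1) // mb_size
    mb_cols = (width + mb_size - 1) // mb_size
    roi_map = [[False] * mb_cols for _ in range(mb_rows)]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                roi_map[y // mb_size][x // mb_size] = True
    return roi_map


# Usage: a 48x32 pel picture with a single marked pixel at column 5, row 20.
example_mask = [[0] * 48 for _ in range(32)]
example_mask[20][5] = 1
example_map = roi_map_from_mask(example_mask)
```

The map is used only on the encoder side, which is consistent with the statement above that it is not written into the video stream.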
2.1 QP51 Outside ROI 
The functional implementation of this method is 
depicted in Figure 1. For each MB of the high resolution 
layer, the QP value is set to 51 if the MB lies outside 
the ROI, and to the QP value selected for the current 
MB if it lies within the ROI. The ROI is not defined in 
the base layer, so the whole image is encoded normally 
at the lower resolution. 
Therefore, the quality of ROI MBs is much higher 
than that of the MBs outside the ROI and consequently 
most of the bits used in the high resolution layer are 
assigned to the ROI. Note that in the high resolution 
layer the only useful information that needs to be 
encoded is the ROI itself, because the lower quality and 
resolution of the background region provided by the 
base layer should be enough for the envisaged 
application. 
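The per-MB QP switch described in this section can be sketched as follows. The names `select_qp` and `qp_plan`, and the representation of the ROI map as a 2D list of booleans, are hypothetical and chosen only for illustration; the actual encoder applies the switch inside its MB coding loop.

```python
QP_MAX = 51  # maximum quantisation parameter used for background MBs


def select_qp(mb_in_roi, selected_qp):
    """Return the QP for one enhancement-layer MB.

    ROI MBs keep the QP selected for the current MB;
    MBs outside the ROI are forced to the maximum QP.
    """
    return selected_qp if mb_in_roi else QP_MAX


def qp_plan(roi_map, selected_qp):
    """Apply the switch to every MB of a picture (roi_map: 2D list of bools)."""
    return [[select_qp(in_roi, selected_qp) for in_roi in row]
            for row in roi_map]


# Usage: a 2x2 MB picture in which only the top-left MB belongs to the ROI.
example_plan = qp_plan([[True, False], [False, False]], selected_qp=30)
```

In this sketch the same selected QP is passed for all ROI MBs, which is a simplification; the value 30 in the usage line is arbitrary.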
 
 
Figure 1: QP51 functional diagram. 
2.2 Set-to-Zero 
The objective of this method is the same as that of the 
previous one: to spend no bits on the MBs outside the 
ROI and to increase the subjective quality of the ROI in 
the higher resolution layer. In the Set-to-Zero method, 
the transform coefficients of the residual blocks are set 
to zero for the MBs outside the ROI. Consequently, the 
encoder sets the coded block pattern (CBP) syntax 
element to 0 for these MBs. Figure 2 shows the 
Set-to-Zero functional diagram. 
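The effect of the Set-to-Zero rule on the coded block pattern can be illustrated with the sketch below. It uses a deliberately simplified stand-in for the CBP (one bit per block, set when the block contains a nonzero coefficient) rather than the exact H.264 syntax; the function names and the block layout are assumptions made for the example.

```python
def coded_block_pattern(blocks):
    """Simplified CBP model: bit i is set when block i has a nonzero coefficient.

    This is not the exact H.264 CBP syntax, only an illustration of the idea.
    """
    cbp = 0
    for i, block in enumerate(blocks):
        if any(c != 0 for c in block):
            cbp |= 1 << i
    return cbp


def encode_residual(blocks, mb_in_roi):
    """Return (coefficients to code, simplified CBP) for one MB.

    Outside the ROI every coefficient is replaced by zero regardless of its
    value, so the resulting pattern is always 0.
    """
    if not mb_in_roi:
        blocks = [[0] * len(block) for block in blocks]
    return blocks, coded_block_pattern(blocks)


# Usage: four small blocks of transform coefficients (hypothetical values).
example_blocks = [[3, 0, -1, 0], [0, 0, 0, 0], [0, 2, 0, 0], [0, 0, 0, 0]]
roi_result = encode_residual(example_blocks, mb_in_roi=True)
background_result = encode_residual(example_blocks, mb_in_roi=False)
```

For the ROI case the coefficients are passed through unchanged, while for the background case all blocks become zero and the pattern is 0.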
 
Figure 2: Set-to-Zero diagram. 
3 SIMULATION RESULTS 
The performance of the two methods described in the 
previous section was evaluated with regard to 
rate-distortion and encoding complexity. Separate 
experiments were carried out for Intra and Inter coding 
modes. The proposed methods were implemented using 
the JVT reference software, version 8.9, as the base 
framework. The test sequence “Mobile” was used in 
the experiments with two layers, QCIF@30fps (base 
layer) and CIF@30fps (enhancement layer), and two 
ROIs (ROI1, ROI2) of different sizes. ROI1 is a 
192x144 pel image region covering the area of the 
calendar numbers and ROI2 is the whole calendar, as 
shown in Figure 3. 
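To give a sense of scale, the share of the enhancement layer covered by ROI1 can be worked out in macroblock units. The sketch below assumes the standard CIF size of 352x288 pel and 16x16 MBs, and further assumes that ROI1 is aligned to the MB grid, which the text does not state.

```python
MB_SIZE = 16  # macroblock width and height in pels


def mb_grid(width, height, mb_size=MB_SIZE):
    """Return (columns, rows) of whole macroblocks for a region in pels."""
    return width // mb_size, height // mb_size


# CIF enhancement layer: 352x288 pel.
cif_cols, cif_rows = mb_grid(352, 288)
total_mbs = cif_cols * cif_rows

# ROI1: 192x144 pel, assumed to be aligned to the MB grid.
roi_cols, roi_rows = mb_grid(192, 144)
roi_mbs = roi_cols * roi_rows

roi_share = roi_mbs / total_mbs
print(f"ROI1: {roi_cols}x{roi_rows} = {roi_mbs} of {total_mbs} MBs "
      f"({roi_share:.1%})")
```

Under these assumptions ROI1 spans 12x9 = 108 of the 396 MBs in a CIF picture, i.e. roughly 27% of the enhancement layer.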
In the experiments, the following settings were used 
for the Intra tests: two spatial layers (QCIF and CIF) at 
30fps; NumberReferenceFrames 1; FastSearch; Loop 
Filter on. The coding parameters were as follows: for 
the base layer: CABAC; Basic QP 35; FRExt no; for 
layer 1: CABAC; InterLayerPred on; FRExt on. For the 
Inter tests, the following settings were used: two spatial 
layers (QCIF and CIF); 30 frames; NumberReferenceFrames 1; 
FastSearch; Loop Filter on; MaxDelay 1200; GOPsize 