Xiaofeng Fan, Harvey E. Rhody
Chester F. Carlson Center for Imaging Science, Rochester Institute of Technology, Rochester, NY, USA
Eli Saber
Dept. of Electrical Engineering, Rochester Institute of Technology, Rochester, NY, USA
Keywords: Image Registration, Maximization of Mutual Information.
Maximization of Mutual Information (MMI) is widely used for multi-modal image registration. However,
classical MMI techniques utilize only regional and/or global statistical information and do not make use of
spatial features. Several techniques have been proposed to extend MMI to use spatial information, but they
have proven to be computationally demanding. In this paper, a new approach is proposed that combines
spatial information with MMI by using the Harris Corner Label (HCL) algorithm. We use the HCL-based
MMI algorithm to accelerate the computation and improve the registration of noisy images. Our results
indicate that the HCL-based registration technique outperforms its classical MMI-based counterpart on
multimodal imagery.
The availability of remote sensing imagery from satellites and aircraft
using many kinds of imaging sensors has led to the need for robust and
efficient multi-modal registration tools. Some imagery examples include
low and high resolution still/video cameras in the visual spectrum,
multi-spectral cameras using a variety of infra-red wavelengths, imaging
spectrometers and synthetic-aperture radar systems. Maximization of
Mutual Information (MMI), initially introduced by Viola (Viola, 1995),
is an automatic registration method for multi-modal images that exploits
the underlying inherent information relationships. Compared to
cross-correlation, it is insensitive to brightness variations that are
inherent across modalities. However, it is computationally slow and
sensitive to image noise. The technique described in this paper
addresses these shortcomings.
MMI-based image registration relies on an entropy-based measure that
does not require the definition of features such as edges or corners.
Consequently, it does not employ the spatial information that is
available in the form of image features. Researchers have proposed
adaptations of the traditional MMI-based registration framework to
incorporate spatial information. Butz and Thiran (Butz and Thiran, 2001)
applied Mutual Information (MI) to edge measures defined by different
edge operators. However, the attraction range is narrow, which increases
the difficulty of the optimization procedure. Pluim et al. (Pluim et
al., 2000) proposed including spatial information by multiplying the MI
measure with an external local gradient term. Holden et al. (Holden et
al., 2004) registered two images by maximizing the multi-dimensional MI
of the corresponding features. Gan and Chung (Gan and Chung, 2005)
utilized a spatial feature, the maximum distance-gradient-magnitude, in
a complicated and computationally expensive four-dimensional framework
for image registration.
In this paper, we introduce a spatial feature-based technique that uses
the Harris Corner Label (HCL) algorithm to identify high-information
pixels and a wavelet pyramid to support computation at different scales.
We propose to calculate the MI from the HCL map of the original images
instead of their corresponding intensity values. The experimental
results demonstrate that our method is both more robust and more
efficient than traditional MMI registration techniques for multimodal
registration. The remainder of the paper is organized as follows.
Section 2 provides a brief background on spatial feature information and
describes our proposed algorithm. Section 3 describes our experimental
results. Conclusions are drawn in Section 4.
Fan X., E. Rhody H. and Saber E. (2007).
In Proceedings of the Second International Conference on Computer Vision Theory and Applications - IFP/IA, pages 420-423
Given an image pair and a geometric transformation, we aim to improve
the MMI algorithm by selecting the most relevant pixels for registration
purposes. Classical MMI relies only on global statistical information.
For image registration, however, edge pixels carry more alignment
information than pixels within a nearly uniform region. A
focus-of-attention mechanism such as a corner detector provides an
excellent avenue to differentiate between these pixel classes.
2.1 Harris Corner Detector
The Harris corner detector (Harris and Stephens,
1988) is a popular interest-point selector due to its
strong invariance to rotation, translation, illumination
variation and relative tolerance of image noise. It
utilizes the average gradient c(x, y) computed over a
small region w as follows:
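\[ c(x, y) = \sum_{w} \left[ I(x_i, y_i) - I(x_i + \Delta x, y_i + \Delta y) \right]^2 \qquad (1) \]
where $(x_i, y_i)$ represents the location of a pixel in $w$ and
$(\Delta x, \Delta y)$ is a shift. This can be expressed in matrix form as
\[ c(x, y) \approx d^{T} M(x, y)\, d \qquad (2) \]
where $d = [\Delta x, \Delta y]^{T}$,
\[ M = \begin{bmatrix}
\sum_{w} I_x(x_i, y_i)^2 & \sum_{w} I_x(x_i, y_i)\, I_y(x_i, y_i) \\
\sum_{w} I_x(x_i, y_i)\, I_y(x_i, y_i) & \sum_{w} I_y(x_i, y_i)^2
\end{bmatrix} \]
and $I_x(x_i, y_i) = I(x_i, y_i) - I(x_i + \Delta x, y_i)$ (with
$I_y$ defined analogously along $y$) are gradient measures. Hence,
c(x, y) represents the gradient structure of the local neighborhood. The
eigenvalues of the matrix M(x, y) are calculated and represented as
$\lambda_1$ and $\lambda_2$. The relationship between them, as depicted
in Figure 1, is an excellent indicator of local image characteristics.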
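As a minimal sketch of the computation described above, the following pure-Python function accumulates the matrix M over a small square window and returns its two eigenvalues in closed form. The function name `harris_eigenvalues`, the window half-width `r`, and the use of unit-step forward differences for the gradient measures are assumptions made for illustration; the paper does not specify these details.

```python
import math

def harris_eigenvalues(img, x, y, r=1):
    """Eigenvalues (largest first) of the 2x2 gradient matrix M, summed over
    a (2r+1) x (2r+1) window centred on column x, row y of `img`
    (a list of rows).

    Assumption: I_x and I_y are unit-step differences with the sign
    convention I(x_i, y_i) - I(x_i + 1, y_i). The window plus one extra
    column and row must lie inside the image.
    """
    a = b = c = 0.0  # M = [[a, b], [b, c]]
    for yi in range(y - r, y + r + 1):
        for xi in range(x - r, x + r + 1):
            ix = img[yi][xi] - img[yi][xi + 1]
            iy = img[yi][xi] - img[yi + 1][xi]
            a += ix * ix
            b += ix * iy
            c += iy * iy
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mean = 0.5 * (a + c)
    dev = math.sqrt((0.5 * (a - c)) ** 2 + b * b)
    return mean + dev, mean - dev

# Usage on a 6x6 image containing a vertical step edge between columns 2 and 3.
edge = [[0 if col < 3 else 10 for col in range(6)] for _ in range(6)]
print(harris_eigenvalues(edge, 2, 2))  # one large, one zero eigenvalue: (300.0, 0.0)
```

A flat window yields two zero eigenvalues, whereas a window straddling the corner of a bright square yields two equally large ones.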
Given the above, three cases are customarily considered:
1. If $\lambda_1$ and $\lambda_2$ are both small, then the local area is
relatively smooth.
2. If one eigenvalue is large and the other small, the local
auto-correlation function is ridge shaped, indicating a local shift in
one direction. This represents an edge (for instance, a horizontal or
vertical one).
3. If both eigenvalues are large, the local auto-correlation function is
sharply peaked, which indicates the presence of a corner.
Figure 1: Possible relationships between eigenvalues produced by HCL.
Figure 2: Comparison between HCL and Canny edge detection (panels (a) and (b)).
As a result, we classify pixels by comparing the eigenvalues to
determine which region of Figure 1 applies at each point. Each pixel is
given a class value of 0, 1, or 2 depending on whether it is in a flat
region, near an edge or at a corner.
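The three-way labelling can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name `harris_corner_label` and the single `threshold` that separates "small" from "large" eigenvalues are hypothetical, since the paper does not state how the regions of Figure 1 are delimited.

```python
def harris_corner_label(lam1, lam2, threshold):
    """Return the class value 0 (flat), 1 (edge) or 2 (corner).

    `threshold` is a hypothetical tuning parameter: an eigenvalue above it
    counts as "large". The label is the number of large eigenvalues.
    """
    n_large = int(lam1 > threshold) + int(lam2 > threshold)
    return n_large  # 0: both small, 1: exactly one large, 2: both large

# Usage with illustrative eigenvalue pairs and an illustrative threshold of 50.
print(harris_corner_label(0.0, 0.0, 50))      # 0 (flat region)
print(harris_corner_label(300.0, 0.0, 50))    # 1 (edge)
print(harris_corner_label(100.0, 100.0, 50))  # 2 (corner)
```

Counting the large eigenvalues keeps the label independent of the order in which the two eigenvalues are supplied.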
The Harris corner detector is a good choice as a
focus-of-attention for our approach to MMI registra-
tion. We tried other focus-of-attention methods such
as typical edge detection, but found that the HCL map
provides superior performance. The Harris detector,
in effect, identifies a rich set of pixels that can be used
in the MMI computation. This can be easily seen in
Figure 2.
To achieve the effect of focusing on areas that have
high information content, our algorithm applies MMI
to the HCL map of the images instead of the origi-
nal intensity. Because it is computed from the local
auto-correlation map, HCL also provides a level of
filtering. This property helps to increase the capture
range in the registration procedure.
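To make the idea of computing MI on label maps concrete, the sketch below estimates mutual information from the joint histogram of two equally sized label images. It is a minimal illustration assuming the standard definition of MI as the sum of p(a, b) log2 [p(a, b) / (p(a) p(b))]; the function name `mutual_information` and the list-of-rows image layout are assumptions.

```python
import math
from collections import Counter

def mutual_information(labels_a, labels_b):
    """Mutual information, in bits, between two equally sized label maps
    given as lists of rows. With HCL labels in {0, 1, 2}, the joint
    histogram has at most 3 x 3 = 9 occupied bins."""
    pairs = [(a, b)
             for row_a, row_b in zip(labels_a, labels_b)
             for a, b in zip(row_a, row_b)]
    n = len(pairs)
    joint = Counter(pairs)                   # joint histogram
    marg_a = Counter(a for a, _ in pairs)    # marginal histogram of image A
    marg_b = Counter(b for _, b in pairs)    # marginal histogram of image B
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (marg_a[a] * marg_b[b]))
    return mi

# Usage: a label map compared with itself gives its own entropy (1 bit here).
a = [[0, 0], [1, 1]]
print(mutual_information(a, a))
```

Because only occupied bins are visited, no special handling of zero-probability entries is needed.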
2.2 MMI on HCL Mapping
In our implementation of MI-based multi-modal image registration, MI is
computed on imagery that has been pre-processed by the HCL. To this end,
a Harris corner detector is used to divide all pixels into three
categories: interior, edge and corner. The joint histogram calculation
among the labels has a smaller range of values to consider (at most
3 × 3 bins) than the calculation based on the intensity. In addition, we
use a 4-level wavelet pyramid to accelerate the registration process and
increase its robustness.
Figure 3: Flowchart for the HCL-Based MI registration.
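The multi-resolution step can be illustrated with a small sketch. The paper does not say which wavelet is used, so this example assumes a Haar wavelet and keeps only the approximation (low-pass) band at each level, which up to a constant factor is the mean of each 2 × 2 block. The function names, and the reading of "4-level" as four levels including full resolution, are assumptions.

```python
def haar_approximation(img):
    """One pyramid step: average each 2x2 block (the Haar low-pass band,
    up to a constant scale factor). Assumes even height and width."""
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, len(img[0]), 2)]
            for y in range(0, len(img), 2)]

def build_pyramid(img, levels=4):
    """Return [full resolution, ..., coarsest]; each level halves both sides."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(haar_approximation(pyramid[-1]))
    return pyramid

# Usage: an 8x8 ramp image gives levels of size 8, 4, 2 and 1.
ramp = [[x + y for x in range(8)] for y in range(8)]
for level in build_pyramid(ramp):
    print(len(level), "x", len(level[0]))
```

In a coarse-to-fine search of this kind, a shift estimated at one level would typically be doubled and refined at the next finer level, which is one way such a pyramid can reduce the number of MI evaluations.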
In this section, we present two sets of experiments.
The first set is utilized to test the robustness of HCL-
based MI with regard to noise. The second set is em-
ployed to evaluate the performance of our proposed
method relative to traditional MI-based registration.
3.1 Synthetic Images
Although conventional MMI is very powerful, its per-
formance is reduced in images with large regions of
uniform intensity and regions of high randomness.
This can be demonstrated by an extreme case. In Fig-
ure 4, we show an image of a noisy object on a flat
background and vice versa. In Figure 4a, the back-
ground is constant, and the object has spatially inde-
pendent random pixels. Figure 4b has a reversal of the
structure in Figure 4a. When Figure 4a and Figure 4b are aligned, every
random pixel in one image corresponds to a constant pixel in the other
and vice versa. This makes the pixels in one image statistically
independent of their counterparts in the other image. Hence, any
misalignment leads to overlaps of like regions, which poses significant
challenges for conventional MMI registration.
On the other hand, the images shown in Figure 4 are easily processed by
an HCL-enhanced MI. Although the pixel intensity is random inside the
region, sufficient gradient information is detected to enable MMI
registration. Figure 4c and 4d show the HCL map at one of the pyramid
layers. Figure 4e depicts the MMI surface obtained from Figure 4a and
4b as a function of horizontal and vertical shifts, while Figure 4f
represents the MMI surface obtained from Figure 4c and 4d. As can be
seen, the MI surface obtained from the HCL-based technique is far better
suited for registration than its counterpart.
Figure 4: Noisy image registration example: (a) and (b) represent two
noisy images; (c) and (d) show the HCL mapping for the two images; (e)
and (f) show the search surface for the conventional MMI and
HCL-enhanced MMI registration algorithms, respectively.
Table 1: WASP camera description.
Band   Bandwidth        Resolution (pixels)
SWIR   0.9-1.7 µm       640×510
MWIR   3-5 µm           640×510
LWIR   8-9.2 µm         640×510
VNIR   0.4-0.9 µm RGB   2048×2048
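The search surface over horizontal and vertical shifts discussed for Figure 4e and 4f can be sketched as an exhaustive evaluation of MI on the overlapping region for every integer shift. This is an illustrative sketch only: the function names, the brute-force search, and the restriction to pure translation are assumptions, and the paper's optimizer may differ.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Mutual information in bits of a list of (a, b) label samples."""
    n = len(pairs)
    joint = Counter(pairs)
    pa = Counter(a for a, _ in pairs)
    pb = Counter(b for _, b in pairs)
    return sum((c / n) * math.log2(c * n / (pa[a] * pb[b]))
               for (a, b), c in joint.items())

def mi_surface(ref, mov, max_shift):
    """MI for every integer shift (dx, dy) with |dx|, |dy| <= max_shift.
    For each shift, ref[y][x] is paired with mov[y + dy][x + dx] over the
    region where both indices are valid. Both maps must have equal size."""
    h, w = len(ref), len(ref[0])
    surface = {}
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            pairs = [(ref[y][x], mov[y + dy][x + dx])
                     for y in range(max(0, -dy), min(h, h - dy))
                     for x in range(max(0, -dx), min(w, w - dx))]
            surface[(dx, dy)] = mutual_information(pairs)
    return surface

def best_shift(surface):
    """Shift (dx, dy) at which the MI surface attains its maximum."""
    return max(surface, key=surface.get)

# Usage: a small labelled pattern and a copy moved one pixel to the right.
ref = [[0] * 8 for _ in range(8)]
mov = [[0] * 8 for _ in range(8)]
ref[3][3], ref[3][4], ref[4][3], ref[4][4] = 1, 2, 2, 1
mov[3][4], mov[3][5], mov[4][4], mov[4][5] = 1, 2, 2, 1
print(best_shift(mi_surface(ref, mov, 1)))  # (1, 0)
```

Exhaustive search is used here only because it makes the surface explicit; it scales poorly with the search range, which is one motivation for a coarse-to-fine strategy.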
3.2 Real Life Multimodal Images
The real life multimodal images used in this section
were captured by the Wildfire Airborne Sensor Plat-
form (WASP) system, which includes three IR cam-
eras and a high-resolution visible camera. The speci-
fications of the system are shown in Table 1.
3.2.1 Intra-band Registration
The images shown in Figure 5a and 5b represent two frames taken 4
seconds apart from the short-wave IR band of the WASP system. Figure 5c
shows the corresponding registration curves for the two algorithms,
where the solid curve is for the HCL-MMI algorithm and the dashed curve
is for the conventional MMI. The registered images are overlaid (see
Figure 5d) to indicate the high degree of accuracy obtained by our
proposed algorithm.
Figure 5: Intra-band Registration Example: (a) and (b) represent two
sequential frames, (c) registration curves for the two algorithms, (d)
overlaid registered images.
Table 2: Results for WASP Intra-band Registration.
        Conventional MMI          HCL-Enhanced MMI
θ       x     y      Time (s)     x     y      Time (s)
0       -8    237    14.04        -8    237    7.753
0       65    232    12.765       65    232    7.422
0       10    104    11.547       10    104    6.656
-0.5    19    -236   71.75        19    -236   39.329
0.5     27    -251   84.26        27    -251   45.23
0       4     -246   15.531       4     -246   7.859
The running environment is a Pentium 4 at 3.0 GHz with 1 GB of RAM, running IDL 6.0.
3.2.2 Inter-band Registration
In this section, we demonstrate the effectiveness of the HCL-MMI
algorithm in registering multi-modal imagery acquired by the WASP system
using the three IR bands listed in Table 1. The short, medium and long
IR images are shown in Figure 6a, b and c respectively. Note the
translation, scale and significant intensity variations. The results
computed by our algorithm are shown in Figure 6d and listed in Table 3,
where the short-wave band was utilized as the base image.
From the above examples, we can see that, relative to the conventional
MMI algorithm, the HCL-MMI algorithm reaches the same accuracy in about
half the computing time. This was found to be true for many other
examples that have been tested.
Figure 6: Inter-band Registration Example.
Table 3: Inter-band Registration Results.
        Conventional MMI               HCL-Enhanced MMI
Band    scale  x    y    Time (s)      scale  x    y    Time (s)
MWIR    1.13   19   32   51.06         1.13   19   32   29.68
LWIR    1.12   6    36   46.16         1.12   6    36   26.6
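The "about half" figure can be checked directly from the timings reported in Tables 2 and 3. The short script below is only a worked arithmetic example; the variable names are illustrative, and the ratio is computed row by row so it does not depend on the time unit.

```python
# (conventional MMI time, HCL-enhanced MMI time) pairs as listed in the tables
timings = [
    (14.04, 7.753), (12.765, 7.422), (11.547, 6.656),
    (71.75, 39.329), (84.26, 45.23), (15.531, 7.859),  # Table 2
    (51.06, 29.68), (46.16, 26.6),                     # Table 3
]

# Fraction of the conventional running time needed by the HCL-enhanced method.
ratios = [hcl / conv for conv, hcl in timings]
print(f"min {min(ratios):.2f}, max {max(ratios):.2f}")  # min 0.51, max 0.58
```

Across all eight reported cases the HCL-enhanced method takes between roughly 51% and 58% of the conventional running time, while the recovered parameters are identical.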
In this paper, we proposed an HCL-based MMI algorithm for registering
multi-modal images. The HCL-based MMI is reliable and efficient. The
Harris corner label improves robustness, as demonstrated with various
synthetic and real-life images. Because the Harris corner detector is
invariant to translation and rotation, the HCL-MMI algorithm can
register images with shift, rotation and scale differences. The Harris
corner detector broadens the attraction range and, in our experience,
reduces the risk of being trapped in a local optimum. Experimental
results show the algorithm is successful in registering IR with visual
images. The use of a multi-resolution technique increases robustness by
enabling computation at an appropriate scale.
Butz, T. and Thiran, J.-P. (2001). Affine registration with
feature space mutual information. In MICCAI '01:
Proceedings of the 4th International Conference on
Medical Image Computing and Computer-Assisted
Intervention, pages 549–556, London, UK. Springer-Verlag.
Gan, R. and Chung, A. C. S. (2005). Multi-dimensional
mutual information based robust image registration
using maximum distance-gradient-magnitude. In IPMI,
pages 210–221.
Harris, C. and Stephens, M. (1988). A combined corner and
edge detector. In 4th Alvey Vision Conference.
Holden, M., Griffin, L. D., and Hill, D. L. G. (2004).
Multi-channel mutual information using scale space. In
MICCAI 2004, St. Malo, France.
Pluim, J., Maintz, J., and Viergever, M. (2000). Image
registration by maximization of combined mutual
information and gradient information. IEEE Trans. Medical
Imaging, 19(8):809–814.
Viola, P. A. (1995). Alignment by Maximization of Mutual
Information. PhD thesis, Massachusetts Institute of
Technology Artificial Intelligence Laboratory.