Authors:
Muhammad Ali and Hassan Foroosh
Affiliation:
University of Central Florida, United States
Keyword(s):
Natural Scene Text Recognition, Tensors, Rank-1 Decomposition, Holistic Character Recognition, Feature Independence.
Abstract:
Current methods in scene character recognition rely heavily on the discriminative power of local features such
as HoG, SIFT, Shape Contexts (SC), and Geometric Blur (GB). One problem with this approach is
that the local features are rasterized in an ad hoc manner into a single vector, thereby disrupting spatial
correlations that carry crucial information. To eliminate this feature dependency and its associated problems, we
propose a holistic solution as follows. For each character to be recognized, we stack a set of training images
to form a 3-mode tensor. Each training tensor is then decomposed into a linear superposition of ‘k’ rank-1
matrices, which form a basis spanning the solution subspace of the character class. To classify a
test image, we project it onto the pre-computed rank-1 bases of each class and
assign it to the class for which the inner product of the mixing vectors is maximized. We evaluate on challenging natural
scene character datasets, namely Chars74K, ICDAR2003, and SVT-CHAR. We achieve better results than
several baseline methods based on local features (e.g., HoG) and show that leave-random-one-out cross-validation
yields even better recognition performance, thus supporting our intuition about the importance of feature independence
and the preservation of spatial correlations in recognition.
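
The pipeline described in the abstract (stack images per class, extract k rank-1 basis matrices, project a test image, pick the class with the largest inner product of mixing vectors) can be illustrated with a minimal NumPy sketch. This is not the authors' decomposition: as a stand-in, it assumes the rank-1 matrices are taken from a truncated SVD of each class's mean image, and it assumes the class mixing vector is the normalized average of the training projections. The names `fit_class` and `classify` are hypothetical.

```python
import numpy as np

def fit_class(stack, k):
    """Build k rank-1 basis matrices and a mixing vector for one class.

    stack: array of shape (n, h, w), the training images of the class
           stacked into a 3-mode tensor.
    Assumption: the bases come from a truncated SVD of the mean image,
    a simple substitute for the tensor decomposition in the paper.
    """
    mean_img = stack.mean(axis=0)
    U, _, Vt = np.linalg.svd(mean_img, full_matrices=False)
    # bases[r] = outer(U[:, r], Vt[r, :]), each a rank-1 (h, w) matrix
    bases = np.einsum('ir,rj->rij', U[:, :k], Vt[:k, :])
    # Project every training image onto the bases (Frobenius inner product)
    coeffs = np.einsum('nij,rij->nr', stack, bases)
    mixing = coeffs.mean(axis=0)
    mixing = mixing / (np.linalg.norm(mixing) + 1e-12)  # unit length
    return bases, mixing

def classify(image, models):
    """Return the label whose mixing vector best matches the test image.

    models: dict mapping label -> (bases, mixing) from fit_class.
    """
    best_label, best_score = None, -np.inf
    for label, (bases, mixing) in models.items():
        test_mixing = np.einsum('ij,rij->r', image, bases)
        score = float(test_mixing @ mixing)  # inner product of mixing vectors
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

In use, one would call `fit_class` once per character class on same-sized grayscale images, collect the results in a dictionary keyed by label, and pass that dictionary to `classify` for each test image. No local feature descriptor is computed at any step, which is the point the abstract makes about feature independence.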