Fabio Martínez, Juan Carlos León and Eduardo Romero
BioIngenium Research Group, National University of Colombia, Bogotá, Colombia
Gesture recognition, Human motion analysis, Gait analysis, Markerless approach.
Gait patterns may be distorted in a large set of pathologies. In clinical practice, gait is studied using
a set of measurements which allows identification of pathological disorders, thereby facilitating diagnosis,
treatment and follow-up. These measurements are obtained from a set of markers, carefully placed at
specific anatomical locations. This conventional procedure is invasive and alters the natural movement
gestures, a great drawback for diagnosis and management of the early disease stages, when accuracy
is a crucial issue. Instead, markerless approaches attempt to capture the very nature of the movement with
practically no intervention on the movement patterns. These techniques are still limited in their
clinical applications since they do not segment the human silhouette with sufficient precision. This article
introduces a novel markerless strategy for classifying normal and pathological gaits, using a temporal-spatial
characterization of the subject from 2 different views. The feature vector is constructed by associating the
spatial information obtained with SURF and the temporal information from a Σ-Δ operator. The strategy was
evaluated in three groups of patients: normal, musculoskeletal disorders and Parkinson's disease, obtaining a
precision and a recall of about 60%.
Distortions of gait patterns are the first clinical
manifestation of many diseases, among others diabetes,
cerebral palsy or accident sequelae. The analysis of
human gait attempts to objectively assess pathologies by
following up the hidden gait dynamic variables. The
set of techniques dedicated to performing this analysis
is what is currently known as the gait laboratory, a
tool devised to quantify a disease and to compare the
gait with normal patterns (Perry and Burnfield, 2010),
(Luo et al., 2010). Most of this gait analysis
is carried out with a set of markers, carefully
placed upon specific anatomical locations. This
conventional procedure is invasive and alters the natural
movement gestures, so that only strong variations
lead to a diagnosis, i.e., this approach is hardly
useful in early stages.
On the other hand, gait dynamic patterns are by
nature highly variable and can be easily contaminated
with noise. In early stages, most of these diseases
differ very little from what is considered a normal
pattern, so that classification is a very challenging
problem, even for expert clinicians. This picture
may worsen if one considers that the basic
examination tool, the markers, can move very easily or
can even be unobservable, contaminating the resulting
measurement. These factors together lead to subjective
clinical analyses, with the consequent limitation
in the reproducibility of the clinical management of the
patient (Kamruzzaman and Begg, 2006), (Wolf et al.,
2006).
Recently, this problem has undergone a fundamental
transformation since the objective is no longer
the movement reconstruction from the anatomical
markers, but the accurate tracking of the movement
pattern, i.e., the markerless strategy. Research
areas such as computer vision, automatic surveillance,
animation and image processing have already developed
some markerless strategies for diverse applications,
namely, biometric identification, abnormal motion
detection, scene reconstruction and activity
classification (Turaga et al., 2008), (Klempous, 2009).
However, there are several problems related to extracting
the object of interest from some scenarios, mainly
due to the blurred boundaries between the background
and foreground (Cristani et al., 2010), (McHugh et al.,
2009), an issue that can result in wrong characterizations.
This article presents an efficient markerless
methodology to identify and classify different kinds
of normal and pathological movements. A non-linear
Sigma-Delta (Σ-Δ) operator is used to obtain a
temporal movement description as a set of pixels.
Martínez F., Carlos León J. and Romero E..
DOI: 10.5220/0003375907100713
In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP-2011), pages 710-713
ISBN: 978-989-8425-47-8
2011 SCITEPRESS (Science and Technology Publications, Lda.)
Most of them correspond to a particular patient shape
while some small scattered groups belong to the
background. Afterwards, we compute a bounding box
around the largest group and therein we calculate some
local features per frame, using the "Speeded Up Robust
Features" (SURF). A weighting function allows
associating some of these spatial features with
relevant temporal information. This weighted feature
vector is used to classify patterns as normal or
pathological, applying a classical Support Vector Machine
strategy. Evaluation was performed on a database
with 96 videos from 32 patients, with three types of
movements: normal, musculoskeletal disorders and
Parkinson's disease. Sensitivity and specificity are
used to assess the utility of this method. This paper is
organized as follows: section 2 briefly outlines the
nature of the dataset, section 3 introduces the proposed
markerless strategy, section 4 summarizes the results and
the effectiveness of the proposed method, and finally
section 5 concludes with a discussion and possible future
work.
Experimentation was carried out with video sequences
recorded from 3 views (frontal, lateral and
45-degree), registered at the gait laboratory of
the National University of Colombia, under
semi-controlled illumination conditions. This dataset
consists of a set of videos captured from 20 patients, each
of whom was recorded 4 times while walking, for a total
of 240 video sequences. The dataset was divided as
follows:
8 patients diagnosed with musculoskeletal disorders,
for a total of 13500 frames.
7 patients diagnosed with Parkinson's disease (no
depressive disorder present), for a total of 15500
frames.
5 patients with normal gait, for a total of 14000
frames.
Our proposed method begins by calculating the temporal
information using a Σ-Δ operator. A bounding
box is superimposed upon the region with the largest
rate of change and the local features are calculated,
within this box, using SURF. A weighting function
chooses the most relevant SURF features, those with
a similar spatial location to the pixels detected by the
Σ-Δ operator, i.e., the features that contain temporal
and spatial information. The obtained feature vector
is used to classify patterns as normal or pathological,
applying a classical SVM, as illustrated in Figure 1.
3.1 Σ-Δ Temporal Estimator
Temporal description of the patient gait patterns is
central to describing structural changes. Many
strategies have already been proposed; they are currently
known as background estimation methods (Elgammal
et al., 2000), (Manzanera and Richefeu, 2007),
(Howe and Deschamps, 2004). These methods use
a sequence of images $I_t$ and build up a model of the
static scene $M_t$. The model output is an image $D_t$
where the background is represented by $D_t(x) = 0$ and
the foreground by $D_t(x) = 1$.
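Algorithm 1: Σ-Δ Algorithm.
Initialization: $M_0(x) = I_0(x)$
for each frame $t$ do
    $M_t(x) = M_{t-1}(x) + \mathrm{sgn}(I_t(x) - M_{t-1}(x))$
    $\Delta_t(x) = |M_t(x) - I_t(x)|$
end for
Initialize: $V_0(x) = \Delta_0(x)$
for each frame $t$ do
    for each pixel $x$ such that $\Delta_t(x) \neq 0$ do
        $V_t(x) = V_{t-1}(x) + \mathrm{sgn}(N \times \Delta_t(x) - V_{t-1}(x))$
        if $\Delta_t(x) < V_t(x)$ then
            $D_t(x) = 0$
        else
            $D_t(x) = 1$
        end if
    end for
end for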
In our dataset the silhouette extraction is a difficult
task because of the similarity between the foreground
and the background. Hence we use a non-linear Σ-Δ
operator to obtain a motion descriptor which detects
the most probable location of the foreground. This
estimator oversamples a signal at higher rates than that
specified by the Nyquist theorem, increasing the
correlation between adjacent frames at each pixel
(Manzanera and Richefeu, 2007). The Σ-Δ operator
behaves as a background tracker $M_t(x)$, dynamically
updated by comparing each image $I_t(x)$ with the current
background $M_{t-1}(x)$, using a simple updating rule: if
$I_t(x)$ is greater (lower) than $M_{t-1}(x)$, then a unit
increase (decrease) of the background is performed. The
implemented Σ-Δ is shown in Algorithm 1.
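The update rule of Algorithm 1 can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the implementation used in this work: the function names `sgn` and `sigma_delta`, the list-of-lists frame representation and the default amplification factor `n=2` are assumptions made for the example.

```python
def sgn(v):
    """Sign function: -1, 0 or +1."""
    return (v > 0) - (v < 0)


def sigma_delta(frames, n=2):
    """Sigma-Delta background estimation over a list of grayscale frames.

    Each frame is a list of rows of integer intensities. Returns one binary
    mask D_t per frame (1 = foreground, 0 = background).
    The factor n (N in Algorithm 1) is an assumed value.
    """
    height, width = len(frames[0]), len(frames[0][0])
    # M_0 = I_0: the first frame initialises the background model.
    background = [row[:] for row in frames[0]]
    # V_0 = Delta_0, which is zero everywhere because M_0 = I_0.
    variance = [[0] * width for _ in range(height)]
    masks = []
    for frame in frames:
        mask = [[0] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                # Move the background one grey level towards the current frame.
                background[y][x] += sgn(frame[y][x] - background[y][x])
                delta = abs(background[y][x] - frame[y][x])
                if delta != 0:
                    # Move V one step towards n times the current difference.
                    variance[y][x] += sgn(n * delta - variance[y][x])
                    # Background if the difference is below V, else foreground.
                    mask[y][x] = 0 if delta < variance[y][x] else 1
        masks.append(mask)
    return masks


# Two 2x2 frames: a static scene, then one pixel changes abruptly.
frames = [[[10, 10], [10, 10]], [[200, 10], [10, 10]]]
masks = sigma_delta(frames)
```

In this toy sequence only the pixel that jumps from 10 to 200 is flagged as foreground in the second mask, while the unchanged pixels stay at 0.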
Figure 1: The markerless strategy consists in determining a feature vector to describe normal and pathological movement,
using a temporal-spatial gait characterization. Motion is classified using a Support Vector Machine strategy.
Upon the region with the largest movement pattern,
we compute a center of mass, on top of which
we place a bounding box that contains the object of
interest. This process is sped up using an integral
image representation of the original images, reducing
the computational cost by 94% (Viola and Jones,
2004).
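The role of the integral image can be illustrated with a short sketch. It is not the procedure used in this work: instead of the center-of-mass computation, it simply slides a fixed-size window over a binary motion mask and keeps the window containing the most moving pixels. The function names and the window size are hypothetical; only the summed-area table itself follows Viola and Jones (2004).

```python
def integral_image(mask):
    """Return a (H+1) x (W+1) summed-area table of a 2D list."""
    height, width = len(mask), len(mask[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += mask[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def box_sum(table, top, left, bottom, right):
    """Sum of the mask over rows top..bottom-1 and columns left..right-1."""
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


def densest_box(mask, box_h, box_w):
    """Top-left corner and motion count of the box_h x box_w window
    that contains the largest number of moving pixels."""
    table = integral_image(mask)
    best, best_corner = -1, (0, 0)
    for top in range(len(mask) - box_h + 1):
        for left in range(len(mask[0]) - box_w + 1):
            total = box_sum(table, top, left, top + box_h, left + box_w)
            if total > best:
                best, best_corner = total, (top, left)
    return best_corner, best


motion = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 1],
    [0, 0, 0, 0, 0],
]
corner, count = densest_box(motion, 2, 2)  # corner == (1, 1), count == 4
```

Each window sum costs four table look-ups regardless of the window size, which is where the saving over re-summing every window comes from.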
3.2 Speeded up Robust Features
Once the bounding box is extracted, we calculate
some local features of it using the Speeded Up Robust
Features (SURF) descriptor (Bay et al.,
2008). This descriptor highlights the salient points
within the bounding box so that each salient point is
described by magnitude, orientation and feature
vectors. The SURF method provides an invariant image
description, allowing a representation that is robust to
illumination, scale and rotation changes, a useful
aspect in our problem due to the semi-controlled
scenario and the different views and patients.
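The SURF description is obtained by initially
computing the Hessian matrix $H(X, \sigma)$, as follows:

$$H(X,\sigma) = \begin{bmatrix} L_{xx}(X,\sigma) & L_{xy}(X,\sigma) \\ L_{xy}(X,\sigma) & L_{yy}(X,\sigma) \end{bmatrix}$$

where $X$ is a specific point, $\sigma$ is the scale and
$L_{xx}(X,\sigma)$ is the convolution of the second-order
Gaussian derivative with the image at $X$ (similarly for
$L_{xy}$ and $L_{yy}$). This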
step relies on an integral image to reduce the compu-
tational time. Afterwards, SURF constructs a circular
region surrounding the points of interest, attempting
to assign a unique orientation by estimating the Haar
wavelet coefficients in both directions and thereby
gaining invariance to image rotations. SURF descrip-
tors are thus constructed by extracting square regions
around the points of interest, which are divided in four
3.3 Feature Extraction
SURF features are used to obtain a summarization
of the gait sequence; they operate exclusively on the
bounding boxes. Once the set of SURF features is
calculated, the values of the SURF descriptor vector
are weighted, following the pixel intensity distribution
obtained from the Σ-Δ operator. Higher values
are assigned to vectors whose locations belong to
regions with high movement. The proposed summarization
is a collection of weighted vectors, arranged
according to their frame number in the gait sequence.
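The weighting step can be sketched as follows. The exact weighting function is not specified here, so the sketch assumes the simplest choice: each descriptor is multiplied by the Σ-Δ difference at its keypoint location, normalised by the largest difference in the frame. The function name `weight_descriptors` and its argument layout are hypothetical.

```python
def weight_descriptors(keypoints, descriptors, motion_map):
    """Scale each descriptor by the normalised motion intensity at its keypoint.

    keypoints   -- list of (x, y) integer pixel positions inside the frame
    descriptors -- list of descriptor vectors (lists of floats), one per keypoint
    motion_map  -- 2D list with the Sigma-Delta difference at each pixel
    """
    peak = max(max(row) for row in motion_map)
    weighted = []
    for (x, y), descriptor in zip(keypoints, descriptors):
        # Assumed weighting: linear in the motion intensity, in [0, 1].
        weight = motion_map[y][x] / peak if peak > 0 else 0.0
        weighted.append([weight * value for value in descriptor])
    return weighted


motion_map = [[0, 50], [100, 0]]
keypoints = [(1, 0), (0, 1), (0, 0)]
descriptors = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
weighted = weight_descriptors(keypoints, descriptors, motion_map)
# weighted == [[0.5, 1.0], [1.0, 2.0], [0.0, 0.0]]
```

Descriptors located on static pixels are driven to zero, so only features that carry both spatial and temporal information contribute to the final summarization.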
As the SURF features produce a variable number
of points of interest for different sequences, the final
descriptor of a gait sequence is obtained by quantizing
the complete set of vectors into 5, 10, 20, 40 and
50 clusters using the Expectation Maximization
algorithm, yielding 5 different descriptors for a single
sequence.
Classification was performed using a Support Vector
Machine (SVM), trained with a set of attribute vectors
extracted from labeled gait sequences. In this
phase, two types of kernels were used: polynomial
and Radial Basis Function (RBF) kernels. The kernel
parameters, gamma (RBF kernel) and the exponent
(polynomial kernel), were selected through a sensitivity
analysis, training with the sequential minimal
optimization algorithm (Flake and Lawrence, 2001) and
keeping the parameter which yielded the largest number
of true positives.
Table 1 shows the precision and recall (sensitivity)
obtained with either the RBF or the polynomial kernel.
Overall, the SVM strategy shows precision and recall
figures above 0.6, except for the musculoskeletal
patterns, for which the RBF recall is 0.33, a very large
difference that can be attributed to the fact that the group
of musculoskeletal disorders is composed of a larger number
of patterns and therefore the variance is much larger.
The RBF kernel shows a recall of 0.95 for the normal
group, indicating that the RBF kernel works better
with data of smaller variance. Of course,
the fact that the normal group was the largest group (8
cases, compared with 7 and 6) can bias these results,
together with the fact that the parameters were
chosen because they detected the largest number of
true positives.
Table 1: Precision and recall (sensitivity)
for the different evaluated classes, i.e., the musculoskeletal
disorder (M), the normal pattern (N) and the parkinsonian
gait (P), using both the RBF and polynomial kernels.
Class   Precision       Recall
        RBF     Poly    RBF     Poly
M       0.67    0.75    0.33    0.75
N       0.6     0.7     0.95    0.66
P       0.72    0.61    0.41    0.64
Table 2: Confusion matrix using RBF and polynomial kernels
for the three evaluated classes.
Class   M               N               P
        RBF     Poly    RBF     Poly    RBF     Poly
M       4       9       5       1       3       2
N       1       2       20      14      0       5
P       1       1       8       5       8       11
Likewise, the confusion matrix shows that the
correlation is highest for the normal class.
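The link between Table 2 and Table 1 can be checked with a short computation. The sketch below assumes that the rows of Table 2 are the true classes and the columns the predicted ones; the function name `precision_recall` is hypothetical. Under that assumption the values agree with Table 1 to within 0.01, apart from the RBF recall for class P, which comes out as 0.47 rather than 0.41.

```python
def precision_recall(confusion, labels):
    """Per-class precision and recall from a square confusion matrix.

    confusion[i][j] counts sequences of true class i predicted as class j
    (assumed orientation of Table 2).
    """
    scores = {}
    for k, label in enumerate(labels):
        true_positives = confusion[k][k]
        predicted = sum(row[k] for row in confusion)  # column total
        actual = sum(confusion[k])                    # row total
        precision = true_positives / predicted if predicted else 0.0
        recall = true_positives / actual if actual else 0.0
        scores[label] = (round(precision, 2), round(recall, 2))
    return scores


labels = ["M", "N", "P"]
# Counts copied from Table 2, one matrix per kernel.
rbf = [[4, 5, 3], [1, 20, 0], [1, 8, 8]]
poly = [[9, 1, 2], [2, 14, 5], [1, 5, 11]]
rbf_scores = precision_recall(rbf, labels)
poly_scores = precision_recall(poly, labels)
# rbf_scores["M"] == (0.67, 0.33); poly_scores["M"] == (0.75, 0.75)
```

The small differences in the second decimal (for example 0.61 against 0.6 for the RBF precision of class N) appear because this sketch rounds to two decimals.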
This paper has introduced a novel markerless method
that allows characterizing normal and pathological
human gait patterns. The whole markerless strategy
consists in determining a feature vector for describing
normal and pathological movement, using a
temporal-spatial gait characterization from 3 different views.
The feature vector is constructed by associating the
spatial information obtained from SURF and the
temporal information from a Σ-Δ operator. Motion is
classified using a classical Support Vector Machine
strategy. Results demonstrate that this method can
complement the conventional gait analysis since it
assigns objective pattern measurements. The
methodology presented in this work constitutes a first
approximation to understanding the complex dynamics
of gait. From this kind of analysis, we expect it
will be possible to set up an assembly of descriptors
which allow accurately describing motion patterns
and quantifying gait semantics.
Cristani, M., Farenzena, M., Bloisi, D., and Murino,
V. (2010). Background subtraction for automated
multisensor surveillance: A comprehensive review.
EURASIP Journal on Advances in Signal Processing,
Elgammal, A., Harwood, D., and Davis, L. (2000). Non-
parametric model for background subtraction. pages
Flake, G. W. and Lawrence, S. (2001). Efficient SVM
regression training with SMO.
Luo, H., Ci, S., Wu, D., et al. (2010). A remote
markerless human gait tracking for e-healthcare based on
content-aware wireless multimedia communications.
IEEE Wireless Communications.
Bay, H., Ess, A., Tuytelaars, T., and Van Gool, L. (2008).
Speeded-up robust features (SURF). Comput. Vis. Image
Underst., 110:346–359.
Howe, N. R. and Deschamps, A. (2004). Better Foreground
Segmentation Through Graph Cuts. ArXiv Computer
Science e-prints.
Kamruzzaman, J. and Begg, R. K. (2006). Support vector
machines and other pattern recognition approaches to
the diagnosis of cerebral palsy. IEEE Trans. Biomed.
Eng., 53:2479–2490.
Klempous, R. (2009). Biometric motion identification
based on motion capture. 243:335–348.
Manzanera, A. and Richefeu, J. (2007). A new motion
detection algorithm based on Σ-Δ background
estimation. 28(3):320–328.
McHugh, J., Konrad, J., Saligrama, V., and Jodoin, P.-M.
(2009). Foreground-adaptive background subtraction.
Signal Processing Letters, IEEE, 16(5):390–393.
Perry, J. and Burnfield, J. M. (2010). Gait Analysis: Normal
and Pathological Function. Slack, NJ.
Turaga, P., Chellappa, R., Subrahmanian, V. S., and Udrea,
O. (2008). Machine recognition of human activities:
A survey. Circuits and Systems for Video Technology,
IEEE Transactions on, 18(11):1473–1488.
Viola, P. and Jones, M. J. (2004). Robust real-time face
detection. Int. J. Comput. Vision, 57:137–154.
Wolf, S., Loose, T., et al. (2006). Automated feature
assessment in instrumented gait analysis. Gait and Posture,