Enhancing Emotion Recognition from ECG Signals using Supervised
Dimensionality Reduction
Hany Ferdinando¹,², Tapio Seppänen³ and Esko Alasaarela¹
¹ Optoelectronics and Measurement Technique Research Unit, University of Oulu, Oulu, Finland
² Department of Electrical Engineering, Petra Christian University, Surabaya, Indonesia
³ Physiological Signal Analysis Team, University of Oulu, Oulu, Finland
{hany.ferdinando, tapio, esko.alasaarela}@ee.oulu.fi
Keywords: Emotion Recognition, kNN, Dimensionality Reduction, LDA, NCA, MCML.
Abstract: Dimensionality reduction (DR) is an important issue in classification and pattern recognition. Using
features with lower dimensionality helps machine learning algorithms work more efficiently and can also
improve the performance of the system. This paper explores supervised dimensionality reduction,
LDA (Linear Discriminant Analysis), NCA (Neighbourhood Components Analysis), and MCML (Maximally
Collapsing Metric Learning), in emotion recognition based on ECG signals from the Mahnob-HCI database.
It is a 3-class problem of valence and arousal. Features for kNN (k-nearest neighbour) are based on statistical
distribution of dominant frequencies after applying a bivariate empirical mode decomposition. The results
were validated using 10-fold cross validation and LOSO (leave-one-subject-out) validation. Among LDA, NCA, and
MCML, the NCA outperformed the other methods. The experiments showed that the accuracy for valence
was improved from 55.8% to 64.1%, and for arousal from 59.7% to 66.1% using 10-fold cross validation after
transforming the features with projection matrices from NCA. For LOSO validation, there is no significant
improvement for valence while the improvement for arousal is significant, i.e. from 58.7% to 69.6%.
1 INTRODUCTION
Decreasing the dimensionality of features without
losing their important characteristics is a vital
pre-processing phase in high-dimensional data analysis
(Sugiyama, 2007). Dimensionality reduction (DR) is
an important tool for handling the curse of
dimensionality. Projecting a high-dimensional feature
space onto a lower-dimensional one helps
classifiers perform better. As the human visual system
is limited to 3D, visualization of the feature space
also benefits from DR. Moreover, DR is useful in data
compression (Lee and Verleysen, 2010), for example
when all training data must be stored, as in the
k-nearest neighbour classifier (kNN).
Dimensionality reduction (DR) methods include
linear and nonlinear techniques. A well-known
linear DR method is principal component analysis (PCA)
(Jolliffe, 2002). Nonlinear DR emerged later, e.g.
Sammon’s mapping (Sammon, 1969). Furthermore,
there are supervised and unsupervised DR techniques.
Supervised DR uses the labels of the data to guide the
mapping process, while unsupervised DR such as PCA
relies on finding a projection space which provides
the highest variance.
This paper explores a number of supervised DR
techniques, i.e. Neighbourhood Components
Analysis (NCA), Linear Discriminant Analysis
(LDA), and Maximally Collapsing Metric Learning
(MCML), and applies them to enhance the accuracy
of ECG-based emotion recognition on the
Mahnob-HCI database for affect recognition.
The Mahnob-HCI database was published in 2012
with some baseline accuracies (Soleymani, et al.,
2012) for a 3-class classification problem of valence
and arousal. However, a baseline for emotion
recognition based on ECG signals only was not
given therein. Ferdinando et al. (Ferdinando, et al.,
2014) computed Heart Rate Variability (HRV)
indexes and achieved baseline accuracies of 42.6% and
47.7% for valence and arousal respectively. Later,
Ferdinando et al. improved the accuracies to 55.8% and
59.7% for valence and arousal respectively by
applying bivariate empirical mode decomposition
(BEMD) to ECG signals and using the statistical
distributions of dominant frequencies as the features
(Ferdinando, et al., 2016).
Ferdinando, H., Seppänen, T. and Alasaarela, E.
Enhancing Emotion Recognition from ECG Signals using Supervised Dimensionality Reduction.
DOI: 10.5220/0006147801120118
In Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods (ICPRAM 2017), pages 112-118
ISBN: 978-989-758-222-6
Copyright © 2017 by SCITEPRESS - Science and Technology Publications, Lda. All rights reserved
Although significant improvements have been
achieved in (Ferdinando, et al., 2016), the best
accuracies, so far, from this database were 76% and
68% for valence and arousal respectively (Soleymani,
et al., 2012) using features from eye gaze and EEG.
We aim at improving the classification accuracy by
using only ECG signals.
In this paper, we enhance the accuracy of emotion
recognition by applying supervised DR to the features
obtained from BEMD analysis of ECG signals
(Ferdinando, et al., 2016) prior to feeding them to the
kNN classifier. Projection matrix calculations were
done with the Matlab code by van der Maaten (van
der Maaten, 2016).
2 SUPERVISED DIMENSIONALITY REDUCTION
Supervised DRs in drtoolbox are Linear Discriminant
Analysis (LDA), Generalized Discriminant Analysis
(GDA), Neighbourhood Components Analysis
(NCA), Maximally Collapsing Metric Learning
(MCML), and Large Margin Nearest Neighbor
(LMNN) (van der Maaten, 2016). They work based
on the label/class of the inputs. The labels serve as a
guideline to reduce the dimensionality. The
supervised DR methods in this exploration are based
on a Mahalanobis distance measure
$$\| f(x_1) - f(x_2) \|^2 = (x_1 - x_2)^T A (x_1 - x_2) \tag{1}$$
within the kNN framework, except LDA and GDA,
where $A = W^T W$ is a positive semidefinite (PSD)
matrix, and W is the projection matrix to a certain
space. The ultimate goal is to find a projection matrix
A, such that the classifiers perform well in the
transformed space. Unfortunately, although GDA lets
the user choose the target dimensionality, its
implementation does not provide a projection matrix A
with which new features can be transformed into the
other space (van der Maaten, 2016). For this reason,
GDA was not included in our study. The LMNN
implementation, conversely, provides a projection
matrix A but performs no dimensionality reduction
(van der Maaten, 2016). For this reason, LMNN was
also excluded from the experiments.
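The distance in Eq. (1) can be checked numerically. The following is a minimal sketch, not the drtoolbox code: it assumes f(x) = Wx and A = WᵀW, and the matrix W and the two points are random placeholders. It compares the squared Euclidean distance in the projected space with the quadratic form in the original space.

```python
import random

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def gram(W):
    """Return A = W^T W for an r x d matrix W (the result is d x d and PSD)."""
    d = len(W[0])
    return [[sum(row[a] * row[b] for row in W) for b in range(d)]
            for a in range(d)]

def mahalanobis_sq(A, x1, x2):
    """(x1 - x2)^T A (x1 - x2): the right-hand side of Eq. (1)."""
    diff = [a - b for a, b in zip(x1, x2)]
    return sum(d * v for d, v in zip(diff, matvec(A, diff)))

def projected_sq_dist(W, x1, x2):
    """||W x1 - W x2||^2: the left-hand side of Eq. (1), assuming f(x) = W x."""
    p1, p2 = matvec(W, x1), matvec(W, x2)
    return sum((a - b) ** 2 for a, b in zip(p1, p2))

rng = random.Random(0)  # fixed seed; toy values only
W = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(2)]  # 5-D -> 2-D
x1 = [rng.gauss(0, 1) for _ in range(5)]
x2 = [rng.gauss(0, 1) for _ in range(5)]
A = gram(W)

lhs = projected_sq_dist(W, x1, x2)
rhs = mahalanobis_sq(A, x1, x2)
print(abs(lhs - rhs) < 1e-9)
```

Because both sides agree, a kNN classifier can work directly on the projected features Wx instead of evaluating the quadratic form for every pair of points.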
2.1 Linear Discriminant Analysis (LDA)
Linear Discriminant Analysis (LDA) (Weinberger
and Saul, 2009) computes the linear projection
$x_i \rightarrow A x_i$
that maximizes the amount of between-class variance
($C_b$) relative to the amount of within-class variance
($C_w$). The objective function is defined as
$$f(A) = \mathrm{Trace}\left( \frac{A C_b A^T}{A C_w A^T} \right) \quad \text{subject to} \quad A A^T = I \tag{2}$$
LDA-based DR works well when the reduced
dimensionality is less than the number of classes. In
addition, the conditional densities of the classes must
be multivariate Gaussian. If this requirement is not
fulfilled, the transformed features are not
suitable for kNN. This method has been applied to
spoken emotion recognition (Zhang and
Zhao, 2013), EEG-based emotion recognition
(Valenzi, et al., 2014), and ECG-based individual
identification (Fratini, et al., 2015).
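To illustrate what the LDA objective rewards, the sketch below evaluates the ratio of between-class to within-class scatter for a single unit-length projection direction, i.e. the special case of Eq. (2) in which A has one row. The three-class toy data and the function name `lda_ratio` are assumptions made for illustration; this is not the drtoolbox implementation.

```python
from math import sqrt

def lda_ratio(a, X, y):
    """Between-class over within-class scatter of X projected onto direction a."""
    norm = sqrt(sum(v * v for v in a))  # enforce the unit-length constraint
    proj = [sum(ai / norm * xi for ai, xi in zip(a, x)) for x in X]
    groups = {}
    for p, label in zip(proj, y):
        groups.setdefault(label, []).append(p)
    overall = sum(proj) / len(proj)
    between = sum(len(g) * (sum(g) / len(g) - overall) ** 2
                  for g in groups.values())
    within = sum((p - sum(g) / len(g)) ** 2
                 for g in groups.values() for p in g)
    return between / within

# Toy data: three classes separated along the first axis,
# with class-independent spread along the second axis.
X, y = [], []
for c in range(3):
    for dx in (-0.5, 0.5):
        for dy in (-2.0, 2.0):
            X.append([3.0 * c + dx, dy])
            y.append(c)

ratio_x = lda_ratio([1.0, 0.0], X, y)  # direction that separates the classes
ratio_y = lda_ratio([0.0, 1.0], X, y)  # direction that carries no class information
print(ratio_x > ratio_y)
```

The class-separating direction scores higher, which is the behaviour LDA exploits when it picks its projection.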
2.2 Neighbourhood Components Analysis (NCA)
Neighbourhood Components Analysis (NCA)
(Goldberger, et al., 2005) is a non-parametric method that
makes no assumption about the shape of the class
distributions or the boundaries between them. The
algorithm directly maximizes a stochastic variant of
the leave-one-out kNN score on the training set. The
final goal is to find a transformation matrix such that
in the transformed space, the kNN performs well. The
size of the transformation matrix determines the
dimension of the transformed features. Using this
method, one can visualize high dimensional features
in 2D or 3D space.
To deal with the discontinuity of the leave-one-
out classification error of kNN, a differentiable cost
function based on stochastic (“soft”) neighbour
assignment in the transformed space was introduced
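(Goldberger, et al., 2005). The idea is to use a softmax
function to transform the distance from point i to point j into
the probability $p_{ij}$ that point i selects point j as its
neighbour and inherits its class label from the
selected point:
$$p_{ij} = \frac{\exp\left(-\| A x_i - A x_j \|^2\right)}{\sum_{k \neq i} \exp\left(-\| A x_i - A x_k \|^2\right)}, \qquad p_{ii} = 0 \tag{3}$$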
with the objective function defined as
$$f(A) = \sum_i \sum_{j \in C_i} p_{ij} = \sum_i p_i \tag{4}$$
The algorithm searches for the transformation matrix
A such that the objective function is maximized. For
learning, it uses a gradient rule obtained by
differentiating f(A) with respect to the transformation
matrix A. NCA was able to separate the part of the
data carrying useful information from noise, which
resulted in dimensionality reduction (Goldberger, et al.,
2005).
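The soft-neighbour probabilities of Eq. (3) and the objective of Eq. (4) can be sketched as follows, where the set C_i in Eq. (4) contains the points sharing the class of point i. This is an illustrative sketch on hypothetical toy data, not the drtoolbox code, and it only evaluates f(A) for two hand-picked matrices instead of running the gradient-based optimization.

```python
from math import exp

def project(A, x):
    """Apply the transformation matrix A (list of rows) to the point x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def sqdist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def soft_neighbours(A, X):
    """Row i holds the probabilities p_ij of Eq. (3); the diagonal is zero."""
    Z = [project(A, x) for x in X]
    P = []
    for i, zi in enumerate(Z):
        w = [0.0 if k == i else exp(-sqdist(zi, zk)) for k, zk in enumerate(Z)]
        total = sum(w)
        P.append([v / total for v in w])
    return P

def nca_objective(A, X, y):
    """f(A) of Eq. (4): the sum of p_ij over pairs with the same class label."""
    P = soft_neighbours(A, X)
    return sum(p for i, row in enumerate(P)
               for j, p in enumerate(row) if y[j] == y[i])

# Toy data: the classes differ along the first axis only.
X = [[0.0, 0.0], [0.0, 1.5], [0.1, 3.0], [2.0, 0.0], [2.0, 1.5], [2.1, 3.0]]
y = [0, 0, 0, 1, 1, 1]

f_good = nca_objective([[1.0, 0.0]], X, y)  # keeps the informative axis
f_bad = nca_objective([[0.0, 1.0]], X, y)   # keeps the uninformative axis
print(f_good > f_bad)
```

A projection that keeps the informative axis yields a larger f(A), which is the quantity the gradient rule increases during learning.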
The NCA has been applied to research in
Affective Computing, e.g. Zhang and Zhao applied it
to the spontaneous Chinese and the acted Berlin
database for spoken emotion recognition and then
compared it with other dimensionality reduction
methods (Zhang and Zhao, 2013). McDuff et al. used
NCA in the AffectAura project (McDuff, et al.,
2012). Romero et al. applied NCA to reduce the
dimensionality of features from EEG (Romero, et al.,
2015).
2.3 Maximally Collapsing Metric Learning (MCML)
Maximally Collapsing Metric Learning (MCML)
(Globerson and Roweis, 2006) uses simple geometric
intuition that all points belonging to the same class
are mapped (collapsed) to a single location in feature
space and all points from the other classes are mapped
to other locations. The main goal is to find a
transformation matrix A that realizes this
geometric intuition. To learn the distance
measure, each training point i is assigned a
conditional probability, $p^A(j|i) = p^A_{ij}$, over the other
points using a softmax function. From the conditional
probability point of view, the ideal probability is 1
when both points belong to the same class and zero
otherwise. Given pairs of input and label
$(x_i, y_i)$, this ideal conditional probability is defined as
$$p^*_{ij} = \begin{cases} 1, & y_i = y_j \\ 0, & y_i \neq y_j \end{cases} \tag{5}$$
The algorithm searches for a matrix A such that
$p^A_{ij}$ is as close as possible to $p^*_{ij}$ by minimizing the
objective function f(A), i.e. the Kullback-Leibler
divergence between them, subject to $A \in \mathrm{PSD}$. The
objective function (Globerson and Roweis, 2006) is
defined as
$$f(A) = \sum_i \mathrm{KL}\left[ p_0(j|i) \,\|\, p^A(j|i) \right] \tag{6}$$
where $p_0(j|i)$ denotes the ideal distribution $p^*_{ij}$ of Eq. (5).
The MCML has been applied to spoken emotion
recognition (Zhang and Zhao, 2013) and EEG-based
Iyashi expression analysis (Romero, et al., 2015).
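The MCML objective can be evaluated in a few lines once the distances are computed in a projected space. The sketch below is illustrative only: it parameterizes A as WᵀW, assumes that the ideal distribution is normalized to sum to one over the same-class points (a normalization the definition above does not state explicitly), and uses hypothetical toy data. It does not perform the constrained minimization over PSD matrices.

```python
from math import exp, log

def mcml_objective(W, X, y):
    """Sum over i of KL[p_0(j|i) || p^A(j|i)], with A = W^T W."""
    Z = [[sum(w * xi for w, xi in zip(row, x)) for row in W] for x in X]
    total = 0.0
    for i, zi in enumerate(Z):
        # Unnormalized softmax weights over the other points.
        w = [0.0 if k == i else exp(-sum((a - b) ** 2 for a, b in zip(zi, zk)))
             for k, zk in enumerate(Z)]
        norm = sum(w)
        same = [j for j in range(len(X)) if j != i and y[j] == y[i]]
        p0 = 1.0 / len(same)  # assumed: ideal distribution is uniform over same-class points
        # Pairs with p_0 = 0 contribute nothing to the KL divergence.
        total += sum(p0 * log(p0 / (w[j] / norm)) for j in same)
    return total

# Toy data: the classes differ along the first axis only.
X = [[0.0, 0.0], [0.0, 1.5], [0.1, 3.0], [2.0, 0.0], [2.0, 1.5], [2.1, 3.0]]
y = [0, 0, 0, 1, 1, 1]

kl_good = mcml_objective([[1.0, 0.0]], X, y)  # nearly collapses each class
kl_bad = mcml_objective([[0.0, 1.0]], X, y)   # mixes the classes
print(kl_good < kl_bad)
```

The projection that nearly collapses each class onto one location gives the smaller divergence, matching the geometric intuition behind the method.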
3 MATERIAL AND METHODS
3.1 ECG Signal Processing
The Mahnob-HCI database contains 32-channel
EEG, peripheral physiological signals (ECG,
temperature, respiration, skin conductance), face and
body video, speech, and eye gaze recording from 27
subjects (11 males and 16 females). All signals were
precisely synchronized, which makes the database suitable for
multimodal emotional response studies. The ECG
signals were sampled at 256 Hz (Soleymani, et al.,
2012).
We used the same data as in (Ferdinando, et al.,
2014), i.e. “Selection of Emotion Elicitation” in the
database. The original data contain 513 samples.
However, the sample from session 2508 was
discarded because visual inspection showed that it was
corrupted. Thus, we worked with 512 samples, which
were passed through several filters to suppress noise
from power line interference, baseline drift, motion artifact,
electrode contact, and muscle contraction
(Soleymani, et al., 2012).
The ECG signals contain data from both the
unstimulated and the stimulated phase. Since we were
only interested in ECG during the stimulated phase, this
part was separated from the rest using the
synchronization signal provided by the database.
The BEMD method (Rilling, et al., 2007) was
used to get features from ECG. In our
experiments, the BEMD method was sensitive to the
length of the signal. For this reason, the ECG signal
was divided into 5-second segments. A synthetic ECG
signal, with its R-wave events synchronized to those of
the original signal, was generated using the model
from McSharry et al. (McSharry, et al., 2003). This
signal served as the imaginary part of the ECG signal
while the original served as the real part. This
complex-valued ECG signal was analyzed by the
BEMD method, resulting in 5-6 intrinsic mode
functions (IMFs). The first three IMFs, as suggested
by Agrafioti et al. (Agrafioti, et al., 2012), were
analyzed for dominant frequencies using spectrogram
analysis (Ferdinando, et al., 2016). The spectrogram
analysis relies on two parameters, i.e. window size
and overlap. The dominant frequencies of all 5-second
segments are collected and various features are
calculated from them. The features are based on the
statistical distribution of the dominant frequencies
and their first difference: mean, standard deviation,
median, Q1, Q3, IQR, skewness, kurtosis, percentile
2.5, percentile 10, percentile 90, percentile 97.5,
maximum, and minimum. The results are grouped into
three sets: feature1 (statistical distribution of the
dominant frequencies; 84 features), feature2
(statistical distribution of the dominant frequencies’
first difference; 84 features), and feature12 (the
combination of feature1 and feature2; 168 features). The best
features are then selected from each group with
sequential forward-floating search. The number of
selected features varies from two to twenty-three,
depending on whether valence or arousal is
recognized and on the parameters used in the
spectrogram analysis (Ferdinando, et al., 2016).
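The fourteen statistics listed above can be computed from one sequence of dominant frequencies as sketched below. The percentile convention (linear interpolation), the use of the sample standard deviation, and the moment-based, non-excess definitions of skewness and kurtosis are assumptions, since the text does not specify them. The input values are hypothetical.

```python
from statistics import mean, median, stdev

def percentile(sorted_vals, q):
    """Linearly interpolated percentile of pre-sorted values, q in [0, 100]."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)

def central_moment(vals, k):
    m = mean(vals)
    return sum((v - m) ** k for v in vals) / len(vals)

def distribution_features(vals):
    """Return the 14 distribution statistics named in the text."""
    s = sorted(vals)
    q1, q3 = percentile(s, 25), percentile(s, 75)
    m2 = central_moment(vals, 2)
    return {
        "mean": mean(vals), "std": stdev(vals), "median": median(vals),
        "q1": q1, "q3": q3, "iqr": q3 - q1,
        "skewness": central_moment(vals, 3) / m2 ** 1.5,
        "kurtosis": central_moment(vals, 4) / m2 ** 2,  # non-excess (assumed)
        "p2_5": percentile(s, 2.5), "p10": percentile(s, 10),
        "p90": percentile(s, 90), "p97_5": percentile(s, 97.5),
        "max": s[-1], "min": s[0],
    }

def first_difference(vals):
    return [b - a for a, b in zip(vals, vals[1:])]

# Hypothetical dominant frequencies (Hz) of consecutive segments.
dominant = [1.0, 1.5, 1.25, 2.0, 1.75, 1.5, 1.0, 1.25]
stats_freq = distribution_features(dominant)                    # feature1-style
stats_diff = distribution_features(first_difference(dominant))  # feature2-style
print(len(stats_freq), len(stats_diff))
```

Concatenating the two dictionaries corresponds to the feature12 idea of combining both descriptions of the same signal.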
3.2 Dimensionality Reduction
The chosen DR methods, LDA, NCA, and MCML,
are applied to the features selected for a given
combination of window size and overlap parameters in
the spectrogram analysis (Ferdinando, et al.,
2016) to get features with lower dimensionality. The
initial matrices A for NCA and MCML are generated
with a random number generator, so there
is no guarantee that they yield the optimum matrix
A in one pass. The algorithm is therefore modified to be
iterative: it stops (a flag is set) when there
is no improvement, validated using leave-one-out
cross-validation, within 200 iterations. DR is
applied only when the number of selected
features is greater than the target dimensionality. The
optimum projection matrix A is saved for further
processing.
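One way to read the iterative scheme is as a random-restart search with a patience counter. The sketch below follows that reading, which is an assumption: each iteration draws a new random initial matrix, runs a DR routine, scores the result, and stops after 200 consecutive iterations without improvement. The callables `fit` and `score` are hypothetical stand-ins for the DR algorithm and the leave-one-out validation, and they are replaced by toy functions here.

```python
import random

def search_with_patience(fit, score, shape, patience=200, seed=0):
    """Keep the best matrix found; stop after `patience` iterations without improvement."""
    rng = random.Random(seed)
    best_A, best_score, stale, runs = None, float("-inf"), 0, 0
    while stale < patience:
        init = [[rng.gauss(0, 1) for _ in range(shape[1])]
                for _ in range(shape[0])]
        A = fit(init)   # hypothetical: run NCA or MCML from this initial matrix
        s = score(A)    # hypothetical: leave-one-out kNN accuracy of the result
        runs += 1
        if s > best_score:
            best_A, best_score, stale = A, s, 0
        else:
            stale += 1
    return best_A, best_score, runs

# Toy stand-ins so the sketch runs: no real DR or validation is performed.
toy_fit = lambda init: init
toy_score = lambda A: -sum(v * v for row in A for v in row)

best_A, best_score, runs = search_with_patience(toy_fit, toy_score, shape=(2, 4))
print(runs, best_score)
```

The patience counter resets on every improvement, so the total number of iterations is not known in advance.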
3.3 Classifier and Validation Methods
We used a kNN classifier as in (Ferdinando, et al.,
2016) to solve the original 3-class classification
problem for valence and arousal. 20% of the data are
held out for validation while the rest are subject to
10-fold cross validation. The classifier model is built
on the projection of the selected features using
the optimum projection matrix A found during the DR
phase. The whole validation process is repeated 100
times with new resampling in each iteration. The
average over the repetitions represents the final
accuracy. When the accuracies from different
combinations of window size and overlap parameters
are close to each other, the final accuracy is determined
by invoking the Law of Large Numbers (LLN), i.e. by
increasing the number of repetitions.
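The repeated cross-validation of a kNN classifier on projected features can be sketched as below. This is a simplified illustration under stated assumptions: the 20% hold-out step is omitted, k = 3 is an arbitrary choice, the projection matrix A is a hypothetical stand-in for a learned one, and the data are synthetic.

```python
import random
from collections import Counter
from statistics import mean, stdev

def project(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points closest to x."""
    order = sorted(range(len(train_X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def kfold_accuracy(X, y, folds=10, k=3, seed=0):
    """Accuracy of kNN under one shuffled k-fold split."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test = idx[f::folds]
        held = set(test)
        train = [i for i in idx if i not in held]
        tx, ty = [X[i] for i in train], [y[i] for i in train]
        correct += sum(knn_predict(tx, ty, X[i], k) == y[i] for i in test)
    return correct / len(X)

# Synthetic 3-class data: two informative axes plus one noisy axis.
rng = random.Random(1)
centres = {0: (0.0, 0.0), 1: (5.0, 0.0), 2: (0.0, 5.0)}
X, y = [], []
for label, (cx, cy) in centres.items():
    for _ in range(10):
        X.append([cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5), rng.gauss(0, 5)])
        y.append(label)

A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # hypothetical projection: drops the noisy axis
Z = [project(A, x) for x in X]

scores = [kfold_accuracy(Z, y, seed=r) for r in range(5)]  # new resampling per repetition
mean_acc, sd_acc = mean(scores), stdev(scores)
print(mean_acc, sd_acc)
```

Reporting the mean and standard deviation over repetitions mirrors the "accuracy ± spread" format used in the result tables.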
Another validation for the result is leave-one-
subject-out (LOSO) validation. The main idea is to
evaluate if the transformed features are general
enough to work well with features from new subjects.
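A LOSO split can be generated from a list of subject identifiers, one per sample. The sketch below is a minimal illustration; the subject labels are hypothetical, and in each round all samples of one subject form the test set.

```python
def loso_splits(subject_ids):
    """Yield (subject, train_indices, test_indices), one round per subject."""
    for subject in sorted(set(subject_ids)):
        test = [i for i, s in enumerate(subject_ids) if s == subject]
        train = [i for i, s in enumerate(subject_ids) if s != subject]
        yield subject, train, test

# Hypothetical subject labels for six samples.
subjects = ["s01", "s01", "s02", "s03", "s03", "s03"]
splits = list(loso_splits(subjects))
for subject, train, test in splits:
    print(subject, train, test)
```

Because no sample of the test subject appears in the training set, the resulting accuracy reflects performance on unseen subjects.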
4 RESULTS
Tables 1 to 4 show the best results for each target
dimensionality of each DR algorithm with 10-fold
cross validation and 100 iterations.
Table 1: Accuracy after applying LDA DR for valence and
arousal.
Dimensionality Valence Arousal
2D 55.1 ± 7.4 59.9 ± 6.8
Since this is a 3-class problem, the highest
dimensionality that LDA can yield is two.
Surprisingly, the accuracies for both valence and
arousal are close to those in (Ferdinando, et al., 2016). The
benefit, however, is less storage and faster
calculation than with standard kNN.
Table 2: Accuracy after applying NCA DR for valence and
arousal.
Dimensionality Valence Arousal
2D 61.3 ± 7.2 65.6 ± 6.2
3D 57.0 ± 8.0 66.0 ± 8.1
4D 65.3 ± 6.5 60.1 ± 7.7
5D 64.5 ± 6.7 61.0 ± 8.1
6D 53.2 ± 7.6 61.5 ± 7.5
7D 60.4 ± 6.6 61.2 ± 7.2
Results from the NCA for both valence and
arousal are promising, since the best accuracies for
valence and arousal are even higher than in
(Ferdinando, et al., 2016).
Table 3: Accuracy after applying MCML DR for valence
and arousal.
Dimensionality Valence Arousal
2D 54.5 ± 7.9 60.5 ± 7.5
3D 54.6 ± 7.4 48.9 ± 7.3
4D 41.8 ± 6.9 49.3 ± 7.2
5D 41.9 ± 7.2 49.3 ± 7.1
6D 42.1 ± 7.6 49.2 ± 7.0
7D 43.5 ± 7.3 48.4 ± 8.9
The best results based on the MCML DR from
both valence and arousal are close to the ones in
(Ferdinando, et al., 2016). It also results in less
storage for the data and faster computation than
standard kNN.
Table 4 compares the results of LDA, NCA,
and MCML side by side. It shows that NCA
outperforms the other methods. The difference is
roughly 10 and 5 percentage points for valence and
arousal, respectively.
Table 4: Best accuracies of the dimensionality reduction
methods.
          LDA          NCA                                MCML
Valence   55.1 ± 7.4   65.3 ± 6.5 (4D), 64.5 ± 6.7 (5D)   54.6 ± 7.4 (3D), 54.5 ± 7.9 (2D)
Arousal   59.9 ± 6.8   66.0 ± 8.1 (3D), 65.6 ± 6.2 (2D)   60.5 ± 7.5 (2D)
Since the most promising results in some cells in
Table 4 are close to each other, the Law of Large
Numbers is used to estimate accuracies as close as
possible to the true ones. After 1000 iterations, the
best results are in Table 5.
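The benefit of more repetitions can be made concrete with the standard error of the mean, which shrinks with the square root of the number of repetitions. The sketch below is only a rough illustration: it treats the repetitions as independent, which is optimistic because the resampled folds share the same data, and it uses a spread of 7.4 as reported for NCA in Table 5.

```python
from math import sqrt

def standard_error(std, repetitions):
    """Standard error of a mean over independent repetitions (assumed independent)."""
    return std / sqrt(repetitions)

sem_100 = standard_error(7.4, 100)    # 100 repetitions
sem_1000 = standard_error(7.4, 1000)  # 1000 repetitions
print(round(sem_100, 2), round(sem_1000, 2))
```

Under this assumption, going from 100 to 1000 repetitions reduces the uncertainty of the averaged accuracy by a factor of about 3.2, which helps to separate results that are close to each other.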
Table 5: Applying LLN based on Table 4.
          LDA          NCA               MCML
Valence   54.2 ± 7.4   64.1 ± 7.4 (4D)   53.6 ± 7.3 (3D)
Arousal   59.8 ± 7.3   66.1 ± 7.4 (3D)   59.5 ± 7.1 (2D)
NCA clearly outperforms the other methods.
The rest of the experiments are related to
LOSO validation. Tables 6 to 8 summarize these
experiments for valence and arousal.
Table 6: Accuracy after applying LDA DR for valence and
arousal in LOSO validation.
Dimensionality Valence Arousal
2D 56.5 ± 10.7 60.6 ± 9.1
The accuracies for both valence and arousal based
on LOSO validation reveal the same pattern as in
10-fold cross validation (see Table 1), i.e. the accuracy for
arousal is higher than that for valence. These
accuracies are also close to the ones in Table 1. For
valence, the result came from the same window size
and overlap parameters in the spectrogram analysis,
but this was not the case for arousal.
By comparing Table 2 and Table 7, one can
observe that the best result for arousal came from
the same dimensionality. A closer look at the
experiments shows that the best result also
came from the same window size and overlap
parameters in the spectrogram analysis. However,
valence did not show this pattern.
Table 7: Accuracy after applying NCA DR for valence and
arousal in LOSO validation.
Dimensionality Valence Arousal
2D 61.7 ± 14.1 69.6 ± 12.4
3D 59.4 ± 11.6 51.1 ± 9.5
4D 44.0 ± 12.0 53.3 ± 11.0
5D 40.1 ± 12.0 47.3 ± 11.9
6D 40.0 ± 13.0 51.5 ± 8.6
7D 38.7 ± 11.1 45.7 ± 12.3
Table 8: Accuracy after applying MCML DR for valence
and arousal in LOSO validation.
Dimensionality Valence Arousal
2D 55.9 ± 9.3 61.7 ± 12.3
3D 56.3 ± 12.1 50.2 ± 9.8
4D 41.9 ± 10.6 50.2 ± 10.0
5D 38.8 ± 10.6 50.5 ± 10.4
6D 39.3 ± 11.0 50.3 ± 10.5
7D 39.1 ± 10.8 48.4 ± 8.9
Similar to the NCA result, the accuracy for
arousal also came from the same dimensionality and
parameters of the spectrogram analysis, but not for
valence.
Table 9: Accuracies of all dimensionality reduction
methods in LOSO validation.
LDA NCA MCML
Valence 56.5 ± 10.7 61.7 ± 14.1 56.3 ± 12.1
Arousal 60.6 ± 9.1 69.6 ± 12.4 61.7 ± 12.3
Significance was assessed using a t-test
at a significance level of 0.05. For valence, LDA
and NCA were compared; the p-value was 0.035,
indicating that NCA was superior to LDA. For
arousal, the test showed (p-value 0.0016) that NCA
was superior to MCML.
5 DISCUSSION
As mentioned in the Supervised Dimensionality
Reduction section, DR with LDA has the limitation
that it can only reduce the dimensionality to a number
lower than the number of classes. The other
algorithms can target any dimensionality as
long as it is smaller than the dimensionality of the
original feature space. With this limitation, LDA
did not improve the accuracy; it
only saves some storage space and computational
load.
MCML, inspired by NCA (Globerson and
Roweis, 2006), most of the time failed to find the
optimum projection matrix for the features. There
was no improvement in the accuracy of the system
compared to using the original feature set. It reduced
the dimensionality from four to three for valence and
from three to two for arousal. As with LDA, the
contribution of MCML is a slight saving in storage
space.
NCA significantly improved the accuracy of
emotion recognition. The dimensionality of the
feature set was reduced from twenty-three to four
for valence and from twenty-two to three for arousal.
For a small number of samples, this reduction might
not matter much, but it would for big
data analysis.
The result of this study is compared to the
accuracies from the previous study (Ferdinando, et
al., 2016), see Table 10.
Table 10: Comparison of results to a reference paper, 10-
fold cross validation.
          Reference (Ferdinando, et al., 2016)   DR experiment (NCA)
Valence   55.8 ± 7.3                             64.1 ± 7.4 (4D)
Arousal   59.7 ± 7.0                             66.1 ± 7.4 (3D)
We verified whether applying DR to the features indeed
improves the accuracy of the system using a t-test
at a significance level of 0.05, with the null
hypothesis that both come from the same distribution.
The p-values for both valence and arousal are close to
zero, indicating that the improvements are significant.
Table 11: Comparison of results to a reference paper, LOSO
validation.
          Original (Ferdinando, et al., 2016)   DR experiment (NCA)
Valence   59.2 ± 11.4                           61.7 ± 14.1
Arousal   58.7 ± 9.1                            69.6 ± 12.4
We used a t-test again, at a significance level of 0.05,
to verify whether applying DR improves the
performance of the system in LOSO
validation. The p-values
were 0.1873 and 0.0001 for valence and arousal,
respectively, indicating that there is no significant
difference between the original and the DR experiment
for valence but a significant improvement
for arousal recognition.
During this study, the algorithms were modified
to be iterative with a simple stopping
criterion. Further studies of iterative
algorithms are needed to benefit more
from supervised dimensionality reduction. It
might also be possible to investigate how to initialize
matrix A without a random number generator.
6 CONCLUSIONS
This paper explored supervised DR in emotion
recognition based on the Mahnob-HCI database. It
was shown that the supervised DR based on NCA
increased the accuracy from 55.8% to 64.1% and
from 59.7% to 66.1% for a 3-class problem in valence
and arousal respectively using 10-fold cross
validation. Compared to the initial baseline
(Ferdinando, et al., 2014), the accuracies improved
significantly, by around 20 percentage points.
With LOSO validation, the supervised DR based
on NCA increased the accuracy of arousal recognition
from 58.7% to 69.6% for the 3-class problem. However,
it failed to improve the accuracy for valence, as
indicated by the statistical significance test.
The generalisability of these results is subject to
certain limitations. For instance, the iterative
algorithm was very simple, so the whole system may
not have gained the full benefit of the supervised
dimensionality reduction techniques. Another
important limitation is the initialization of matrix A,
which used a random number generator. Using
a more sophisticated initialization might improve the
performance.
Among the three methods explored in this paper,
the NCA showed its superiority when it was applied
to the Mahnob-HCI database, although the MCML
was developed to improve the performance of the
NCA. Yet, it will be very interesting to explore the
same methods with other databases and various
applications in order to draw more comprehensive
conclusions for the supervised DR applied to emotion
recognition based on physiological signals.
ACKNOWLEDGEMENTS
This research was supported by the Directorate
General of Higher Education, Ministry of Higher
Education and Research, Republic of Indonesia, No.
2142/E4.4/K/2013, the Finnish Cultural Foundation
North Ostrobothnia Regional Fund and the
Optoelectronics and Measurement Techniques unit,
University of Oulu, Finland.
REFERENCES
Agrafioti, F., Hatzinakos, D. & Anderson, A. K., 2012.
ECG Pattern Analysis for Emotion Detection. IEEE
Transactions on Affective Computing, 3(1), pp. 102-
115.
Ferdinando, H., Seppänen, T. & Alasaarela, E., 2016.
Comparing Features from ECG Pattern and HRV
Analysis for Emotion Recognition System. Chiang Mai,
Thailand, The annual IEEE International Conference on
Computational Intelligence in Bioinformatics and
Computational Biology (CIBCB 2016).
Ferdinando, H., Ye, L., Seppänen, T. & Alasaarela, E.,
2014. Emotion Recognition by Heart Rate Variability.
Australian Journal of Basic and Applied Sciences,
8(14), pp. 50-55.
Fratini, A., Sansone, M., Bifulco, P. & Cesarelli, M., 2015.
Individual identification via electrocardiogram
analysis. BioMedical Engineering OnLine, 14(78), pp.
1-23.
Globerson, A. & Roweis, S., 2006. Metric Learning by
Collapsing Classes. In: Y. Weiss & B. Schölkopf, eds.
Advances in Neural Information Processing Systems
18. Cambridge, MA: MIT Press, p. 451–458.
Goldberger, J., Roweis, S., Hinton, G. & Salakhutdinov, R.,
2005. Neighborhood Components Analysis. In: L. K.
Saul, Y. Weiss & L. Bottou, eds. Advances in Neural
Information Processing System Vol. 17. Cambridge:
MIT Press, p. 513–520.
Jolliffe, I., 2002. Principal Component Analysis. 2 ed. New
York: Springer Verlag.
Labiak, J. & Livescu, K., 2011. Nearest Neighbors with
Learned Distances for Phonetic Frame Classification.
Florence, Italy, International Speech Communication
Association (ISCA).
Lee, J. A. & Verleysen, M., 2010. Unsupervised
Dimensionality Reduction: Overview and Recent
Advances. Barcelona, Spain, IEEE World Congress on
Computational Intelligence (WCCI) 2010.
McDuff, D. et al., 2012. AffectAura: an intelligent system
for emotional memory. New York, Association for
Computing Machinery (ACM).
McSharry, P. E., Clifford, G. D., Tarassenko, L. & Smith,
L. A., 2003. A Dynamical Model for Generating
Synthetic Electrocardiogram Signals. IEEE
Transactions on Biomedical Engineering, 50(3), pp.
289-294.
Rilling, G., Flandrin, P., Gonçalves, P. & Lilly, J. M., 2007.
Bivariate Empirical Mode Decomposition. IEEE Signal
Processing Letters, 14(12), pp. 936-939.
Romero, J., Diago, L., Shinoda, J. & Hagiwara, I., 2015.
Comparison of Data Reduction Methods for the
Analysis of Iyashi Expressions using Brain Signals.
Journal of Advanced Simulation in Science and
Engineering, 2(2), pp. 349-366.
Sammon, J. W., 1969. A nonlinear mapping algorithm for
data structure analysis. IEEE Transactions on
Computers, C-18(5), pp. 401-409.
Soleymani, M., Lichtenauer, J., Pun, T. & Pantic, M., 2012.
A Multimodal Database for Affect Recognition and
Implicit Tagging. IEEE Transactions on Affective
Computing, 3(1), pp. 1-14.
Sugiyama, M., 2007. Dimensionality Reduction of
Multimodal Labeled Data by Local Fisher Discriminant
Analysis. Journal of Machine Learning Research,
Volume 8, pp. 1027-1061.
Valenzi, S., Islam, T., Jurica, P. & Cichocki, A., 2014.
Individual Classification of Emotions Using EEG.
Journal of Biomedical Science and Engineering,
Volume 7, pp. 604-620.
van der Maaten, L., 2016. Matlab Toolbox for
Dimensionality Reduction - Laurens van der Maaten.
[Online]
Available at: https://lvdmaaten.github.io/drtoolbox/
[Accessed 28 7 2016].
Weinberger, K. Q., Blitzer, J. & Saul, L. K., 2005. Distance
Metric Learning for Large Margin Nearest Neighbor
Classification. Advances in Neural Information
Processing System, Volume 18, p. 1473–1480.
Weinberger, K. Q. & Saul, L. K., 2009. Distance Metric
Learning for Large Margin Nearest Neighbor
Classification. Journal of Machine Learning Research,
Volume 10, pp. 207-244.
Zhang, S. & Zhao, X., 2013. Dimensionality reduction-
based spoken emotion recognition. Multimedia Tools
and Applications, 63(3), p. 615–646.