Clustering of Gabor Atoms Describing Event-Related Potentials
Solution for ERP Detection Algorithm based on Matching Pursuit when ERP
Waveform is Approximated by Two or More Gabor Atoms
Tomas Rondik and Pavel Mautner
Department of Computer Science and Engineering, University of West Bohemia, Univerzitni 8, Pilsen, Czech Republic
Keywords: SOM, Self-organizing Map, Kohonen Neural Network, ANN, Artificial Neural Network, Clustering, MP,
Matching Pursuit Algorithm, Gabor Atoms, CCL, Connected-Component Labeling, ERP, Event-Related
Potential, P3 Component, EEG, Electroencephalography.
Abstract: Our research group focuses, among other topics, on methods for the automatic detection of
event-related potentials in the EEG signal. In one of our previous papers, we published an algorithm for
event-related potential detection based on the matching pursuit algorithm. This method fails in one
particular situation: when the waveform of the event-related potential is approximated by more than one
function from the matching pursuit dictionary of base functions. This paper introduces a solution to this
issue based on the self-organizing map and the connected-component labeling algorithm. The solution
groups the functions related to one kind of event-related potential into a cluster, which should prevent
the detection algorithm based on matching pursuit from the failure described above.
1 INTRODUCTION
There is a well-known phenomenon which is related
to the EEG domain – event-related potentials
(ERPs). ERPs are waveforms with specific
frequencies, latencies, and polarities (note that there
is a special family of ERPs which has alternating
polarity) independent of EEG activity.
Unfortunately, the amplitude of the EEG signal is
about ten times greater than the amplitude of the
strongest ERP waveform. If we regard the background
EEG as noise (together with interference from
surrounding electromagnetic fields and signals of
non-cerebral origin such as eye movements, muscle
activity, and EKG activity), the signal-to-noise
ratio is very low.
This is the main reason why it is not trivial for
neuroscientists to recognize typical ERP waveforms
which appear in the electroencephalogram. Of
course, it is even more complicated for automatic
detection algorithms.
Many methods can be used for ERP detection. One
of them is the matching pursuit (MP) algorithm,
which decomposes the EEG/ERP signal into atoms,
i.e. functions chosen from a dictionary of base
functions. Svoboda, Mautner and Moucek (2008)
demonstrated its ability to detect the P3 waveform.
To detect an ERP component in the EEG/ERP signal,
they look for an atom which resembles the ERP
waveform.
However, there is a problem which can cause the
detection to fail. It occurs when the ERP waveform
is decomposed into more than one atom. In this
case, none of these atoms approximates the ERP
waveform well enough.
In this paper, we introduce an approach which
avoids this detection failure. The approach is based
on the identification of all atoms describing the
ERP waveform.
The paper is organized as follows: Section 2 is a
brief introduction to the ERP domain. It explains
what ERPs are and which ERP is used in this paper.
Section 3 introduces the MP algorithm and the
principle of the ERP detection method introduced by
Svoboda et al. (2008), as well as the problem with
the decomposition of an ERP waveform into multiple
atoms. The method for identifying atoms which
approximate the ERP waveform is described in
Section 4. The methodology for evaluating its
ability to identify ERP waveforms is described in
Section 5. The last two sections contain a summary
of results and the conclusion.
Rondik T. and Mautner P.
Clustering of Gabor Atoms Describing Event-Related Potentials - Solution for ERP Detection Algorithm based on Matching Pursuit when ERP Waveform is Approximated by Two or More Gabor Atoms.
DOI: 10.5220/0004189203090314
In Proceedings of the International Conference on Health Informatics (HEALTHINF-2013), pages 309-314
ISBN: 978-989-8565-37-2
Copyright © 2013 SCITEPRESS (Science and Technology Publications, Lda.)
2 EVENT-RELATED POTENTIALS
An event-related potential is any measurable brain
response that is a direct result of a specific sensory,
cognitive, or motor event. More formally, it is any
stereotyped electrophysiological response to a
stimulus (Luck, 2005). This brain response is
characterized by its amplitude, frequency, and
latency (time between stimulus and
maximum/minimum value of the waveform).
We decided to use the P3 component, the third
positive wave of the visual event-related potential
(see Figure 1), in our experiment. The P3 waveform
occurs only if the subject is actively engaged in
the task of detecting the targets. Its amplitude
varies with the improbability of the targets, and
its latency varies with the difficulty of
discriminating the target stimulus from the
standard stimuli (Picton, 1992).
Figure 1: Typical amplitudes, frequencies, latencies, and
waveforms of some ERPs (Luck, 2005). In this figure the
axis of functional values grows downwards.
The P1, N1, P2, and N2 waveforms are related to
sensory perception of stimuli and they are not
important for this paper. A detailed description of
ERPs was given by Luck (2005).
3 MATCHING PURSUIT ALGORITHM
The MP algorithm decomposes any signal into atoms,
which are selected from a dictionary. In each
iteration, the atom that approximates the input
signal most closely is chosen. This atom is
subtracted from the input signal and the residue
enters the next iteration of the algorithm. The sum
of the atoms selected in successive iterations is
an approximation of the original signal: the more
iterations we perform, the more accurate the
approximation becomes (Rondik and Ciniburk, 2011).
The difference between the input signal and its
approximation converges to zero with an increasing
number of MP iterations.
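The iteration described above can be sketched as follows. This is a minimal illustration, not the implementation used in the experiment: the function names, the unit-norm requirement on the dictionary, and the toy cosine dictionary are assumptions made for the example.

```python
import math

def dot(a, b):
    """Inner product of two equally long sequences."""
    return sum(x * y for x, y in zip(a, b))

def normalize(atom):
    """Scale an atom to unit energy so that inner products are comparable."""
    norm = math.sqrt(dot(atom, atom))
    return [x / norm for x in atom]

def matching_pursuit(signal, dictionary, n_iterations):
    """Greedy decomposition; returns (selected, residue).

    selected is a list of (atom_index, coefficient) pairs.
    The dictionary atoms are assumed to have unit norm.
    """
    residue = list(signal)
    selected = []
    for _ in range(n_iterations):
        # Choose the atom with the largest absolute inner product with the residue.
        products = [dot(residue, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(products[k]))
        coeff = products[best]
        # Subtract its contribution; the residue enters the next iteration.
        residue = [r - coeff * a for r, a in zip(residue, dictionary[best])]
        selected.append((best, coeff))
    return selected, residue

# Toy example: a signal composed of two atoms from a small cosine dictionary.
n = 64
dictionary = [normalize([math.cos(2 * math.pi * f * t / n) for t in range(n)])
              for f in range(1, 9)]
signal = [3.0 * a + 1.5 * b for a, b in zip(dictionary[1], dictionary[4])]
selected, residue = matching_pursuit(signal, dictionary, n_iterations=2)
residual_energy = dot(residue, residue)
```

In this toy case the dictionary atoms are mutually orthogonal, so two iterations recover both components and the residual energy is practically zero. With a redundant dictionary such as Gabor atoms, the residue shrinks gradually instead.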
The MP algorithm is most often associated with
the Gabor atoms dictionary. Gabor atoms are
defined as the Gaussian window:
$$g(t) = e^{-\pi t^{2}} \tag{1}$$
modulated using the cosine function as follows:
$$g_{s,u,v,w}(t) = g\left(\frac{t-u}{s}\right) \cdot \cos(vt + w) \tag{2}$$
Each atom is uniquely defined by the ordered
quadruple (s, u, v, w), where s is the scale, u is
the shift, v is the frequency, and w is the phase
shift. In our experiment, we used the MP algorithm
implemented according to Ferrando et al. (2002).
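A sampled Gabor atom can be generated from the quadruple (s, u, v, w) as in the sketch below. It assumes the common Gaussian window g(t) = exp(-pi t^2), time measured in samples, frequency v in radians per sample, and normalization to unit energy; these choices and the function names are illustrative assumptions, not taken from the implementation of Ferrando et al. (2002).

```python
import math

def gaussian_window(t):
    """Gaussian window g(t) = exp(-pi * t^2) (assumed form)."""
    return math.exp(-math.pi * t * t)

def gabor_atom(s, u, v, w, n_samples):
    """Sample a Gabor atom with scale s, shift u, frequency v and phase w,
    then normalize it to unit energy."""
    atom = [gaussian_window((t - u) / s) * math.cos(v * t + w)
            for t in range(n_samples)]
    norm = math.sqrt(sum(x * x for x in atom))
    return [x / norm for x in atom]

# An atom centred at sample 300; the phase is chosen so that the cosine
# peaks exactly at the centre of the Gaussian envelope.
u = 300.0
v = 2 * math.pi * 3 / 512
atom = gabor_atom(s=100.0, u=u, v=v, w=-v * u, n_samples=512)
peak_index = max(range(len(atom)), key=lambda t: abs(atom[t]))
```

The scale s controls the width of the envelope, so a large s with a low v gives the slow, broad waveform that resembles an ERP component.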
For visualization of Gabor atoms the Wigner-
Ville transform is often used – it shows time-
frequency energy density. A detailed description was
given by Durka (2007).
3.1 Using the MP Algorithm for ERP Detection
The idea published by Svoboda et al. (2008) is
composed of two basic steps:
1. Decomposition of the input EEG/ERP signal into
a few Gabor atoms.
2. Selection of the atom (from a set of Gabor
atoms) which corresponds to the detected ERP.
This means that the correlation between the atom
and the input EEG/ERP signal (the value of the
correlation is often called the modulus) must be
higher than a threshold.
There are two different examples of the P3
component detection in Figure 2. On the left half of
the figure, the favorable situation is shown - the P3
waveform is approximated by one Gabor atom only
and the value of the correlation between this atom
and the input signal is high enough to pass the
threshold. On the right half of the figure, the
unfavorable situation is shown: the P3 waveform is
only partially approximated by the first and the
third Gabor atom. Neither of these atoms correlates
with the EEG/ERP signal strongly enough to pass the
threshold, which leads to a false negative result
during detection.
HEALTHINF2013-InternationalConferenceonHealthInformatics
310
If we could select all atoms which partially
approximate the ERP waveform, calculate the vector
sum of these atoms and consider this vector sum as
a new atom, we would be able to detect the ERP
waveform successfully.
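The effect of summing partial atoms can be illustrated with a small sketch. It is not the detection code of Svoboda et al. (2008): the Pearson correlation stands in for the modulus, and the function names, the Gaussian-shaped test waveforms, and the threshold of 0.8 are assumptions chosen for illustration.

```python
import math

def correlation(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def detect(signal, atoms, threshold):
    """Index of the first atom whose correlation with the signal exceeds
    the threshold, or None if no atom passes (a negative detection)."""
    for k, atom in enumerate(atoms):
        if correlation(signal, atom) > threshold:
            return k
    return None

def bump(center, width, n=100):
    """Gaussian-shaped waveform used here as a stand-in for an atom."""
    return [math.exp(-((t - center) / width) ** 2) for t in range(n)]

# A broad waveform that MP has (hypothetically) split into two narrower atoms.
atom_a, atom_b = bump(40, 8), bump(60, 8)
signal = [a + b for a, b in zip(atom_a, atom_b)]

# Each atom alone is too weakly correlated with the signal.
separate = detect(signal, [atom_a, atom_b], threshold=0.8)

# The vector sum of both atoms, treated as a new atom, passes the threshold.
merged_atom = [a + b for a, b in zip(atom_a, atom_b)]
merged = detect(signal, [merged_atom], threshold=0.8)
```

Here `separate` is `None` while `merged` is `0`, which mirrors the false negative on the right half of Figure 2 and its removal once the partial atoms are combined.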
The solution is to use an algorithm which
categorizes Gabor atoms into groups in such a way
that atoms in each group are similar to each other.
Once we have these groups, we can manually mark
the groups which contain atoms which can
approximate (or partially approximate) ERP
waveforms.
Figure 2: The favorable decomposition is shown on the
left half and the unfavorable decomposition is shown on
the right. In order from top to bottom: the input signal with
the P3 waveform; the first, second, and third Gabor atom;
visualization of Gabor atoms by the Wigner-Ville
transform.
4 WAVEFORMS CATEGORIZATION
We decided to address the issue shown in Figure 2
with a self-organizing map (SOM) for two reasons.
First, according to Wann and Thomopoulos (1997),
the SOM is a suitable neural network for data
clustering. Second, it has an unsupervised learning
algorithm. This method of learning is exactly what
we need, because unsupervised clustering methods
are applied when the classification of a given set
of sample patterns is unknown (Wann and
Thomopoulos, 1997).
4.1 SOM Topology
SOM topologies can be distinguished by the
dimension N of the space in which the neurons are
equidistantly placed; N is an integer from the
interval <1; ∞). We decided to choose a
two-dimensional organization of neurons, which is
the most common solution. In the following text, we
consider the SOM as a two-dimensional map of
neurons.
4.2 Neighborhood Radius during Learning
In the SOM, the weight vector of the winner and the
weight vectors of all neurons in its neighborhood
are modified during the learning process. In our
implementation, we used a square neighborhood whose
radius is defined as follows:


2
(3)
where α is the learning rate parameter, b is the
base of the exponential decay (the radius decreases
exponentially with each training pattern), and p is
the current learning progress ($p = \mathit{done} / \mathit{all}$,
where done is the number of training patterns which
were already used and all is the total number of
training patterns).
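A single learning step with a square neighborhood can be sketched as follows. The radius is passed in as a plain parameter, so the sketch does not depend on the radius formula. Selecting the winner by the smallest Euclidean distance and the update rule w + alpha (x - w) are standard SOM conventions assumed here, and the function names are hypothetical.

```python
def find_winner(weights, pattern):
    """Grid index (i, j) of the neuron whose weight vector is closest to
    the pattern (squared Euclidean distance, an assumed metric)."""
    best, best_dist = None, float("inf")
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            dist = sum((a - b) ** 2 for a, b in zip(w, pattern))
            if dist < best_dist:
                best, best_dist = (i, j), dist
    return best

def som_update(weights, pattern, alpha, radius):
    """One learning step: move the winner and every neuron inside its
    square neighborhood towards the pattern. Returns the winner index."""
    wi, wj = find_winner(weights, pattern)
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            # Square neighborhood: Chebyshev distance on the neuron grid.
            if max(abs(i - wi), abs(j - wj)) <= radius:
                row[j] = [a + alpha * (b - a) for a, b in zip(w, pattern)]
    return (wi, wj)

# 3x3 map with two-dimensional weights; the centre neuron starts closest.
weights = [[[0.0, 0.0] for _ in range(3)] for _ in range(3)]
weights[1][1] = [0.9, 0.9]

# Radius 0 updates the winner only; radius 1 also updates all eight neighbors.
winner = som_update(weights, pattern=[1.0, 1.0], alpha=0.7, radius=0)
winner_wide = som_update(weights, pattern=[1.0, 1.0], alpha=0.7, radius=1)
```

Because neighbors are pulled towards the same pattern as the winner, adjacent neurons end up with similar weight vectors, which is the property exploited in Section 4.3.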
4.3 Merging Neurons into Clusters
As a result of the learning algorithm, we get
a specific weight vector for each neuron. Ideally,
the set of weight vectors would be such that
exactly one neuron is marked as the winner for all
similar atoms. Unfortunately, this does not hold
for the SOM, because not only the weight vector of
the winner but also the weight vectors of all
neurons in its neighborhood are updated during the
learning process (see Section 4.2).
If we want to have all neurons with similar weight
vectors in one cluster, we need a method to
recognize these neurons and consider them as one
cluster. First, it is necessary to choose a metric
for weight similarity. We decided to use a
well-known measure of signal similarity:
correlation.
Equation (4) shows the computation of the
correlation between two signals x and y:
$$\operatorname{corr}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^{2} \sum_{i=1}^{n} (y_i - \bar{y})^{2}}} \tag{4}$$
where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ and $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$.
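Assuming Equation (4) is the usual Pearson correlation coefficient, it can be computed as in this short sketch (the function name is illustrative):

```python
import math

def correlation(x, y):
    """Pearson correlation of two equally long sequences x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    denominator = math.sqrt(sum((a - mean_x) ** 2 for a in x)
                            * sum((b - mean_y) ** 2 for b in y))
    return numerator / denominator

same_shape = correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
opposite = correlation([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

The result depends only on the shape of the two signals, not on their amplitude or offset. A constant signal has zero variance, so the denominator is zero and the function raises `ZeroDivisionError`; a real implementation would need to handle that case.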
4.3.1 Weight Vectors Similarity Visualization
For better understanding, consider a visualization
of the similarity between neuron weights in which
each neuron is shown as a 3x3 matrix. The element
at index [i-1, j-1] of this matrix holds the
correlation between the neuron at index [i, j] and
the neuron at index [i-1, j-1], and so on for the
remaining neighbors. Note that the element at index
[i, j] itself is always zero. For visualization,
the correlation values are rescaled from the
interval of real values <-1, 1> to the interval of
integer values <0, 255> (the gray scale).
Figure 3: Mask for visualization of weights similarity
between neurons.
According to the description given above,
visualization of weights similarity of neurons looks
as shown in Figure 4.
Figure 4: Neuron weights similarity in a two-dimensional
map with 100 neurons.
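The visualization described above can be sketched as follows. The linear rescaling with rounding, the value 0 for cells outside the map, and the function names are assumptions; the text does not specify how map borders are handled.

```python
import math

def correlation(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def to_gray(c):
    """Map a correlation from <-1, 1> to an integer gray level in <0, 255>."""
    return int(round((c + 1.0) / 2.0 * 255))

def similarity_image(weights):
    """Gray-scale image in which every neuron occupies a 3x3 block.

    The cell at offset (di, dj) inside the block of neuron [i, j] shows the
    correlation with the neighbor [i + di, j + dj]. The centre cell and
    cells pointing outside the map stay 0.
    """
    rows, cols = len(weights), len(weights[0])
    image = [[0] * (3 * cols) for _ in range(3 * rows)]
    for i in range(rows):
        for j in range(cols):
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    if (di, dj) == (0, 0) or not (0 <= ni < rows and 0 <= nj < cols):
                        continue
                    c = correlation(weights[i][j], weights[ni][nj])
                    image[3 * i + 1 + di][3 * j + 1 + dj] = to_gray(c)
    return image

# A 1x2 map whose two neurons have perfectly correlated weight vectors.
weights = [[[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]]
image = similarity_image(weights)
```

Bright cells facing each other across a block boundary indicate neighboring neurons that belong together, which is what makes the clusters visible to the eye.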
It is easy to see clusters in Figure 4, but we need
to implement an algorithm which is able to find
these clusters as well. As a solution, we used an
algorithm which is well-known in computer vision –
connected-component labeling.
4.3.2 Connected-Component Labeling
Connected-component labeling (CCL) is a two-pass
algorithm. It uses a map of neurons as an input. In
the first pass, CCL iterates through each neuron by
row. The neighboring neurons are given by a mask:
Figure 5: Mask which defines neighboring neurons during
the first pass of the CCL algorithm.
According to the correlation between the weight
vector of the neuron [i, j] and the weight vectors
of its neighboring neurons, three situations can
occur:
1. If the correlation with every neighboring neuron
is too low for it to share a cluster with the
neuron at index [i, j], a new cluster number is
assigned to the neuron at index [i, j].
2. If the correlation with exactly one neighboring
neuron is high enough, the neuron at index [i, j]
is put into the cluster of that neighbor.
3. If the correlation with more than one
neighboring neuron is high enough, one of these
neighbors is selected randomly and the neuron at
index [i, j] is put into its cluster. If the
neighboring neurons with a high enough correlation
belong to different clusters, the equivalence of
these clusters is recorded in a special data
structure.
In the second pass, CCL again iterates through the
neurons row by row and replaces equivalent cluster
numbers with a single number per cluster, using the
data structure from the first pass. After the
second pass, each cluster is labeled with exactly
one cluster number (even if the cluster consists of
one neuron only).
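The two passes can be sketched as follows. Several details are assumptions: the first-pass mask is taken to be the four already visited neighbors (west, north-west, north, north-east), the equivalence structure is a union-find table, and the smallest neighboring label is used instead of a random one so that the result is reproducible. The final clusters are the same either way, because all similar neighbors are recorded as equivalent.

```python
import math

def correlation(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def find(parent, label):
    """Representative label of an equivalence class (with path halving)."""
    while parent[label] != label:
        parent[label] = parent[parent[label]]
        label = parent[label]
    return label

def label_clusters(weights, threshold):
    """Two-pass connected-component labeling of a map of weight vectors."""
    rows, cols = len(weights), len(weights[0])
    labels = [[0] * cols for _ in range(rows)]
    parent = {}          # equivalence table filled during the first pass
    next_label = 1
    # First pass: assign provisional labels and record equivalences.
    for i in range(rows):
        for j in range(cols):
            similar = []
            for di, dj in ((0, -1), (-1, -1), (-1, 0), (-1, 1)):
                ni, nj = i + di, j + dj
                if (0 <= ni < rows and 0 <= nj < cols and
                        correlation(weights[i][j], weights[ni][nj]) >= threshold):
                    similar.append(labels[ni][nj])
            if not similar:
                labels[i][j] = next_label
                parent[next_label] = next_label
                next_label += 1
            else:
                labels[i][j] = min(similar)
                for other in similar:
                    ra, rb = find(parent, labels[i][j]), find(parent, other)
                    parent[max(ra, rb)] = min(ra, rb)
    # Second pass: replace every label by its representative.
    for i in range(rows):
        for j in range(cols):
            labels[i][j] = find(parent, labels[i][j])
    return labels

# U-shaped cluster of "A" neurons around a single dissimilar "B" neuron.
A, B = [1.0, 2.0, 3.0], [3.0, 2.0, 1.0]
weights = [[A, B, A],
           [A, A, A]]
labels = label_clusters(weights, threshold=0.8)
```

In the example, the top-right neuron first receives its own provisional label, because its only visited neighbor is the dissimilar neuron B. The equivalence recorded in the second row merges it with the rest of the A cluster during the second pass.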
4.4 Suitable Feature Vector
Selection of a suitable feature vector is a critical
decision for successful clustering. There are no
exact or universal rules for identifying an optimal
feature vector (see Lotte et al. (2007), Pradhan et
al. (1996), and Gotman and Wang (1991) for examples
of feature vectors used in the EEG domain).
We decided to use the following feature vectors
based on Gabor atoms (the result of the MP
algorithm):
1. Functional values of the Gabor atom (full
length).
2. Functional values of the Gabor atom subsampled
to 32 samples.
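The second feature vector can be obtained, for example, by plain decimation. The text does not state how the subsampling was performed, so keeping every 16th sample without any low-pass filtering is an assumption, as is the function name.

```python
def subsample(atom, n_out=32):
    """Reduce an atom to n_out samples by keeping every k-th sample,
    where k = len(atom) // n_out (plain decimation, no filtering)."""
    step = len(atom) // n_out
    return atom[::step][:n_out]

full_length = [float(t) for t in range(512)]   # placeholder for a sampled atom
feature_vector = subsample(full_length)
```

Since ERP-like atoms are smooth, low-frequency waveforms, a coarse sampling of this kind keeps most of their shape while shortening the SOM weight vectors from 512 to 32 elements.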
HEALTHINF2013-InternationalConferenceonHealthInformatics
312
5 EXPERIMENT METHODOLOGY
5.1 Test Data
Our test data were obtained during an experiment
based on the oddball paradigm (originally designed
by Squires et al. (1975)), where the white O
character on a black background shown on the full
screen was the non-target stimulus and the white Q
character on a black background shown on the full
screen was the target stimulus. Brain activity at
the Fz, Cz, and Pz electrodes (see the 10-20 system
in Niedermeyer and Lopes da Silva (2004)) was
sampled at 1 kHz (this frequency is sufficient
according to the Nyquist-Shannon sampling theorem).
Frequencies higher than 45 Hz were cut off
automatically. Then the electroencephalogram was
split into epochs. Each epoch starts at the
stimulus onset and lasts 512 ms, as required by the
MP implementation used, which corresponds to 512
samples per epoch at the given sampling frequency.
Finally, the baseline was corrected and the epochs
were divided into target and non-target ones.
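The epoch extraction can be sketched as follows. The baseline correction shown here, subtracting the mean of the 100 samples preceding the stimulus onset, is an assumption, since the text does not describe how the baseline was computed. The function name and the event format are also hypothetical.

```python
def extract_epochs(eeg, events, epoch_len=512, baseline_len=100):
    """Cut one epoch per stimulus and split the epochs by stimulus type.

    eeg      -- samples of one channel (1 kHz, so 1 sample = 1 ms)
    events   -- list of (onset_sample, is_target) pairs
    Returns (target_epochs, non_target_epochs).
    """
    target, non_target = [], []
    for onset, is_target in events:
        if onset - baseline_len < 0 or onset + epoch_len > len(eeg):
            continue  # skip epochs that do not fit into the recording
        pre_stimulus = eeg[onset - baseline_len:onset]
        baseline = sum(pre_stimulus) / len(pre_stimulus)
        epoch = [value - baseline for value in eeg[onset:onset + epoch_len]]
        (target if is_target else non_target).append(epoch)
    return target, non_target

# Synthetic channel: constant offset of 5 with a short deflection of height 1.
eeg = [5.0 + (1.0 if 300 <= t < 400 else 0.0) for t in range(2000)]
events = [(200, True), (900, False), (1800, True)]   # the last epoch is too long
target_epochs, non_target_epochs = extract_epochs(eeg, events)
```

Only the target epochs would then be passed to the MP decomposition.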
We decomposed all target epochs to five Gabor
atoms using the MP algorithm. According to Rondik
and Ciniburk (2011), five atoms are sufficient for
our purposes. The whole test set was composed of
359 Gabor atoms.
5.2 Initial Setup of SOM
We tested all feature extraction methods based on
Gabor atoms described in Section 4.4. We used the
following initial SOM setting:
1. A two-dimensional neural network with 100
neurons.
2. A learning rate equal to 0.7 (see Section 4.2).
3. Initial values of the weight vectors chosen
randomly from the interval of real values <0, 1).
5.3 SOM Learning
The neural network was trained with all feature
vectors. At the end of the learning procedure, we
obtained a neuron index for each feature vector.
Because we know which feature vector is related to
which Gabor atom, we can assign each Gabor atom
to a specific neuron.
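The assignment of atoms to neurons can be sketched as a lookup of the winning neuron for every feature vector. The flat neuron indexing, the Euclidean winner selection, and the function names are assumptions made for this example.

```python
def winner_index(weights, vector):
    """Index of the neuron with the smallest squared Euclidean distance."""
    distances = [sum((a - b) ** 2 for a, b in zip(w, vector)) for w in weights]
    return distances.index(min(distances))

def atoms_per_neuron(weights, feature_vectors):
    """Map each neuron index to the list of atom indices it wins."""
    assignment = {}
    for atom_id, vector in enumerate(feature_vectors):
        assignment.setdefault(winner_index(weights, vector), []).append(atom_id)
    return assignment

# Three trained neurons and four feature vectors (toy two-dimensional data).
weights = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
feature_vectors = [[0.1, 0.0], [0.9, 1.0], [0.2, 0.1], [0.1, 0.8]]
assignment = atoms_per_neuron(weights, feature_vectors)
```

Once the neurons are merged into clusters, this mapping gives the set of Gabor atoms belonging to each cluster.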
5.4 Clusters Definition
At this point, we assign neurons to clusters using the
CCL algorithm (the clustering threshold was set to
0.8). These clusters are inputs for the clustering
quality evaluation method. We need to evaluate
quality of clusters to be sure that clustering via SOM
and CCL works well.
5.5 Clustering Quality Evaluation
For evaluation of clustering quality we need a
measure which can compute similarity between
Gabor atoms in each cluster. For measuring the
similarity of all signals in one cluster we used:

1
1

,
;


(5)
where $C_i$ is the i-th cluster and $x_i$ is the set of all
signals in the cluster $C_i$.
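One plausible way to measure the similarity of all signals in a cluster is the mean correlation over all pairs of its signals. The sketch below implements this reading. It is an assumption and not necessarily identical to Equation (5); the function names are illustrative.

```python
import math
from itertools import combinations

def correlation(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def cluster_similarity(signals):
    """Mean correlation over all unordered pairs of signals in a cluster;
    None for a cluster with fewer than two signals."""
    pairs = list(combinations(signals, 2))
    if not pairs:
        return None
    return sum(correlation(x, y) for x, y in pairs) / len(pairs)

# Two signals of the same shape and one inverted signal.
cluster = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]
similarity = cluster_similarity(cluster)
```

A value close to 1 means that the cluster contains atoms of nearly identical shape, while a single inverted or unrelated atom pulls the average down noticeably, as in the example.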
6 RESULTS
Strictly speaking, we are interested only in the
clustering quality of clusters which contain
waveforms that approximate (or partially
approximate) P3 waveforms. The clustering quality
of other waveforms, which do not describe the P3
waveform, is not important with respect to the MP
detection issue.
The average similarities in Table 1 show that there
is no significant difference between the clustering
quality of the P3 waveform clusters and that of all
clusters.
Table 1: Comparison of cluster quality with respect to
the feature vector used.
Feature vector | Average similarity per cluster (all clusters) | Number of averaged clusters | Average similarity per cluster (ERP clusters only)
512 samples | 0.5804 | 6 | 0.4851
32 samples | 0.5694 | 6 | 0.5567
In Figure 6 the clusters which are related to
Gabor atoms which approximate (or partially
approximate) P3 waveform are highlighted (the
highlighting was done manually). Other clusters
belong to waveforms which are not related to P3
waveforms.
7 CONCLUSIONS
Let us note that this is the first experimental
result. Taking this into account, the results
cannot be considered statistically significant.
However, they indicate that our approach to the ERP
detection problem which affects the method
described in Svoboda et al. (2008) may be right.
Figure 6: Neuron weights similarity in a two-dimensional
map with 100 neurons with manually highlighted clusters
which are related to Gabor atoms which approximate ERP
P3 waveform.
According to the results given in Table 1, it does
not matter which of the two feature vectors
presented in this paper is used. The only
difficulty of the described method is that the
clusters which approximate (or partially
approximate) ERP waveforms must be marked manually
by an expert.
In the future, we will use the proposed method in
the ERP detection algorithm based on MP to show
that it can improve the reliability of ERP
detection at a statistically significant level.
ACKNOWLEDGEMENTS
The work was supported by the UWB grant SGS-
2010-038 Methods and Applications of Bio- and
Medical Informatics and by the European Regional
Development Fund (ERDF), Project "NTIS - New
Technologies for Information Society", European
Centre of Excellence, CZ.1.05/1.1.00/02.0090.
REFERENCES
Luck, S. J., (2005). An Introduction to the Event-Related
Potential Technique. Cambridge: The MIT Press.
Rondik, T. and Ciniburk, J., (2011). Comparison of
Various Approaches for P3 Component Detection
Using Basic Methods for Signal Processing. 4th
International Conference on Biomedical Engineering
and Informatics (BMEI), 700 - 704, New York: IEEE.
Picton, T. W., (1992). The p300 wave of the human event-
related potential. Journal of Clinical Neurophysiology,
9, 456-479.
Svoboda, J., Mautner, P. and Moucek, R., (2008).
Detection of ERP using Matching Pursuit Algorithm.
Xth International conference on cognitive
neuroscience, 226, Istanbul.
Ferrando, S. E., Kolasa, L. A. and Kovacevic, N., (2002).
Algorithm 820: A Flexible Implementation of
Matching Pursuit for Gabor Functions on the Interval.
ACM Transactions on Mathematical Software, 28(3),
337-353.
Wann, C. D. and Thomopoulos, S. C. A., (1997). A
Comparative Study of Self-organizing Clustering
Algorithms Dignet and ART2. Neural Networks,
10(4), 737–753.
Lotte, F., Congedo, M., Lecuyer, A., Lamarche, F. and
Arnaldi, B., (2007). A Review of Classification
Algorithms for EEG-based Brain-Computer Interfaces.
Journal of Neural Engineering, 4.
Pradhan, N., Sadasivan, P. K. and Arunodaya, G. R.,
(1996). Detection of seizure activity in EEG by an
artificial neural network: A preliminary study.
Computers and Biomedical Research, 29, 303–313.
Gotman, J. and Wang, L., (1991). State-dependent spike
detection: Concepts and preliminary results.
Electroencephalography and Clinical
Neurophysiology, 79(1), 11–19.
Squires, N. K., Squires, K. C. and Hillyard, S. A., (1975).
Two varieties of long-latency positive waves evoked
by unpredictable auditory stimuli in man.
Electroencephalography and Clinical
Neurophysiology, 38(4), 387-401.
Niedermeyer, E. and Lopes da Silva, F., (2004).
Electroencephalography: Basic Principles, Clinical
Applications, and Related Fields. Philadelphia:
Lippincott Williams & Wilkins.
Durka, P. J., (2007). Matching pursuit. Retrieved April 22,
2010, from: http://www.scholarpedia.org/article/Matching_pursuit.
HEALTHINF2013-InternationalConferenceonHealthInformatics
314