Clustering of Gabor Atoms Describing Event-Related Potentials

Solution for ERP Detection Algorithm based on Matching Pursuit when ERP

Waveform is Approximated by Two or More Gabor Atoms

Tomas Rondik and Pavel Mautner

Department of Computer Science and Engineering, University of West Bohemia, Univerzitni 8, Pilsen, Czech Republic

Keywords: SOM, Self-organizing Map, Kohonen Neural Network, ANN, Artificial Neural Network, Clustering, MP,

Matching Pursuit Algorithm, Gabor Atoms, CCL, Connected-Component Labeling, ERP, Event-Related

Potential, P3 Component, EEG, Electroencephalography.

Abstract: Our research group works, among other topics, on methods for automatic detection of event-related potentials in the EEG signal. In one of our previous papers we published an algorithm for event-related potential detection based on the matching pursuit algorithm. This method does not work well when the waveform of an event-related potential is approximated by more than one function from the matching pursuit dictionary of base functions. This paper introduces a solution to this issue based on the self-organizing map and the connected-component labeling algorithm. It groups the functions related to one kind of event-related potential into a cluster, which should prevent the detection algorithm based on matching pursuit from failing in the way described above.

1 INTRODUCTION

Event-related potentials (ERPs) are a well-known phenomenon in the EEG domain. ERPs are waveforms with specific frequencies, latencies, and polarities (note that there is a special family of ERPs which has alternating polarity) that are independent of the ongoing EEG activity. Unfortunately, the amplitude of the EEG signal is about ten times greater than the amplitude of the strongest ERP waveform. If we regard the EEG as noise (together with interference from the surrounding electromagnetic field and signals of non-cerebral origin such as eye movement, muscle activity, and EKG activity), then the signal-to-noise ratio is very low.

This is the main reason why it is not trivial for

neuroscientists to recognize typical ERP waveforms

which appear in the electroencephalogram. Of

course, it is even more complicated for automatic

detection algorithms.

There are a lot of methods which can be used for

ERPs detection. One of these methods is the

matching pursuit (MP) algorithm that decomposes

the EEG/ERP signal to atoms – functions chosen

from a dictionary of base functions. Svoboda,

Mautner and Moucek (2008) demonstrated its ability

to detect the P3 waveform. When they detect an

ERP component in the EEG/ERP signal, they look

for an atom which looks like the ERP waveform.

However, there is a problem which can cause the detection to fail. This situation occurs when the ERP waveform is decomposed into more than one atom. In this case, none of these atoms approximates the ERP waveform well enough.

In this paper, we introduce an approach that avoids this detection failure. The approach is based on identification of all atoms describing the ERP waveform.

The paper is organized as follows: Section 2 is a brief introduction to the ERP domain. It explains what ERPs are and which ERP is used in this paper. Section 3 introduces the MP algorithm and the principle of the ERP detection method introduced by Svoboda et al. (2008), as well as the problem with decomposition of the ERP waveform into multiple atoms. The method for identification of atoms which approximate the ERP waveform is described in Section 4. The methodology for evaluating its ability to identify ERP waveforms is described in Section 5. The last two sections contain a summary of results and the conclusion.

Rondik T. and Mautner P.
Clustering of Gabor Atoms Describing Event-Related Potentials - Solution for ERP Detection Algorithm based on Matching Pursuit when ERP Waveform is Approximated by Two or More Gabor Atoms.
DOI: 10.5220/0004189203090314
In Proceedings of the International Conference on Health Informatics (HEALTHINF-2013), pages 309-314
ISBN: 978-989-8565-37-2
© 2013 SCITEPRESS (Science and Technology Publications, Lda.)

2 EVENT-RELATED POTENTIALS

An event-related potential is any measurable brain

response that is a direct result of a specific sensory,

cognitive, or motor event. More formally, it is any

stereotyped electrophysiological response to a

stimulus (Luck, 2005). This brain response is

characterized by its amplitude, frequency, and

latency (time between stimulus and

maximum/minimum value of the waveform).

We decided to use the P3 component (a visual event-related potential, the third positive wave; see Figure 1) in our experiment. The P3 waveform occurs only

if the subject is actively engaged in the task of

detecting the targets. Its amplitude varies with the

improbability of the targets. Its latency varies with

the difficulty of discriminating the target stimulus

from the standard stimuli (Picton, 1992).

Figure 1: Typical amplitudes, frequencies, latencies, and

waveforms of some ERPs (Luck, 2005). In this figure the

axis of functional values grows downwards.

The P1, N1, P2, and N2 waveforms are related to

sensory perception of stimuli and they are not

important for this paper. A detailed description of

ERPs was given by Luck (2005).

3 MATCHING PURSUIT ALGORITHM

The MP algorithm decomposes any signal into atoms, which are selected from a dictionary. In each iteration, the atom that approximates the input signal most closely is chosen. This atom is subtracted from the input signal and the residue enters the next iteration of the algorithm. The sum of the atoms selected successively in the iterations is an approximation of the original signal: the more iterations we do, the more accurate the approximation we get (Rondik and Ciniburk, 2011). The difference between the input signal and its approximation converges to zero with an increasing number of MP iterations.
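The iteration described above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in the experiment: the function names, the representation of the dictionary as a matrix whose rows are unit-norm atoms, and the use of the absolute inner product as the selection criterion are all assumptions made for the example.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_iterations):
    """Greedy MP; `dictionary` is assumed to hold one unit-norm atom per row."""
    residue = np.asarray(signal, dtype=float).copy()
    selected = []  # (atom index, coefficient) for each iteration
    for _ in range(n_iterations):
        # Inner product of the current residue with every atom.
        products = dictionary @ residue
        best = int(np.argmax(np.abs(products)))
        coefficient = float(products[best])
        # Subtract the chosen atom; the residue enters the next iteration.
        residue -= coefficient * dictionary[best]
        selected.append((best, coefficient))
    return selected, residue

def reconstruct(selected, dictionary):
    """Sum of the selected atoms, i.e. the approximation of the signal."""
    return sum(c * dictionary[i] for i, c in selected)
```

By construction, the returned approximation plus the residue equals the input signal after any number of iterations.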

The MP algorithm is most often associated with the Gabor atoms dictionary. Gabor atoms are defined as the Gaussian window:

$$g(t) = e^{-\pi t^{2}} \tag{1}$$

modulated using the cosine function as follows:

$$g_{s,u,v,w}(t) = g\!\left(\frac{t-u}{s}\right) \cdot \cos(vt + w) \tag{2}$$

Each atom is uniquely defined by the ordered quadruple (s, u, v, w), where s is the scale, u is the shift, v is the frequency, and w is the phase shift. In our experiment, we used the MP algorithm implemented according to Ferrando et al. (2002).
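A sampled Gabor atom for a given quadruple (s, u, v, w) might be generated as follows. This is a sketch under stated assumptions: the Gaussian window is taken as exp(-pi t^2), time is measured in sample indices, and the atom is normalized to unit energy. The function name `gabor_atom` is hypothetical and not part of the implementation by Ferrando et al. (2002).

```python
import numpy as np

def gabor_atom(n_samples, s, u, v, w):
    """Sampled Gabor atom with scale s, shift u, frequency v, phase shift w."""
    t = np.arange(n_samples, dtype=float)
    # Gaussian window, scaled by s and shifted by u (assumed form exp(-pi t^2)).
    window = np.exp(-np.pi * ((t - u) / s) ** 2)
    # Modulation by the cosine function.
    atom = window * np.cos(v * t + w)
    # Normalization to unit energy is an assumption of this sketch.
    norm = np.linalg.norm(atom)
    return atom / norm if norm > 0 else atom
```

For example, `gabor_atom(512, 64.0, 300.0, 0.05, 0.0)` yields a 512-sample atom centered at sample 300.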

For visualization of Gabor atoms the Wigner-

Ville transform is often used – it shows time-

frequency energy density. A detailed description was

given by Durka (2007).

3.1 Using the MP Algorithm for ERP Detection

The idea published by Svoboda et al. (2008) is

composed of two basic steps:

1. Decomposition of the input EEG/ERP signal into

a few Gabor atoms.

2. Selection of the atom (from a set of Gabor

atoms) which corresponds to the detected ERP.

This means that the correlation between the atom and the input EEG/ERP signal (this value is often called the modulus) must be higher than a threshold.

There are two different examples of the P3

component detection in Figure 2. On the left half of

the figure, the favorable situation is shown - the P3

waveform is approximated by one Gabor atom only

and the value of the correlation between this atom

and the input signal is high enough to pass the

threshold. On the right half of the figure, the

unfavorable situation is shown. The P3 waveform is

partially approximated by the first and third Gabor

atom. The value of the correlation between the

EEG/ERP signal and both the first and third Gabor

atom is not high enough to pass the threshold. It

leads to a false negative result during detection.

HEALTHINF2013-InternationalConferenceonHealthInformatics

310

If we could select all atoms which partially

approximate the ERP waveform, calculate the vector

sum of these atoms and consider this vector sum as

a new atom, we would be able to detect the ERP

waveform successfully.
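The effect of summing the atoms can be illustrated on synthetic data. In the sketch below, the two Gaussian bumps, the threshold of 0.9, and the function names are hypothetical values chosen for the example; they are not taken from the experiment. The correlation is computed with NumPy's `corrcoef`.

```python
import numpy as np

def merge_atoms(atoms):
    """Sample-wise (vector) sum of the atoms that share one ERP waveform."""
    return np.sum(np.asarray(atoms, dtype=float), axis=0)

def passes_threshold(signal, atom, threshold):
    """True if the correlation between signal and atom exceeds the threshold."""
    return bool(np.corrcoef(signal, atom)[0, 1] > threshold)

# Synthetic illustration: one broad wave that is split into two narrower bumps.
t = np.arange(512, dtype=float)
first = np.exp(-((t - 250.0) / 40.0) ** 2)
third = np.exp(-((t - 330.0) / 40.0) ** 2)
signal = first + third

threshold = 0.9  # hypothetical value, not taken from the paper
alone = passes_threshold(signal, first, threshold)
merged = passes_threshold(signal, merge_atoms([first, third]), threshold)
```

In this constructed case a single bump does not pass the threshold, while the sum of both bumps does.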

The solution is to use an algorithm which

categorizes Gabor atoms into groups in such a way

that atoms in each group are similar to each other.

Once we have these groups, we can manually mark

the groups which contain atoms which can

approximate (or partially approximate) ERP

waveforms.

Figure 2: The favorable decomposition is shown on the

left half and the unfavorable decomposition is shown on

the right. In order from top to bottom: the input signal with

the P3 waveform; the first, second, and third Gabor atom;

visualization of Gabor atoms by the Wigner-Ville

transform.

4 WAVEFORMS CATEGORIZATION

We decided to solve the issue shown in Figure 2 with a self-organizing map (SOM), because:
- According to Wann and Thomopoulos (1997), the SOM is a suitable neural network for data clustering.
- It has an unsupervised learning algorithm. This method of learning is exactly what we need, because unsupervised clustering methods are applied when the classification of a given set of sample patterns is unknown (Wann and Thomopoulos, 1997).

4.1 SOM Topology

SOM topologies are distinguished by the dimension N of the space in which the neurons are equidistantly placed, where N is an integer from the interval <1; ∞). We chose a two-dimensional organization of neurons, which is the most common solution. In the following text, we consider the SOM to be a two-dimensional map of neurons.

4.2 Neighborhood Radius during Learning

In the SOM, the weight vector of the winner and the weight vectors of all neurons in its neighborhood are modified during the learning process. In our implementation, we used a square neighborhood whose radius r is defined as follows:

$$r = \alpha \cdot b^{-p} \tag{3}$$

where α is a learning rate parameter, b is the base of the exponential decay (the radius decreases exponentially with each training pattern), and p is the current learning progress ($p = \frac{done}{all}$, where done is the number of training patterns which were already used and all is the number of all training patterns).

4.3 Merging Neurons into Clusters

As a result of the learning algorithm we get a specific weight vector for each neuron. It would be useful to have a set of weight vectors such that only one neuron is marked as the winner for all similar atoms. Unfortunately, this is not the case for the SOM, because the weight vector of the winning neuron and also the weight vectors of all neurons in its neighborhood are updated during the learning process (see Section 4.2).
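To make the neighborhood update concrete, one learning step of a two-dimensional SOM with a square neighborhood might look as follows. This is only a sketch: the winner is assumed to be the neuron with the smallest Euclidean distance to the input, the update is assumed to be the standard Kohonen rule w += rate * (x - w), and the radius is passed in as a parameter rather than computed. None of these details are specified in this paper.

```python
import numpy as np

def best_matching_unit(weights, x):
    """Index (i, j) of the neuron whose weight vector is closest to x."""
    distances = np.linalg.norm(weights - x, axis=2)
    return np.unravel_index(np.argmin(distances), distances.shape)

def som_update(weights, x, learning_rate, radius):
    """One in-place learning step; `weights` has shape (rows, cols, dim)."""
    rows, cols, _ = weights.shape
    wi, wj = best_matching_unit(weights, x)
    r = int(radius)
    # Square neighborhood: every neuron within r rows and r columns of the winner.
    for i in range(max(0, wi - r), min(rows, wi + r + 1)):
        for j in range(max(0, wj - r), min(cols, wj + r + 1)):
            # Assumed standard Kohonen update rule.
            weights[i, j] += learning_rate * (x - weights[i, j])
    return int(wi), int(wj)
```

Because neighbors of the winner are pulled toward the same input, several adjacent neurons end up with similar weight vectors, which is the effect described above.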

If we want to have all neurons with similar weight vectors in one cluster, we need a method to recognize these neurons and consider them as one cluster. First, it is necessary to choose a metric for weight similarity. We decided to use a well-known method for measuring signal similarity: correlation.

Equation (4) shows computing of correlation between two signals x and y:

$$\operatorname{corr}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^{2} \, \sum_{i=1}^{n} (y_i - \bar{y})^{2}}} \tag{4}$$

where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ and $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$.
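The correlation of two equally long signals can be computed directly from the means and the centered sums. The sketch below assumes the usual Pearson form of the correlation coefficient; the function name and the convention of returning 0.0 for a constant signal are choices made for this example only.

```python
import numpy as np

def correlation(x, y):
    """Pearson correlation of two equally long signals x and y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the mean of x
    dy = y - y.mean()  # deviations from the mean of y
    denominator = np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))
    if denominator == 0.0:
        return 0.0  # convention of this sketch for constant signals
    return float(np.sum(dx * dy) / denominator)
```

For non-constant inputs the result should agree with `np.corrcoef(x, y)[0, 1]` up to rounding.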









4.3.1 Weight Vectors Similarity Visualization

For better understanding, consider a visualization of the similarity between neuron weights in which each neuron is shown as a 3x3 matrix. The cell at index [i-1, j-1] holds the value of the correlation between the neuron at index [i, j] and the neuron at index [i-1, j-1], and so on for the remaining neighbors. Note that the cell at index [i, j] is always zero. For visualization, the correlation values are mapped from the interval of real values <-1, 1> to the interval of integer values <0, 255> (the gray scale).
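A 3x3 tile of this kind might be computed as in the following sketch. The linear mapping from <-1, 1> to <0, 255>, the treatment of cells that fall outside the map (left at zero), and the function names are assumptions of the example. The correlation is computed with NumPy's `corrcoef`.

```python
import numpy as np

def to_gray(correlation_value):
    """Linearly map a correlation from <-1, 1> to an integer in <0, 255>."""
    c = float(np.clip(np.nan_to_num(correlation_value), -1.0, 1.0))
    return int(round((c + 1.0) / 2.0 * 255.0))

def neighbor_similarity_tile(weights, i, j):
    """3x3 gray-scale tile for neuron [i, j]; weights has shape (rows, cols, dim)."""
    rows, cols, _ = weights.shape
    tile = np.zeros((3, 3), dtype=np.uint8)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            ni, nj = i + di, j + dj
            inside = 0 <= ni < rows and 0 <= nj < cols
            if (di, dj) == (0, 0) or not inside:
                continue  # the center and off-map cells stay zero
            c = np.corrcoef(weights[i, j], weights[ni, nj])[0, 1]
            tile[di + 1, dj + 1] = to_gray(c)
    return tile
```

Placing the tiles of all neurons next to each other gives an image of the kind described in this section, in which bright regions indicate groups of neurons with similar weight vectors.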

Figure 3: Mask for visualization of weights similarity

between neurons.

According to the description given above,

visualization of weights similarity of neurons looks

as shown in Figure 4.

Figure 4: Neuron weights similarity in a two-dimensional

map with 100 neurons.

It is easy to see clusters in Figure 4, but we need

to implement an algorithm which is able to find

these clusters as well. As a solution, we used an

algorithm which is well-known in computer vision –

connected-component labeling.

4.3.2 Connected-Component Labeling

Connected-component labeling (CCL) is a two-pass

algorithm. It uses a map of neurons as an input. In

the first pass, CCL iterates through each neuron by

row. The neighboring neurons are given by a mask:

Figure 5: Mask which defines neighboring neurons during

the first pass of the CCL algorithm.

According to the correlation between the weight vector of the neuron [i, j] and the weight vectors of the neighboring neurons, three situations can occur:
1. If the correlation with all neighboring neurons is too low for them to be in the same cluster as the neuron at index [i, j], then a new cluster number is assigned to the neuron at index [i, j].
2. If the correlation with just one of the neighboring neurons is high enough for it to be in the same cluster as the neuron at index [i, j], then the neuron at index [i, j] is put into the same cluster as that neighbor.
3. If the correlation with more than one of the neighboring neurons is high enough for them to be in the same cluster as the neuron at index [i, j], then one of them is randomly selected and the neuron at index [i, j] is put into the same cluster. If the neighboring neurons with a high enough correlation value belong to different clusters, the fact that these clusters are equivalent is saved to a special data structure.

In the second pass, CCL iterates through each neuron by row and replaces equivalent cluster numbers with a single number per cluster, using the special data structure from the first pass. After the second pass, each cluster is labeled with just one cluster number (even if the cluster consists of one neuron only).
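The two passes can be sketched as follows. Several details are assumptions of this example and not taken from the paper: the mask is taken to be the four already visited neighbors (left, upper left, upper, upper right), the special data structure is a simple union-find dictionary, and the smallest neighboring label is chosen instead of a random one, which does not change the final clusters because equivalences are resolved in the second pass.

```python
import numpy as np

def find(parent, label):
    """Representative label of the equivalence class containing `label`."""
    while parent[label] != label:
        parent[label] = parent[parent[label]]  # path halving
        label = parent[label]
    return label

def label_neurons(weights, threshold):
    """Two-pass CCL on SOM weights of shape (rows, cols, dim)."""
    rows, cols, _ = weights.shape
    labels = np.zeros((rows, cols), dtype=int)
    parent = {}  # equivalence structure: label -> parent label
    next_label = 1
    mask = [(0, -1), (-1, -1), (-1, 0), (-1, 1)]  # assumed mask
    # First pass: assign provisional labels and record equivalences.
    for i in range(rows):
        for j in range(cols):
            similar = []
            for di, dj in mask:
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols:
                    c = np.corrcoef(weights[i, j], weights[ni, nj])[0, 1]
                    if c >= threshold:
                        similar.append(int(labels[ni, nj]))
            if not similar:
                labels[i, j] = next_label  # situation 1: new cluster
                parent[next_label] = next_label
                next_label += 1
            else:
                labels[i, j] = min(similar)  # situations 2 and 3
                roots = {find(parent, lab) for lab in similar}
                root = min(roots)
                for r in roots:
                    parent[r] = root  # mark the clusters as equivalent
    # Second pass: replace every label by its representative.
    for i in range(rows):
        for j in range(cols):
            labels[i, j] = find(parent, int(labels[i, j]))
    return labels
```

With a 10 x 10 map and a threshold of 0.8, `label_neurons(weights, 0.8)` would return a 10 x 10 array in which neurons of the same cluster share one number.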

4.4 Suitable Feature Vector

Selection of a suitable feature vector is a critical decision for successful clustering. There are no exact or universal rules to identify an optimal feature vector (see Lotte et al. (2007), Pradhan et al. (1996), and Gotman and Wang (1991) for examples of feature vectors used in the EEG domain).

We decided to use the following feature vectors based on Gabor atoms (the result of the MP algorithm):
- Functional values of the Gabor atom (full length).
- Functional values of the Gabor atom subsampled to 32 samples.
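The second feature vector might be obtained as in the sketch below. The paper does not state how the subsampling was done, so plain decimation (keeping every k-th sample without prior filtering) is an assumption of this example, as is the function name.

```python
import numpy as np

def subsample(atom, n_out=32):
    """Keep n_out equally spaced samples of the atom (plain decimation)."""
    atom = np.asarray(atom, dtype=float)
    if len(atom) < n_out:
        raise ValueError("atom is shorter than the requested feature vector")
    step = len(atom) // n_out  # 16 for a 512-sample atom
    return atom[::step][:n_out]
```

For a 512-sample atom this keeps samples 0, 16, 32, and so on up to 496.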

HEALTHINF2013-InternationalConferenceonHealthInformatics

312

5 EXPERIMENT METHODOLOGY

5.1 Test Data

Our test data were obtained during an experiment based on the oddball paradigm (originally designed by Squires et al. (1975)), where a white O character on a black background shown on the full screen was the non-target stimulus and a white Q character on a black background shown on the full screen was the target stimulus. Brain activity at the Fz, Cz, and Pz electrodes (see the 10-20 system in Niedermeyer and Lopes da Silva (2004)) was sampled at 1 kHz (this frequency is sufficient according to the Nyquist-Shannon sampling theorem). Frequencies higher than 45 Hz were cut off automatically. Then the electroencephalogram was split into epochs, the baseline was corrected, and the epochs were divided into target and non-target ones. Each epoch starts at stimulus onset and lasts 512 ms (because of the MP implementation used), which at the given sampling frequency means 512 samples per epoch.

We decomposed all target epochs into five Gabor atoms using the MP algorithm. According to Rondik and Ciniburk (2011), five atoms are sufficient for our purposes. The whole test set was composed of 359 Gabor atoms.

5.2 Initial Setup of SOM

We tested all feature extraction methods based on Gabor atoms described in Section 4.4. We used the following initial SOM setting:
- a two-dimensional neural network with 100 neurons
- a learning rate equal to 0.7 (see Section 4.2)
- initial values of weight vectors randomly chosen from the interval of real values <0, 1)

5.3 SOM Learning

The neural network was trained with all feature vectors. At the end of the learning procedure, we obtained a neuron index for each feature vector. Because we know which feature vector is related to which Gabor atom, we can assign each Gabor atom to a specific neuron.

5.4 Clusters Definition

At this point, we assign neurons to clusters using the

CCL algorithm (the clustering threshold was set to

0.8). These clusters are inputs for the clustering

quality evaluation method. We need to evaluate

quality of clusters to be sure that clustering via SOM

and CCL works well.

5.5 Clustering Quality Evaluation

For evaluation of clustering quality we need a measure which can compute the similarity between Gabor atoms in each cluster. For measuring the similarity of all signals in one cluster we used the measure given by Equation (5):

[Equation (5): similarity of the signals in cluster C_i; the formula is not recoverable from the extracted text]

where C_i is the i-th cluster and x is the set of all signals in the cluster C_i.
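As an illustration of what a per-cluster similarity measure can look like, the sketch below averages the correlation over all pairs of signals in a cluster. This is an assumed stand-in and should not be read as the paper's Equation (5); the function name and the value returned for a cluster with a single signal are likewise choices made for the example.

```python
from itertools import combinations

import numpy as np

def cluster_similarity(signals):
    """Mean pairwise correlation of the signals in one cluster (assumed measure)."""
    signals = [np.asarray(s, dtype=float) for s in signals]
    if len(signals) < 2:
        return 1.0  # convention of this sketch for single-signal clusters
    values = [np.corrcoef(a, b)[0, 1] for a, b in combinations(signals, 2)]
    return float(np.mean(values))
```

Averaging such a value over all clusters, or over the manually marked ERP clusters only, would give figures comparable in kind to an average similarity per cluster.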

6 RESULTS

In fact, we are interested only in the clustering quality of clusters which contain waveforms that approximate (or partially approximate) P3 waveforms. The clustering quality of other waveforms, which do not describe the P3 waveform, is not significant from the point of view of the MP detection issue.

Looking at the average similarities in Table 1, it can be seen that there is no significant difference between the clustering quality of P3 waveform clusters and that of all clusters.

Table 1: Comparison of cluster quality with respect to the feature vector used.

Feature vector   Average similarity per    Number of averaged   Average similarity per
                 cluster (all clusters)    clusters             cluster (ERP clusters only)
512 samples      0.5804                    6                    0.4851
32 samples       0.5694                    6                    0.5567

In Figure 6 the clusters which are related to

Gabor atoms which approximate (or partially

approximate) P3 waveform are highlighted (the

highlighting was done manually). Other clusters

belong to waveforms which are not related to P3

waveforms.

7 CONCLUSIONS

Let us note that this is a first experimental result, so it cannot be considered statistically significant. However, it shows that the way we decided to solve the ERP detection problem which affects the method described in Svoboda et al. (2008) may be right.

Figure 6: Neuron weights similarity in a two-dimensional

map with 100 neurons with manually highlighted clusters

which are related to Gabor atoms which approximate ERP

P3 waveform.

Looking at the results given in Table 1, it does not matter which of the two feature vectors presented in this paper is used. The only difficulty which affects the described method is that clusters which approximate (or partially approximate) ERP waveforms must be marked manually by an expert.

In the future, we will use the proposed method in the ERP detection algorithm based on MP to prove that it can improve the reliability of ERP detection on a statistically significant level.

ACKNOWLEDGEMENTS

The work was supported by the UWB grant SGS-

2010-038 Methods and Applications of Bio- and

Medical Informatics and by the European Regional

Development Fund (ERDF), Project "NTIS - New

Technologies for Information Society", European

Centre of Excellence, CZ.1.05/1.1.00/02.0090.

REFERENCES

Luck, S. J., (2005). An Introduction to the Event-Related

Potential Technique. Cambridge: The MIT Press.

Rondik, T. and Ciniburk, J., (2011). Comparison of

Various Approaches for P3 Component Detection

Using Basic Methods for Signal Processing. 4th

International Conference on Biomedical Engineering

and Informatics (BMEI), 700 - 704, New York: IEEE.

Picton, T. W., (1992). The p300 wave of the human event-

related potential. Journal of Clinical Neurophysiology,

9, 456-479.

Svoboda, J., Mautner, P. and Moucek, R., (2008).

Detection of ERP using Matching Pursuit Algorithm.

Xth International conference on cognitive

neuroscience, 226, Istanbul.

Ferrando, S. E., Kolasa, L. A. and Kovacevic, N., (2002). Algorithm 820: A Flexible Implementation of Matching Pursuit for Gabor Functions on the Interval. ACM Transactions on Mathematical Software, 28(3), 337-353.

Wann, C. D. and Thomopoulos, S. C. A., (1997). A

Comparative Study of Self-organizing Clustering

Algorithms Dignet and ART2. Neural Networks,

10(4), 737–753.

Lotte, F., Congedo, M., Lecuyer, A., Lamarche, F. and

Arnaldi, B., (2007). A Review of Classification

Algorithms for EEG-based Brain-Computer Interfaces.

Journal of Neural Engineering, 4.

Pradhan, N., Sadasivan, P. K. and Arunodaya, G. R.,

(1996). Detection of seizure activity in EEG by an

artificial neural network: A preliminary study.

Computers and Biomedical Research, 29, 303–313.

Gotman, J. and Wang, L., (1991). State-dependent spike

detection: Concepts and preliminary results.

Electroencephalography and Clinical

Neurophysiology, 79(1), 11–19.

Squires, N. K., Squires, K. C. and Hillyard, S. A., (1975). Two varieties of long-latency positive waves evoked by unpredictable auditory stimuli in man. Electroencephalography and Clinical Neurophysiology, 38(4), 387-401.

Niedermeyer, E. and Lopes da Silva, F., (2004).

Electroencephalography: Basic Principles, Clinical

Applications, and Related Fields. Philadelphia:

Lippincott Williams & Wilkins.

Durka, P. J., (2007). Matching pursuit. Retrieved April 22,

2010, from: http://www.scholarpedia.org/article/

Matching_pursuit.

HEALTHINF2013-InternationalConferenceonHealthInformatics

314