SENN: Self-evolving Neural Network to Recognize Motor Imagery Thought Patterns

Stuti Chug (https://orcid.org/0000-0002-9380-3923) and Vandana Agarwal (https://orcid.org/0000-0002-8942-5114)

Department of Computer Science and Information Systems, BITS Pilani, Pilani Campus, India

Keywords:

Brain Computer Interface, Clustering, Radial Basis Function Neural Network, Particle Swarm Optimization.

Abstract:

The EEG-based motor imagery task classiﬁcation has been a challenge for researchers due to the complex

nature of EEG data. Human thoughts are a complex combination of different body limb activations and it

is difﬁcult to capture only one thought at a time. The data belonging to different motor imagery thought

classes are also not separable linearly. In this paper, a novel technique for efﬁcient and improved motor

imagery task classiﬁcation is proposed. Two major issues in motor imagery task classiﬁcation of EEG data

are addressed - channel selection and radial basis function neural network centers. The channel selection

is posed as a combinatorial problem and an evolutionary nature-inspired algorithm PSOCS is proposed to

select the most informative and discriminative channels using the Particle Swarm Optimization algorithm.

The features are extracted using the selected channels and are subjected to classiﬁcation. In this paper, a self-

evolving radial basis function neural network (SENN) is proposed based on sub-clusters within each motor

imagery task class. The number, centers, and spread of hidden neurons are obtained by the k-means clustering

algorithm. The proposed algorithm is validated using the benchmark BCI Competition IV 2a and 2b datasets. The proposed technique outperforms some of the existing techniques and

classiﬁes the motor imagery tasks efﬁciently.

1 INTRODUCTION

Thought patterns are captured using a variety of sensors such as electroencephalogram (EEG), func-

tional magnetic resonance imaging (fMRI), magne-

toencephalography (MEG), and electrocorticography

(ECoG). These signals are non-stationary and are affected by multiple overlapping thoughts. A person with a neuromuscular disorder can be assisted by

a Brain-Computer Interface (BCI), in which the com-

putational model recognizes the thought of imagina-

tion of movement of the body limb and translates the

output to the control command such as movement of

the prosthetic limb and robotic arms. The signals are

captured using multiple receiving units, called chan-

nels of one or more types. Multiple units of these

sensors are placed in different regions of the skull to

capture the signals in the nearest portions of the brain.

It becomes very difﬁcult for a BCI to identify the best

channels. A channel selection algorithm aims to enhance the classification accuracy by reducing over-


ﬁtting issues and reducing computational complexity

while using EEG data. Channel selection is consid-

ered part of the feature extraction process.

The channels must be selected in the most appropriate way to discriminate between thoughts, resulting in the most accurate recognition of the motor imagery thought patterns. Therefore, chan-

nel selection is viewed as a combinatorial problem

that is solved using an optimization technique. Re-

searchers have shown an interest in exploring the po-

tential of Particle Swarm Optimization (PSO) in solv-

ing the channel selection problem. Binary quantum

behaved particle swarm optimization (QBPSO) us-

ing Common Spatial Pattern (CSP), Fractional Order

Darwinian Particle Swarm Optimization (FODPSO)

algorithm and Binary Particle Swarm Optimization

(BPSO) have been proposed in the literature (Zhang

and Wei, 2019; Sheoran and Saini, 2022; Kim et al.,

2012). Other works include the Sequential Floating

Forward Selection (SFFS) algorithm and Cohen’s d

effect size CSP (E-CSP) algorithm using z-score for

channel selection (Baig et al., 2020; Qiu et al., 2016;

Das and Suresh, 2015; Zhou et al., 2019).

Once the channels are selected, the most appro-


Figure 1: (a) Raw EEG signals; (b) after preprocessing and standardization. Raw, preprocessed, and standardized signal representation of Subject 5, class 2, BCI Competition IV 2a dataset.

priate features are also extracted and subjected to the

classiﬁcation model. The classiﬁcation model repre-

sents the decision hyperplane and requires the best parameters that define it. The selected features from each training data pair (x, y) form a d-dimensional feature vector x = (x_1, x_2, x_3, ..., x_d). The raw EEG

data from selected channels are subjected to stan-

dardization to overcome the non-stationary behav-

ior of the signals (Ang et al., 2012). The effect of

preprocessing using standardization is depicted for a

few channels in Figure 1. The preprocessed data is

then used for feature extraction using various meth-

ods such as Fourier transform, Discrete Wavelet trans-

forms (DWT), and Haar wavelet (Nicolas-Alonso and

Gomez-Gil, 2012).

Various classiﬁcation algorithms are used for mo-

tor imagery task classiﬁcation, such as linear clas-

siﬁer, nonlinear Bayesian classiﬁer, nearest neigh-

bor classiﬁer, support vector machines (SVM), ra-

dial basis function neural network (RBFNN), deep

neural network, and combination of classiﬁers (Baig

et al., 2020; Nicolas-Alonso and Gomez-Gil, 2012;

Davoudi et al., 2017; Agarwal, 2019; Alam et al.,

2021; Bhatti et al., 2019; Zhao et al., 2020). Common

spatial pattern (CSP) algorithm was also used in EEG

classiﬁcation (Zhang and Eskandarian, 2020; Miao

et al., 2017; Ang et al., 2012). A pre-processing ﬁlter

approach Subject Speciﬁc Multivariate EMD Filter

(SS-MEMDBF) has been proposed where the ﬁlters

based on MEMD reduce the non-stationarities caused

by inter and intra-subject differences, thus obtaining

enhanced EEG signals (Xie et al., 2016). For classiﬁ-

cation, Riemannian mean computation for all classes

was used by the authors (Ko et al., 2018).

Of the various classification models, such as k-nearest neighbor, support vector machines, and neural networks, neural networks have been used effectively in a variety of recognition tasks due to their capability to handle nonlinearity in the training data. The radial basis function neural network

(RBFNN) is said to have the best approximation abil-

ity and is simple in its architecture with only one hid-

den layer (Haykin, 2005). The EEG data is highly

complex and poses challenges in identifying the num-

ber of neurons in the hidden layer. Particle Swarm

Optimization (PSO) has been used in different recog-

nition tasks and is said to converge to the optimal so-

lution provided the algorithm parameters are carefully

chosen.

In this paper, the two main problems are ad-

dressed, channel selection and RBFNN design. A

population of particles is initially randomly generated

and moved in the search space using a guided heuris-

tic, where each particle not only remembers its own best solution found so far (known as the cognitive part) but also knows the best position in the search space found by any particle in the swarm (known as the social part). While many existing tech-

niques tend to display a greedy approach of optimiza-

tion leading to suboptimal solutions, PSO provides a

mechanism to explore the search space effectively and

to exploit the neighborhood. We are motivated by the

computational efﬁciency of particle swarm optimiza-

tion (PSO) and use it to address the issues appropri-

ately.

Our Contribution in this Paper: In this paper,

we propose a novel approach to classify the complex

EEG signals for four motor imagery classes. The ap-

proach uses PSO to select the most informative EEG

channels. The selected channels are used to extract

Haar features. The proposed classifier has a self-evolving hidden layer in which the number and centers of neurons are computed using the K-Means and PSO algorithms. The algorithm finds sub-clusters of similar samples within each class, and the centers are computed based on the number of natural clusters.

This paper is organized as follows: Section 2

presents the basic framework and the proposed algo-

rithm, section 3 describes the experimental part, sec-

tion 4 discusses the results and section 5 presents the


Figure 2: Flow diagram of proposed model.

conclusion and future work.

2 PROPOSED ALGORITHM

In this study, we propose a technique that uses the PSO algorithm to select the most informative channels from a large channel set. Classification is then performed by a self-evolving radial basis function neural network (SENN), built on sub-clusters within each motor imagery task class. The K-Means and PSO clustering algorithms provide the number, centers, and spread of the hidden neurons.

In the proposed algorithm, we first filter out artifacts from the raw EEG signals using a Butterworth band-pass filter with cutoff frequencies of 4-38 Hz and then extract appropriate features from the preprocessed data. Class-wise K-Means clustering is then applied to the selected features, providing a center and spread (sigma) value for each neuron. The RBFNN model is generated with each hidden layer neuron assigned its center and sigma value, and is finally used for classification based on the selected features.

Figure 2 represents the block diagram of the proposed

approach.

2.1 Preprocessing

EEG Brain signals are sensitive to noise, and remov-

ing artifacts from original signals is essential. Band-

pass ﬁltering and standardization are required before

feeding the raw data into our model (Ang et al., 2012).

As shown in Figure 1(a), the raw signal has high

distortion, but after ﬁltering out the artifacts using

the Butterworth ﬁlter, signals are relatively smooth in

[Figure 1(b)]. Following (Ang et al., 2012), the Butterworth filter is employed to isolate the sensorimotor rhythms. Our model uses a Butterworth filter with a pass band of 4-38 Hz because this band contains the most relevant information for motor imagery applications.
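For illustration, a minimal Python sketch of this band-pass step is given below, assuming SciPy is available; the filter order, the 250 Hz sampling rate, and the function name are assumptions made for this sketch rather than details taken from the paper.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass_filter(eeg, fs=250.0, low=4.0, high=38.0, order=4):
    """Zero-phase Butterworth band-pass filter applied channel-wise.

    eeg: array of shape (n_channels, n_samples). The sampling rate,
    filter order, and 4-38 Hz band are illustrative assumptions.
    """
    nyq = 0.5 * fs
    b, a = butter(order, [low / nyq, high / nyq], btype="bandpass")
    return filtfilt(b, a, eeg, axis=-1)
```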

2.2 Standardization

Exponential moving standardization is employed to

reduce non-stationary ﬂuctuations (Ang et al., 2012).

Electrode-wise standardization was applied to the band-pass-filtered data. The mathematical formulation is defined below:

$s'_t = \frac{s_t - \mu_t}{\sqrt{\sigma_t^2}}$    (1)

$\mu_t = (1 - \alpha)\, s_t + \alpha\, \mu_{t-1}$    (2)

$\sigma_t^2 = (1 - \alpha)(s_t - \mu_t)^2 + \alpha\, \sigma_{t-1}^2$    (3)

where s'_t and s_t are the standardized and input signals at time t, and µ_t and σ_t² denote the running mean and variance of each electrode in a trial. The parameter α is known as the decay factor. Standardization removes occasional drifts in the signals and prevents each trial's trend from being dominated by the past signal.
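A minimal sketch of eqns (1)-(3) for a single electrode follows, assuming NumPy; the decay factor value, the small epsilon added for numerical stability, and the initialization of the running statistics are illustrative assumptions.

```python
import numpy as np

def exp_moving_standardize(x, alpha=0.001, eps=1e-8):
    """Exponential moving standardization of one electrode's signal (eqns 1-3).

    x: 1-D array of band-pass-filtered samples; alpha is the decay factor
    (the value 0.001 is an assumption, not taken from the paper).
    """
    out = np.empty_like(x, dtype=float)
    mu, var = x[0], 0.0  # running statistics initialized from the first sample
    for t, s in enumerate(x):
        mu = (1.0 - alpha) * s + alpha * mu                 # eqn (2)
        var = (1.0 - alpha) * (s - mu) ** 2 + alpha * var   # eqn (3)
        out[t] = (s - mu) / np.sqrt(var + eps)              # eqn (1)
    return out
```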

2.3 Channel Selection based on Particle

Swarm Optimization

Particle swarm optimization (PSO) is an optimiza-

tion algorithm inspired by the flocking behavior of birds and animals. In PSO, the swarm particles search for food cooperatively, and each particle learns from its own experience and from the experience of other particles to update its search pattern. PSO is a popular and effective global search technique, and it is well suited to feature selection problems for the following reasons: simple feature encoding, global search capability, reasonable computational cost, few parameters, and easy implementation.

The PSO algorithm finds an optimal solution in a multi-dimensional search space. There are q = 25 channels in the BCI Competition IV 2a dataset, of which only a few are significant for our application. The problem of finding the best p channels out of the q channels is a combinatorial problem


Algorithm 1: PSOCS.
  Input: feature matrix
  Output: optimal channel set
  Initialize: P_i, velo_i, max_iteration, Gbest, Pbest, c_1, c_2
  Generate random particles p, each particle dimension in the range [1, 25]
  for i = 1 to number of particles do
      calculate fitness function fit_i
      update Pbest, Gbest
  for k = 1 to max_iteration do
      for each particle i do
          r_1 = random(0, 1)   // random number uniformly distributed between 0 and 1
          r_2 = random(0, 1)
          w = w_max - ((w_max - w_min) / I_max) * k
          for each dimension j of particle i do
              velocity_part = w * velo_ij^k
              cognitive_part = c_1 * r_1 * (Pbest_ij^k - P_ij^k)
              social_part = c_2 * r_2 * (Gbest_j^k - P_ij^k)
              velo_ij^(k+1) = velocity_part + cognitive_part + social_part
              temp = P_ij^k + velo_ij^(k+1)   // rounded value
              if 1 <= temp <= 25 then
                  P_ij^(k+1) = P_ij^k + velo_ij^(k+1)
          calculate fitness function fit_i
          update Pbest, Gbest
  Optimal channel set = Gbest channels

and requires exponential time. In this paper, we use

PSO for selecting an optimal number of channels for

classification, as described in Algorithm 1, which runs in polynomial time. The algorithm starts by randomly initializing a population of particles in the channel search space. Every particle Particle_i, i = 1, 2, ..., K, has properties P_i, velo_i, and Pbest_i, where P_i is the position, velo_i is the velocity, and Pbest_i is the memory of the personal best position. The global best (Gbest) is the best position found among all the particles. The population is a collection of particles, each particle represents a selected channel set, and each particle dimension ranges between 1 and 25. The particle

velocity (velo) is updated by eqn (4), which is a combination of three parts: the momentum term w * velo_ij^k, where w is the inertia weight and velo_ij^k is the memory of the previous direction; the cognitive term, which quantifies the particle's performance relative to its own past experience; and the social term, which quantifies its performance relative to its neighbors.

$velo_{ij}^{k+1} = w \cdot velo_{ij}^{k} + c_1 r_1 (Pbest_{ij}^{k} - P_{ij}^{k}) + c_2 r_2 (Gbest_{j}^{k} - P_{ij}^{k})$    (4)

where k represents the previous iteration and k+1 the current iteration. The right-hand side of eqn (4) uses the values computed in the k-th iteration, and the left-hand side gives the modified velocity of the particle for the (k+1)-th iteration. The position of particle P_ij is updated at the (k+1)-th iteration as given below:

$P_{ij}^{k+1} = P_{ij}^{k} + velo_{ij}^{k+1}$    (5)

where w is the inertia weight, calculated by eqn (6). The acceleration coefficients c_1 and c_2 are set to values between 0 and 1. The parameters r_1 and r_2 are random numbers uniformly distributed between 0 and 1. I_max and I are the maximum and the current iteration, and w_max = 0.9 and w_min = 0.4 are the initial and final values of the inertia weight.

$w = w_{max} - \frac{w_{max} - w_{min}}{I_{max}} \cdot I$    (6)

Each particle is designed to represent a vector of numbers from 1 to 25. The particles are randomly generated initially, and each one later moves to a position representing another set of channels. A particle can change the number of unique channels as it moves in the 25-dimensional space. The fitness function for the channel selection problem is the classification accuracy.
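The update loop of Algorithm 1 and eqns (4)-(6) can be sketched as follows, assuming a user-supplied fitness function that maps a candidate channel set to a classification accuracy; the particle dimensionality, the parameter defaults, and the rounding/uniqueness handling are illustrative assumptions.

```python
import numpy as np

def psocs(fitness, n_particles=20, n_dims=15, n_channels=25,
          max_iter=50, c1=0.0001, c2=0.001, w_max=0.9, w_min=0.4, seed=0):
    """Sketch of PSO-based channel selection (Algorithm 1 / eqns 4-6).

    fitness: callable mapping a vector of channel indices to a score
    (e.g. cross-validated accuracy); all defaults are illustrative.
    """
    rng = np.random.default_rng(seed)
    pos = rng.uniform(1, n_channels, size=(n_particles, n_dims))
    velo = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(np.unique(np.rint(p).astype(int))) for p in pos])
    gbest = pbest[np.argmax(pbest_fit)].copy()
    gbest_fit = pbest_fit.max()

    for k in range(1, max_iter + 1):
        w = w_max - (w_max - w_min) / max_iter * k        # eqn (6)
        r1, r2 = rng.random((2, n_particles, n_dims))
        velo = (w * velo
                + c1 * r1 * (pbest - pos)                 # cognitive part
                + c2 * r2 * (gbest - pos))                # social part, eqn (4)
        cand = pos + velo                                 # eqn (5)
        inside = (cand >= 1) & (cand <= n_channels)       # keep moves within [1, 25]
        pos = np.where(inside, cand, pos)
        for i in range(n_particles):
            fit = fitness(np.unique(np.rint(pos[i]).astype(int)))
            if fit > pbest_fit[i]:
                pbest_fit[i], pbest[i] = fit, pos[i].copy()
                if fit > gbest_fit:
                    gbest_fit, gbest = fit, pos[i].copy()
    return np.unique(np.rint(gbest).astype(int)), gbest_fit
```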

2.4 Clustering

Clustering is the process of dividing a set of data

points into groups so that data points in the same

group are more similar than data points in other

groups. In other words, the goal is to separate groups

with similar characteristics and assign them to clus-

ters. K-means clustering is simpler and has a lin-

ear time complexity with data size (less expensive).

The class-wise K-Means clustering algorithm starts by initializing k centers randomly and assigning each data point x to one of the K subsets of the p-th class. It then iterates until the data points of the p-th class are partitioned into K disjoint clusters clust. The clusters of all classes are then combined, and the objective is to minimize the sum of the distances of each cluster member to its cluster center:

$J = \min \sum_{j=1}^{k} \sum_{d \in clust_j} \| X_d - C_j \|$    (7)


Algorithm 2: SENN.
  Input: feature matrix [features extracted using channels selected by PSOCS (Algorithm 1)], class labels
  Class_cluster_no = n
  // create Class_cluster_no random centers class-wise
  for i = 1 to number of classes do
      class_cluster = training data of class i
      for j = 1 to Class_cluster_no do
          Pos = random position in class_cluster
          center_j = class_cluster(Pos)
  center = concatenation of all classes' cluster centers
  cluster_info = class information of all clusters
  while True do
      for i = 1 to number of training samples do
          class_i_cluster = all centers whose class is the same as class label i
          for j = 1 to number of class_i_cluster do
              Dist_j = distance(x_data_i, class_i_cluster_j)
          Min_dist = min(Dist)
          cluster_index = index of the cluster with minimum distance
          count(cluster_index) = count(cluster_index) + 1
          sum(cluster_index) = sum(cluster_index) + x_data_i
          cluster_no_i = cluster_index
      // find new centers
      for i = 1 to number of centers do
          // only consider clusters whose count is not zero
          if count(i) != 0 then
              mean(i) = sum(i) / count(i)
      center_new = mean
      // sigma value
      for i = 1 to number of centers do
          σ_i = (1/p) * Σ_j ||cluster_ij − Center_i||²   // sum over members j of the i-th cluster
      // stopping condition
      if center_new == center then
          break
      center = center_new
  for i = 1 to number of training samples do
      for j = 1 to number of centers do
          φ_j(x_data_i) = exp(−||x_data_i − center_j||² / (2 σ_j²))
  Output: φ

where C_j is the mean of the data points in set clust_j, given by

$C_j = \frac{1}{T_j} \sum_{d \in clust_j} X_d$    (8)

The algorithm iteratively finds the closest mean C_j to each data point, reallocates the data points to the associated clusters clust_j, and then recomputes the cluster means C_j. K-Means clustering terminates when no data point changes its cluster. Multiple runs can be carried out to find the local minimum with the lowest J.
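A compact sketch of the class-wise K-Means step (eqns (7), (8) and (11)) is shown below, assuming NumPy; the initial number of clusters per class, the interpretation of p as the cluster size in eqn (11), and the handling of empty clusters are assumptions made for illustration.

```python
import numpy as np

def classwise_kmeans(X, y, k_per_class=60, max_iter=100, seed=0):
    """Class-wise K-Means used to seed the SENN hidden neurons (sketch).

    X: (n_samples, n_features) feature matrix, y: class labels. Empty
    clusters are dropped, so the number of surviving centers per class
    "self-evolves"; k_per_class and max_iter are illustrative.
    Returns centers, per-center spreads (sigma, eqn 11), and center labels.
    """
    rng = np.random.default_rng(seed)
    centers, sigmas, labels = [], [], []
    for c in np.unique(y):
        Xc = X[y == c]
        k = min(k_per_class, len(Xc))
        cen = Xc[rng.choice(len(Xc), size=k, replace=False)]
        for _ in range(max_iter):
            # assign each sample of this class to its nearest center
            d = np.linalg.norm(Xc[:, None, :] - cen[None, :, :], axis=-1)
            assign = d.argmin(axis=1)
            keep = np.unique(assign)                      # drop empty clusters
            new_cen = np.stack([Xc[assign == j].mean(axis=0) for j in keep])
            if len(new_cen) == len(cen) and np.allclose(new_cen, cen):
                break
            cen = new_cen
        d = np.linalg.norm(Xc[:, None, :] - cen[None, :, :], axis=-1)
        assign = d.argmin(axis=1)
        for j in np.unique(assign):
            members = Xc[assign == j]
            centers.append(cen[j])
            # eqn (11): mean squared distance of members to the center
            sigmas.append(np.mean(np.sum((members - cen[j]) ** 2, axis=1)))
            labels.append(c)
    return np.array(centers), np.array(sigmas), np.array(labels)
```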

2.5 Self Evolving Radial Basis

Functions Neural Network Design

RBFNN is a single-hidden-layer network whose input layer is fully connected to the hidden layer. The output layer then computes a weighted sum of the hidden layer activations to produce the output. Unlike the Multi Layer Perceptron (MLP), the way weights are calculated in RBFNN is very different, and the role of the hidden layer nodes is easier to interpret. The RBFNN topology is shown in Figure 3.

Figure 3: RBFNN structure.

The hidden layer to the output layer works in the same way

den layer to the output layer works in the same way

as feed-forward MLP, with the sum of the weighted

hidden unit activation given the output unit activa-

tion by eqn(10). The hidden unit activation functions

are given by the basis function φ(x, c

j

,σ

j

), which de-

pends on the center, sigma and the input data. The

mathematical form of RBF Gaussian function

φ

j

(x) = e

−

∥

x

i

−c

j

∥

2

2σ

2

(9)

y

j

(x) =

k

∑

i=1

w

ji

φ

i

(x)) (10)

The sigma (σ) values are usually defined with the p-closest neighbor method, which adjusts the values so that the response of every hidden unit overlaps with that of its neighboring units. The function used is:

$\sigma_j = \frac{1}{p} \sum_{i=1}^{p} \| X_i - C_j \|^2$    (11)

where the X_i are the p closest neighbors of the center C_j.

To explain the working flow of the RBFNN, suppose we have a data set X with N samples of dimension D, given as pairs (x_data_i, y_i), where x_data_i is the input data and y_i is its class label. The output of the hidden unit activation function φ_j, computed by eqn (9), is based on the distance between x_data_i and center_j; the spread of the cluster is also used in forming φ_j. The weights between the hidden and output layers are calculated using the Moore-Penrose generalized pseudo-inverse. Then, the output of the network is calculated using eqn (10).
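A minimal sketch of the RBFNN computation (eqns (9) and (10)) with pseudo-inverse output weights is given below, assuming the centers and sigma values produced by the clustering step; the one-hot target encoding and integer class labels are assumptions made for illustration.

```python
import numpy as np

def rbf_design_matrix(X, centers, sigmas):
    """Hidden layer activations phi (eqn 9) for every sample and center."""
    d2 = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigmas[None, :] ** 2))

def train_rbfnn(X, y, centers, sigmas, n_classes):
    """Output weights via the Moore-Penrose pseudo-inverse (sketch).

    One-hot targets for the 4-class task and labels in 0..n_classes-1
    are assumptions of this sketch.
    """
    phi = rbf_design_matrix(X, centers, sigmas)     # (N, n_centers)
    targets = np.eye(n_classes)[y]                  # one-hot labels
    W = np.linalg.pinv(phi) @ targets               # weights for eqn (10)
    return W

def predict_rbfnn(X, centers, sigmas, W):
    """Class prediction: argmax over the weighted sum of activations."""
    phi = rbf_design_matrix(X, centers, sigmas)
    return np.argmax(phi @ W, axis=1)
```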

2.5.1 Center Generation

Algorithm 2 describes the implementation of the self-evolving neural network (SENN), which finds the centers of the hidden neurons of the RBFNN. These centers reflect the underlying structure of the EEG data and are obtained by K-Means clustering. We also experimented with evolving the RBFNN using the PSO algorithm to produce class-wise centers. At first, random centers were chosen, with each center dimension drawn within a dimension-wise range, and the particles then move using eqn (4). The search space has dimension c * d, where c is the number of centers and d is the number of dimensions. A particle P_i is moved by its velocity Velo_i in the search space, and its position at time t is represented as P_i^t.

Figure 4: PSO convergence with respect to a very small value of c1 on Subject 5, BCI Competition IV dataset IIa.

Figure 5: PSO convergence with respect to a very small value of c2 on Subject 5, BCI Competition IV dataset IIa.

The positions of the centers are updated according to the global and local best in the c * d search space, where each row represents the positions of the sub-cluster centers. Let there be a population P of particles that search by exploring and exploiting through their interactions. The mathematical form of the velocity update towards the minimum is given in eqn (4). The step size taken by the particles is 0.0001 for both c_1 and c_2. Both algorithms employ the same objective function for convergence, as given in eqn (7). The K-Means center generation was already covered in Section 2.4.
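A sketch of evolving the sub-cluster centers of one class with PSO, using the objective of eqn (7), could look as follows; the constant inertia weight, the parameter values, and the dimension-wise initialization range are illustrative assumptions.

```python
import numpy as np

def pso_centers(Xc, n_centers, max_iter=200, n_particles=20,
                c1=0.0001, c2=0.0001, w=0.7, seed=0):
    """Evolve the centers of one class with PSO, minimizing eqn (7) (sketch)."""
    rng = np.random.default_rng(seed)
    lo, hi = Xc.min(axis=0), Xc.max(axis=0)          # dimension-wise range
    d = Xc.shape[1]
    pos = rng.uniform(lo, hi, size=(n_particles, n_centers, d))
    velo = np.zeros_like(pos)

    def J(centers):
        # sum of distances from each sample to its nearest center (eqn 7)
        dist = np.linalg.norm(Xc[:, None, :] - centers[None, :, :], axis=-1)
        return dist.min(axis=1).sum()

    pbest = pos.copy()
    pbest_J = np.array([J(p) for p in pos])
    g = np.argmin(pbest_J)
    gbest, gbest_J = pbest[g].copy(), pbest_J[g]
    for _ in range(max_iter):
        r1, r2 = rng.random((2, n_particles, 1, 1))
        velo = w * velo + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + velo
        for i in range(n_particles):
            j = J(pos[i])
            if j < pbest_J[i]:
                pbest_J[i], pbest[i] = j, pos[i].copy()
                if j < gbest_J:
                    gbest_J, gbest = j, pos[i].copy()
    return gbest
```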

3 DATASET AND

EXPERIMENTAL DETAILS

3.1 Dataset Description

BCI Competition IV dataset IIa has four classes of

motor imagery tasks. The EEG signals were collected

from nine volunteers for four classes: left hand, right

hand, tongue, and feet movement. Two sessions for

motor imagery tasks were recorded from each sub-

ject, one for training and the other for testing. Each

session contains 288 trials recorded with 25 channels

(22 EEG Channels and 3 EOG channels). These chan-

nels are associated with right and left-hand motor im-

agery areas.

BCI Competition IV dataset IIb has two classes

of motor imagery tasks. The EEG signals were col-

lected from nine volunteers for two classes: left-hand

and right-hand movements. Two sessions of motor

imagery tasks were recorded from each subject, one

for training and the other for testing. Each session

contains 120 trials of data recorded with 3 channels

(http://www.bbci.de/competition/iv/).


We categorized the BCI competition data set into

two phases: the subject dependent dataset and the

subject independent dataset. In the dependent dataset,

data is used subject-wise, where each subject has

training and testing data separately for each motor im-

agery class. There are nine subjects in the BCI com-

petition IV IIa dataset, so we train the model subject

wise and test its performance accordingly on the test-

ing dataset.

For validating our algorithm on the subject-

independent dataset, we merge all subject’s training

data into one set and train the model. The testing

is done on the combined testing data. Subject identity is not used during training or testing in the independent phase. Thus, in the dependent phase the model is trained subject-wise, whereas in the independent phase the training is subject-independent.

Figure 6: Number of times each channel appears in the selected channel sets across subjects 1-9 in the BCI Competition IV IIa dataset [channels 6, 12, 18, 23 and 25 are significant for 8 out of 9 subjects].

3.2 Experiment Evaluation

The performance of the proposed algorithm is eval-

uated on BCI Competition IV datasets IIa and IIb,

as discussed above. The BCI Competition IV IIa dataset has nine subjects, and each subject's data is partitioned into training and testing sets. We train the model on the training set and validate the accuracy on the testing set. The best channels are found using PSOCS with a population size of 20; the parameters c1 and c2 for all subjects are shown in Table 1. Features were extracted from the selected channels of the training dataset using the Haar wavelet with parameters n=5 and m=4 (where n and m are decomposition levels). The data is then passed to the class-wise RBFNN K-Means classifier. In this classifier, similar thought patterns are first grouped into sub-clusters, whose center and spread values are then passed as parameters to the RBFNN model for classification. In the class-wise RBFNN K-Means model, we start with k equal to the number of samples in the class, and at each iteration we remove clusters containing no points. Two performance measures were used to evaluate the proposed algorithm: Cohen's kappa coefficient (κ) and accuracy (Acc). The accuracy (Acc) was computed as the ratio

$Acc = \frac{\sum_{t=1}^{n} n_t^1}{\sum_{t=1}^{n} n_t^2} \times 100$    (12)

Cohen’s kappa coefﬁcient given below is used for

evaluating the performance of the proposed algo-

rithm.

$\kappa = \frac{P_o - P_a}{1 - P_a}$    (13)

$P_a = \frac{1}{N^2} \sum_{t=1}^{n} n_t^1 \, n_t^2$    (14)

where P_o and P_a represent the observed agreement and the chance agreement on the test samples, respectively. N is the total number of test samples, n is the total number of classes, n_t^1 is the total number of samples predicted to belong to class t, and n_t^2 is the total number of samples whose actual class is t. The proposed algorithm was implemented using MATLAB R2020b, and all experiments were performed on an Intel(R) Core(TM) i5-4590 CPU (3.30 GHz).
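For reference, a small sketch of computing the two performance measures from predicted and actual labels is given below, assuming NumPy and integer class labels starting at 0; treating accuracy as the fraction of correct predictions is an interpretation of eqn (12).

```python
import numpy as np

def kappa_and_accuracy(y_true, y_pred, n_classes):
    """Cohen's kappa (eqns 13-14) and accuracy from actual/predicted labels.

    Follows the paper's definitions of n_t^1 (samples predicted as class t)
    and n_t^2 (samples actually in class t); labels are assumed to be
    integers in 0..n_classes-1.
    """
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    N = len(y_true)
    n1 = np.bincount(y_pred, minlength=n_classes)   # predicted per class
    n2 = np.bincount(y_true, minlength=n_classes)   # actual per class
    p_o = np.mean(y_true == y_pred)                 # observed agreement
    p_a = np.sum(n1 * n2) / N**2                    # chance agreement, eqn (14)
    kappa = (p_o - p_a) / (1.0 - p_a)               # eqn (13)
    acc = p_o * 100.0
    return kappa, acc
```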

4 RESULTS AND DISCUSSION

In this section, we discuss the results of the proposed algorithm validated on datasets IIa and IIb from BCI Competition IV. (i) Analysis of the PSOCS() algorithm: The PSOCS method was used to choose the best channels out of 25 channels, with the parameters c_1 and c_2 tuned between 0.0001 and 10. To explore the search space, the value of c_1 is varied from 0.0001 to 10 in multiples of 10 [Figure 4]. The convergence of the proposed PSOCS was investigated for subject 5 over 20 iterations. In this experiment, a population size of 20 particles was used. It is observed that the best accuracy of 85.5% is obtained with c_1 = 0.0001. Figure 5 depicts the same experiment for dataset IV IIa and gives an accuracy of 79.8% with c_1 = 0.001.

Subject-wise parameters and channel details are presented in Table 1. For each subject, the best channels were obtained. In order to understand the significance of each channel across all 9 subjects, we computed the total number of subjects for which a channel was selected; this number is plotted on the y-axis in Figure 6. Channels such as 5, 12, 18, 24 and 25 were selected by our algorithm for 8 out of 9 subjects, emphasizing


Table 1: Parameters and channel details of every subject of BCI competition IV IIa dataset using PSOCS().

Subject C1 C2 Channels Accuracy(%)

Sub 1 0.0001 0.01 3 5 7 13 14 15 16 17 20 21 22 24 25 54.51±4.06

Sub 2 0.0001 0.001 4 5 6 8 10 11 12 13 14 15 18 21 23 24 25 49.65±3.12

Sub 3 0.01 0.1 5 7 8 9 10 11 12 14 16 18 19 21 22 23 25 59.13±2.51

Sub 4 0.001 0.1 4 5 9 10 12 13 15 17 18 19 20 22 23 24 25 53.48±3.96

Sub 5 0.0001 0.001 1 4 5 6 7 11 12 13 18 19 20 21 23 24 25 72.86±2.96

Sub 6 0.01 0.1 3 4 5 6 8 9 10 12 13 14 15 16 18 19 20 22 23 24 25 50.45±4.23

Sub 7 1 0.01 2 5 9 10 12 13 15 17 18 21 22 23 25 73.00±3.22

Sub 8 0.01 0.01 4 5 6 7 8 10 12 13 15 17 18 19 20 21 22 23 24 25 67.40±2.86

Sub 9 0.001 0.1 2 7 9 10 11 12 14 15 16 17 18 20 21 22 23 24 55.25±4.02

Table 2: Kappa values of the proposed PSOCS() algorithm and existing approaches for BCI competition IV IIa dataset.

Study Approach 1 2 3 4 5 6 7 8 9 Mean

(Xie et al., 2016) TSSM+LDA 0.77 0.33 0.77 0.51 0.35 0.36 0.71 0.72 0.83 0.59

(Miao et al., 2017) DSFTP 0.63 0.43 0.74 0.54 0.19 0.26 0.63 0.62 0.69 0.53

(Zhang and Eskandarian, 2020) TFCSP 0.62 0.36 0.76 0.40 0.29 0.34 0.59 0.57 0.62 0.51

(Ko et al., 2018) RSTNN 0.69 0.29 0.68 0.34 0.09 0.30 0.57 0.49 0.56 0.45

(Ang et al., 2012) FBCSP 0.68 0.42 0.75 0.48 0.40 0.27 0.77 0.75 0.61 0.57

Proposed Algorithm 0.58 0.49 0.57 0.52 0.72 0.48 0.74 0.67 0.54 0.59

Table 3: Accuracy and Kappa values for all subject on BCI competition IV IIb dataset using selected channels obtained by

PSOCS().

Subject Sub1 Sub2 Sub3 Sub4 Sub5 Sub6 Sub7 Sub8 Sub9

Accuracy (%) 52.12±2.16 58.37±3.56 58.7±5.86 52.13±4.77 73.63±2.34 53.78±4.35 49.14±3.06 59.13±4.06 58.37±2.98

Kappa 0.44±0.02 0.45±0.03 0.49±0.04 0.39±0.04 0.64±0.02 0.44±0.05 0.35±0.03 0.49±0.04 0.46±0.03

Figure 7: Convergence of PSOCS() algorithm on subject 5.

their importance in thought classiﬁcation. The com-

parative analysis with previous studies of BCI compe-

tition IV dataset IIa shows that the proposed algorithm

outperforms some of the existing techniques [Table

2]. Table 3 summarises the performance of our algo-

rithm in terms of accuracy and kappa value of all sub-

jects with all channels of BCI competition IV dataset

IIb. Table 4 shows the confusion matrix for subject 1

of BCI competition IV dataset IIa.

(ii) Analysis of SENN: Table 5 shows the self-evolving property of the model using the K-means and PSO algorithms. Each class has, on average, 60 samples per subject. In the beginning, we pass 60 clusters for each class, and after that the clusters self-


Table 4: Confusion matrix for subject 1 (rows: actual class, columns: predicted class).

              Predicted class
               1    2    3    4
Actual   1    35   16   13    8
class    2    14   44    9    5
         3     8    8   50    6
         4     7    6    7   52

Table 5: Number of hidden neurons produced by SENN()

evolved by Kmean and PSO algorithm in each class of sub-

ject 5 BCI competition IV dataset IIa.

Kmean PSO

Class 1 40 ±1.58 17.4 ±2.61

Class 2 39.2 ±2.95 17.8±1.48

Class 3 38 ±2.83 17.6±4.16

Class 4 37.8 ±2.49 19.4±0.89

Accuracy 73.0 ±3.22 65.8±2.59

evolve by using the K-means and PSO algorithms. A

characteristic of each neuron in the RBFNN model is

the center and spread information of those clusters. It

is evident that the total number of evolved neurons is

the sum total of clusters of all four classes.

(iii) Subject independent Analysis: Figure 7

demonstrates the convergence of the classwise clus-

ters of subject 5 BCI competition IV dataset IIa using

the PSO algorithm, with the parameters c1 and c2 set

to 0.0001. Over 200 iterations, the objective function value decreases. If the c1 and c2 values are larger than 0.0001, there is a high risk of getting trapped in a local minimum due to the clusters' rapid convergence.

For the subject-independent BCI Competition IV dataset IIa, 11 channels [3, 6, 7, 11, 12, 15, 18, 22, 23, 24, 25] were selected with parameters c_1 = 0.1 and c_2 = 0.6, a population size of 20, and 50 iterations. Figure 8 shows the convergence of PSOCS on the subject-independent BCI Competition IV dataset IIa.

Table 6 summarizes the cluster information for

each class. Initially, we pass 100 clusters in each class and obtain 352 converged clusters out of 500. The convergence of K-Means clustering is shown

in Figure 9, wherein the value of the objective func-

tion is minimized in each iteration. Table 7 summa-

rizes all the parameters and accuracy of the indepen-

dent dataset.

Table 6: Number of hidden neurons evolved in each class

of subject independent BCI competition IV dataset IIa.

                          Class 1  Class 2  Class 3  Class 4  Total
Initial no. of clusters      100      100      100      100    500
Obtained clusters             81       89       88       90    352

Table 7: Accuracy of PSOCS() on independent BCI com-

petition IV dataset IIa.

C1    C2    Channels                              Accuracy
0.1   0.6   3,6,7,11,12,15,18,22,23,24,25         71.45%

Figure 8: PSOCS() convergence on independent BCI com-

petition IV IIa dataset.

Figure 9: Clustering convergence on independent BCI com-

petition IV dataset IIa [eqn 7 used for objective function

value].

5 CONCLUSION AND FUTURE

WORK

The proposed model learns from the underlying data and evolves the RBFNN hidden neurons in terms of their number and the locations of their centers. Therefore, the

proposed work can easily be used in other applica-

tions independent of the domain-speciﬁc knowledge,


given the feature vectors representing samples. The

centers of the hidden neurons capture similar clusters

of the training data from a given class. The com-

plex phenomenon of thought patterns is handled ef-

ficiently using the proposed algorithm. Our algorithm outperforms some of the existing techniques, achieving an accuracy of 71.45% for

subject-independent motor imagery task classiﬁcation

for the dataset IIa. The mean kappa value for subject-

dependent task classiﬁcation is obtained as 0.59 for

the same dataset. The algorithm also performed well

for dataset IIb. In the future, we will explore the

potential of this algorithm to work with more com-

plex thought classes such as music and mathematical problem solving. We plan to use transfer learning

methods for other mental activity recognition.

REFERENCES

Agarwal, V. (2019). Adaptive radial basis functions neu-

ral network for motor imagery task classiﬁcation. In

2019 Twelfth International Conference on Contempo-

rary Computing (IC3), pages 1–6. IEEE.

Alam, M. N., Ibrahimy, M. I., and Motakabber, S. (2021).

Feature extraction of eeg signal by power spectral den-

sity for motor imagery based bci. In 2021 8th Interna-

tional Conference on Computer and Communication

Engineering (ICCCE), pages 234–237. IEEE.

Ang, K. K., Chin, Z. Y., Wang, C., Guan, C., and Zhang, H.

(2012). Filter bank common spatial pattern algorithm

on bci competition iv datasets 2a and 2b. Frontiers in

neuroscience, 6:39.

Baig, M. Z., Aslam, N., and Shum, H. P. (2020). Filtering

techniques for channel selection in motor imagery eeg

applications: a survey. Artiﬁcial intelligence review,

53(2):1207–1232.

Bhatti, M. H., Khan, J., Khan, M. U. G., Iqbal, R., Alo-

qaily, M., Jararweh, Y., and Gupta, B. (2019). Soft

computing-based eeg classiﬁcation by optimal feature

selection and neural networks. IEEE Transactions on

Industrial Informatics, 15(10):5747–5754.

Das, A. and Suresh, S. (2015). An effect-size based chan-

nel selection algorithm for mental task classiﬁcation

in brain computer interface. In 2015 IEEE Interna-

tional Conference on Systems, Man, and Cybernetics,

pages 3140–3145. IEEE.

Davoudi, A., Ghidary, S. S., and Sadatnejad, K. (2017). Di-

mensionality reduction based on distance preservation

to local mean for symmetric positive deﬁnite matrices

and its application in brain–computer interfaces. Jour-

nal of neural engineering, 14(3):036019.

Haykin, S. (2005). Neural Networks. Pearson.

Kim, J.-Y., Park, S.-M., Ko, K.-E., and Sim, K.-B. (2012).

A binary pso-based optimal eeg channel selection

method for a motor imagery based bci system. In In-

ternational Conference on Hybrid Information Tech-

nology, pages 245–252. Springer.

Ko, W., Yoon, J., Kang, E., Jun, E., Choi, J.-S., and Suk,

H.-I. (2018). Deep recurrent spatio-temporal neural

network for motor imagery based bci. In 2018 6th In-

ternational Conference on Brain-Computer Interface

(BCI), pages 1–3. IEEE.

Miao, M., Zeng, H., Wang, A., Zhao, C., and Liu, F.

(2017). Discriminative spatial-frequency-temporal

feature extraction and classiﬁcation of motor im-

agery eeg: An sparse regression and weighted naïve bayesian classifier-based approach. Journal of neuro-

science methods, 278:13–24.

Nicolas-Alonso, L. F. and Gomez-Gil, J. (2012). Brain

computer interfaces, a review. sensors, 12(2):1211–

1279.

Qiu, Z., Jin, J., Lam, H.-K., Zhang, Y., Wang, X., and Ci-

chocki, A. (2016). Improved sffs method for channel

selection in motor imagery based bci. Neurocomput-

ing, 207:519–527.

Sheoran, P. and Saini, J. (2022). Optimizing channel selec-

tion using multi-objective fodpso for bci applications.

Brain-Computer Interfaces, 9(1):7–22.

Xie, X., Yu, Z. L., Lu, H., Gu, Z., and Li, Y. (2016).

Motor imagery classiﬁcation based on bilinear sub-

manifold learning of symmetric positive-deﬁnite ma-

trices. IEEE Transactions on Neural Systems and Re-

habilitation Engineering, 25(6):504–516.

Zhang, C. and Eskandarian, A. (2020). A computation-

ally efﬁcient multiclass time-frequency common spa-

tial pattern analysis on eeg motor imagery. In 2020

42nd Annual International Conference of the IEEE

Engineering in Medicine & Biology Society (EMBC),

pages 514–518. IEEE.

Zhang, L. and Wei, Q. (2019). Channel selection in motor

imaginary-based brain-computer interfaces: a particle

swarm optimization algorithm. Journal of integrative

neuroscience, 18(2):141–152.

Zhao, H., Zheng, Q., Ma, K., Li, H., and Zheng, Y.

(2020). Deep representation-based domain adapta-

tion for nonstationary eeg classiﬁcation. IEEE Trans-

actions on Neural Networks and Learning Systems,

32(2):535–545.

Zhou, B., Wu, X., Ruan, J., Zhao, L., and Zhang, L. (2019).

How many channels are suitable for independent com-

ponent analysis in motor imagery brain-computer in-

terface. Biomedical Signal Processing and Control,

50:103–120.
