A Novel Deep Learning Power Quality Disturbance Classification
Method using Autoencoders
Callum O’Donovan, Cinzia Giannetti and Grazia Todeschini
College of Engineering, Swansea University, Fabian Way, Swansea, Wales, U.K.
Keywords: Classification, Feature Extraction, Power Quality Disturbance, Deep Learning, Convolutional Neural
Network, LSTM, Recurrent Neural Network, Autoencoder.
Abstract: Automatic identification and classification of power quality disturbances (PQDs) is crucial for maintaining the efficiency and safety of electrical systems and equipment. In recent years, emerging deep learning techniques have shown potential in performing classification of PQDs. This paper proposes two novel deep learning models, called CNN(AE)-LSTM and CNN-LSTM(AE), that automatically distinguish between normal power system behaviour and three types of PQDs: voltage sags, voltage swells and interruptions. The CNN-LSTM(AE) model achieved the highest average classification accuracy (97.9%) with a 65:35 train-test split. The Adam optimiser and a learning rate of 0.001 were used for ten epochs with a batch size of 64. Both models are trained using real-world data and outperform models found in the literature. This work demonstrates the potential of deep learning in classifying PQDs and hence paves the way to effective implementation of AI-based automated quality monitoring to identify disturbances and reduce failures in real-world power systems.
1 INTRODUCTION
In ideal power systems, voltage and current waveforms are sinusoids at the fundamental frequency (i.e., 50 Hz in Europe and 60 Hz in the USA) (Baggini, 2008). While the amplitude of the voltage waveform is strictly regulated and maintained close to the rated value, the current waveform is more variable, as it depends on the rating of the loads connected to the system and their power demand. Any deviation
from the ‘ideal’ waveform is defined as a power quality
disturbance (PQD) (Bollen, 2003). Numerous PQDs
exist in practice, and power quality standards have been developed to classify each disturbance and to set acceptable limits (IEEE Standards Association, 2019).
Because voltage waveforms are generally more
stable and less subject to fluctuations of electricity
demand, this research focused on the classification of
voltage signals, and on the identification of three
PQDs: voltage sags/dips, voltage swells and
interruptions. A voltage sag is a reduction of the voltage amplitude to between 5% and 90% of the nominal (rated) voltage, a voltage swell is an increase of the voltage amplitude above 105% of the nominal voltage, and an interruption is a reduction of the voltage amplitude below 10% of the nominal voltage.
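As a worked example of these thresholds, the following minimal Python sketch (a hypothetical helper, not part of the proposed models) labels a measurement by its RMS magnitude in per-unit, where 1.0 corresponds to the nominal voltage; the overlap between the sag and interruption ranges above is resolved in favour of interruption:

def label_disturbance(rms_pu):
    # Thresholds follow the definitions above (per-unit of nominal voltage).
    if rms_pu < 0.10:
        return "interruption"   # amplitude below 10% of nominal
    if rms_pu < 0.90:
        return "voltage sag"    # reduction to between 5% and 90% of nominal
    if rms_pu > 1.05:
        return "voltage swell"  # increase above 105% of nominal
    return "normal"

print(label_disturbance(0.70))  # -> voltage sag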
Small deviations from the rated voltage value are
acceptable and do not harm the electricity system or
the equipment. With increasing levels of PQDs, some
detrimental effects can be observed. For example,
excessive fluctuations of the voltage waveform over extended periods of time may damage equipment connected to the power grid, for instance causing motor failures (Wang & Chen, 2019).
In recent years, with the increase of power-electronics-based devices connected to the power grid
(such as renewable energy sources and electric
vehicles), PQDs have become more common, thus
leading to concerns for utilities and power system
operators in terms of guaranteeing the quality of the
electrical energy supplied to their customers. As a
result, increasing numbers of power quality monitors
are currently being installed, thus allowing the
collection of large amounts of voltage and current
data (Demirci et al., 2011). Analysis of these waveforms and identification of PQDs allow mitigating solutions to be implemented, thus improving system operating conditions and extending the lifetime of the equipment (Wang & Chen, 2019).
Various PQD classification methods exist, as
described in (Demirci et al., 2011). Historically,
PQDs have been classified using visual inspection of
the voltage and current waveforms (Wang & Chen,
2019). Later on, techniques have been developed to
automatically detect and classify PQDs, based on
signal processing techniques (Bravo-Rodriquez,
Torres and Borrás, 2020). In recent years, some
methods have been proposed to provide automatic
classification of PQDs using Big Data with machine
learning (Wang & Chen, 2019).
Machine learning is a broad term that refers to algorithms that can learn from large amounts of data.
In recent years, machine learning has gained much
popularity due to development of more accurate
algorithms, increased training data availability and
increased computational resources worldwide
(Jordan & Mitchell, 2015). Machine learning models
can be used for a vast range of tasks such as credit-
card fraud detection, speech recognition and medical
diagnosis (Jordan & Mitchell, 2015). Deep Learning refers to a particular class of Machine Learning techniques used for learning high-level features from data in a hierarchical manner using stacked, layer-wise architectures (Goodfellow & Bengio, 2015).
Among these are convolutional neural networks
(CNNs), long short-term memory networks
(LSTMs), convolutional autoencoders (CAEs), and
LSTM autoencoders. Deep Learning models
demonstrate excellent predictive capabilities in image
and speech recognition, natural language processing
(NLP), and intelligent gamification (Goodfellow &
Bengio, 2015). We demonstrate the application of
Deep Learning to automatically detect PQ events
using real world datasets. The proposed deep learning
models are capable of accurately classifying four
different types of PQ events and outperform other
models proposed in the literature.
The machine learning models explored in this paper combine several of these techniques, namely CNNs, LSTMs, CAEs and LSTM autoencoders.
The paper is organised as follows. Section 2 provides background information on machine learning techniques used for PQD classification. Section 3 describes the methodology; results are presented in Section 4 and the paper is concluded in Section 5.
2 BACKGROUND
In this section, a review of deep learning techniques in the context of PQDs is provided.
2.1 Convolutional Neural Networks
(CNNs)
CNNs are a type of artificial neural network (ANN) used for feature extraction that primarily take images as input but can also handle other data such as words and temporal signals (O’Shea & Nash, 2015), (Kalchbrenner, Grefenstette & Blunsom, 2014), (Palaz, Collobert & Magimai-Doss, 2013). In (Bagheri, Gu & Bollen, 2018), deep CNNs were utilised to perform automatic feature extraction and to classify different types of voltage dips (sags) recorded by power quality monitors (specifically, PQube meters). Pre-processed data was used rather than raw data. The model was trained and tested in three case studies (C1, C2 and C3), each handling three voltage datasets in a different way. The three datasets were from Sweden (D1), worldwide (D2) and the UK (D3) (Bagheri et al., 2018). This method is summarised in Table 1.
Table 1: Summary of the different ways data was used in (Bagheri, Gu & Bollen, 2018).

Case Study | Training Set    | Testing Set
C1         | 0.75 (D1+D2+D3) | 0.25 (D1+D2+D3)
C2         | D2+D3           | D1
C3         | D1+D2           | D3
Model performance was represented as loss, accuracy, classification rate and false alarm rate, but only accuracy is discussed here to align with this project (Bagheri et al., 2018). For C1, C2 and C3, the accuracy was 97.72%, 95.18% and 93.59%, respectively (Bagheri et al., 2018). These results suggest CNNs are effective for PQD classification; other works in the literature also use data from http://map.pqube.com/. The architecture proposed in (Bagheri et al., 2018) is summarised in Table 2, and a sketch of it follows the table. Both the batch size and the number of epochs were set to 250, and the Adam optimiser was applied (Bagheri et al., 2018).
Table 2: Summary of the architecture proposed in (Bagheri, Gu & Bollen, 2018).

Layers                        | Filter Size / No. of Cells
2D Conv1 + ReLU               | (5, 5) x 16
2D Conv2 + ReLU + Max-Pooling | (3, 3) x 32
2D Conv3 + ReLU               | (3, 3) x 64
2D Conv4 + ReLU + Max-Pooling | (3, 3) x 128
FC1 + ReLU                    | 1024
FC2 + ReLU                    | 128
FC3 + Softmax                 | 7
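For illustration, a minimal Keras sketch of the architecture in Table 2; the layer sequence and the seven-class output follow the table, while the 2D input shape is an assumption, as it is not specified here:

# Sketch of the CNN summarised in Table 2 (Bagheri et al., 2018).
# The input shape is an illustrative assumption.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(64, 64, 1)),                # assumed 2D input
    layers.Conv2D(16, (5, 5), activation="relu"),   # 2D Conv1 + ReLU
    layers.Conv2D(32, (3, 3), activation="relu"),   # 2D Conv2 + ReLU
    layers.MaxPooling2D(),                          # + Max-Pooling
    layers.Conv2D(64, (3, 3), activation="relu"),   # 2D Conv3 + ReLU
    layers.Conv2D(128, (3, 3), activation="relu"),  # 2D Conv4 + ReLU
    layers.MaxPooling2D(),                          # + Max-Pooling
    layers.Flatten(),
    layers.Dense(1024, activation="relu"),          # FC1 + ReLU
    layers.Dense(128, activation="relu"),           # FC2 + ReLU
    layers.Dense(7, activation="softmax"),          # FC3 + Softmax: 7 classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy")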
The use of CNNs to classify PQDs is also supported by (Balouji & Salor, 2017), which applies CNNs to images of real events from four transmission substations and achieves 100% accuracy. The architecture proposed in (Balouji & Salor, 2017, p. 219) is similar to that of (Bagheri et al., 2018, p. 4); the main difference between the two papers is that (Bagheri et al., 2018) worked with pre-processed data whilst (Balouji & Salor, 2017) used images of voltage waveforms. This project applies raw data, but these two alternatives should be considered in future. The study in (Balouji & Salor, 2017) also found that using 65 to 135 epochs was most suitable.
2.2 Long Short-term Memory
Networks (LSTMs)
LSTMs are a type of recurrent neural network (RNN) that deal with data in a sequential format (Baccouche, Mamalet, Wolf, Garcia & Baskurt, 2011). This, combined with their gating system, which gives them a form of ‘memory’, means that LSTMs are able to put data into context (Baccouche et al., 2011).
LSTMs were utilised for automatic feature
extraction and classification of three-phase voltage
dips collected from various countries (Balouji, Gu,
Bollen, Bagheri & Nazari, 2018). Model performance
was evaluated using classification rate and false alarm
metrics on the test set. Seven different classes were defined, and the median classification and false alarm rates across the dip types were 93.4% and 7.78%, respectively (Balouji et al., 2018). Four LSTM layers with hyperbolic tangent activations were implemented for feature extraction, so that different layers could extract multiscale features (Balouji et al., 2018). Each LSTM layer is accompanied by a batch normalisation layer, and a fully-connected layer (also known as a dense layer) with softmax activation is used as the final layer for classification (Balouji et al., 2018). Before feature extraction and classification, the method pre-processed the voltage sequence data into root mean square (RMS) sequence data and divided it into segments for computational efficiency (Balouji et al., 2018).
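A minimal Keras sketch of this kind of stack follows; the segment length, feature count and the number of memory blocks per layer are assumptions, as (Balouji et al., 2018) do not report them here:

# Sketch of the LSTM stack described above: four tanh LSTM layers, each
# followed by batch normalisation, with a softmax dense output layer.
# Sequence length, feature count and unit counts are illustrative.
from tensorflow.keras import layers, models

model = models.Sequential([layers.Input(shape=(200, 3))])  # assumed RMS segments, 3 phases
for _ in range(4):
    model.add(layers.LSTM(64, activation="tanh", return_sequences=True))
    model.add(layers.BatchNormalization())
model.add(layers.Flatten())
model.add(layers.Dense(7, activation="softmax"))  # seven dip classes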
In (Katić & Stanisavljević, 2018), an LSTM-based network was proposed to automatically detect and
classify voltage dips in real, simulated and
laboratory-produced data. High model performance
was shown through overall classification accuracy
exceeding 97% (Katić & Stanisavljević, 2018).
Together, (Balouji et al., 2018) and (Katić & Stanisavljević, 2018) suggest that applying LSTM layers to classify PQDs could be effective.
2.3 Convolutional Autoencoders
(CAEs)
CAEs combine CNNs’ ability to extract spatially local features with autoencoders’ (AEs) ability to learn features from unlabelled data (unsupervised learning), which allows them to distinguish more subtle features than a CNN could identify alone (Seyfioğlu, Özbayoğlu & Gürbüz, 2018).
A comparison of the performance of a convolutional autoencoder with a multiclass support vector machine (SVM), an autoencoder and a CNN, when classifying different types of human activity based on radar measurements, is shown in (Seyfioğlu et al., 2018). Results showed that the accuracies of a multiclass SVM, an autoencoder, a CNN and a CAE were 76.9%, 84.1%, 90.1% and 94.2%, respectively. This suggests
that using a deep CAE (DCAE) during model
development could result in a higher performance
model than using a traditional CNN or AE.
This approach is supported by another study, which compared the performance of a DCAE to SVM, sparse representation classifier (SRC) and stacked autoencoder (SAE) models when classifying high-resolution synthetic aperture radar (SAR) images (Geng, Fan, Wang, Ma & Chen, 2015). Results showed that the overall accuracies of the SVM, SRC, SAE and DCAE models were approximately 76.92%, 81.08%, 82.45% and 88.11%, respectively (Geng et al., 2015).
These results further support the application of DCAE models for classification, because the DCAE accuracy significantly exceeds that of the other models. Furthermore, the DCAE was the most accurate at classifying four out of the five individual classes (Geng et al., 2015).
Even though both (Seyfioğlu et al., 2018) and
(Geng et al., 2015) propose ‘deep convolutional
autoencoders’, their interpretation of this
nomenclature is different. In (Seyfioğlu et al., 2018),
three convolutional layers are used on each side, with
max-pooling and unpooling layers located between
them. The number of filters applied decreases across
each layer for the first three convolutional layers
(encoding) and increases across each layer for the last
three convolutional layers (decoding) (Seyfioğlu et
al., 2018). However, (Geng et al., 2015) proposes a
convolutional layer in the same network as a
traditional sparse autoencoder (encoder-decoder
made from fully-connected layers rather than
convolutional layers).
As the results from (Seyfioğlu et al., 2018) were encouraging, a CAE with six convolutional layers will be tested; a sketch of this arrangement follows.
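A minimal Keras sketch of such a six-convolutional-layer CAE, following the decreasing-then-increasing filter pattern of (Seyfioğlu et al., 2018) but applied to 1D voltage sequences; the input length and filter counts are assumptions:

# Sketch of a six-convolutional-layer CAE: three encoding layers with
# decreasing filter counts and max-pooling, then three decoding layers
# with increasing filter counts and up-sampling. Sizes are illustrative.
from tensorflow.keras import layers, models

cae = models.Sequential([
    layers.Input(shape=(512, 3)),
    layers.Conv1D(32, 3, padding="same", activation="relu"),  # encoder
    layers.MaxPooling1D(2),
    layers.Conv1D(16, 3, padding="same", activation="relu"),
    layers.MaxPooling1D(2),
    layers.Conv1D(8, 3, padding="same", activation="relu"),   # bottleneck
    layers.Conv1D(16, 3, padding="same", activation="relu"),  # decoder
    layers.UpSampling1D(2),
    layers.Conv1D(32, 3, padding="same", activation="relu"),
    layers.UpSampling1D(2),
    layers.Conv1D(3, 3, padding="same"),                      # reconstruct input
])
cae.compile(optimizer="adam", loss="mse")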
2.4 LSTM Autoencoders
LSTM autoencoders are also known as sequence-to-
sequence autoencoders and have been shown to be
successful in tasks such as machine translation,
natural language generation and reconstruction and
image captioning (Mehdiyev, Lahann, Emrich, Enke,
Fettke & Loos, 2017).
A GRU-based autoencoder presented in (Amiriparian, Freitag, Cummins & Schuller, 2017) successfully classified labelled acoustic scene audio data with an accuracy of 88%. LSTM units were adopted instead of GRUs during model design but did not show a performance improvement, suggesting value in experimenting with both GRU and LSTM autoencoders.
Specific parameter values for the architecture proposed in (Amiriparian et al., 2017) are not given, but the general idea is that RNN layers define the encoder, followed by a fully-connected layer with a hyperbolic tangent activation function. The decoder consists of RNN layers followed by a linear projection layer; hyperbolic tangent activation functions are also applied to the RNN layers’ inputs and outputs (Amiriparian et al., 2017). This type of architecture will be tested when developing models, as the results from (Cho et al., 2014), (Bengio et al., 2015), (Amiriparian et al., 2017) and (Patilkulkarni & Lakshmi, 2013) suggest it improves performance. A sketch of this encoder-decoder arrangement follows.
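As an illustration, a minimal Keras sketch of a sequence-to-sequence autoencoder of this kind; the layer sizes and input shape are assumptions, not values from (Amiriparian et al., 2017):

# Sketch of an LSTM (sequence-to-sequence) autoencoder: an LSTM encoder
# compresses the input to a fixed-length vector, which is repeated and fed
# to an LSTM decoder that reconstructs the sequence via a linear projection.
# All sizes are illustrative assumptions.
from tensorflow.keras import layers, models

timesteps, features = 200, 3
lstm_ae = models.Sequential([
    layers.Input(shape=(timesteps, features)),
    layers.LSTM(64, activation="tanh"),                         # encoder
    layers.RepeatVector(timesteps),                             # latent vector per step
    layers.LSTM(64, activation="tanh", return_sequences=True),  # decoder
    layers.TimeDistributed(layers.Dense(features)),             # linear projection
])
lstm_ae.compile(optimizer="adam", loss="mse")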
2.5 CNN-LSTM
CNN-LSTM networks are neural networks that combine elements of CNNs (mainly convolutional and pooling layers) with elements of LSTM networks (mainly LSTM and flatten layers). CNN-LSTM
networks are used in classification problems as they
provide advantages of both CNNs and LSTMs,
namely the spatial feature extraction ability of CNNs
and the temporal sequential learning ability of
LSTMs (Mohan, Soman & Vinayakumar, 2017).
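A minimal Keras sketch of this hybrid idea on 1D signals; all sizes are illustrative assumptions rather than values from (Mohan, Soman & Vinayakumar, 2017):

# Sketch of a 1D CNN-LSTM hybrid: convolution and pooling extract local
# features, the LSTM then models their temporal order. Sizes are illustrative.
from tensorflow.keras import layers, models

cnn_lstm = models.Sequential([
    layers.Input(shape=(512, 3)),
    layers.Conv1D(16, 3, padding="same", activation="relu"),  # spatial features
    layers.MaxPooling1D(2),
    layers.LSTM(50),                                          # temporal context
    layers.Dense(4, activation="softmax"),                    # 4 PQD classes
])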
A comparison of the performance of several
models that were CNN, RNN, identity recurrent
neural network (I-RNN), LSTM, GRU and CNN-
LSTM based, when classifying synthetic and real-
time PQDs, can be found in (Mohan et al., 2017). The
synthetic data contained eleven different classes
which were both single and combined disturbance
types, whereas real-time data contained only three
classes (Mohan et al., 2017). Results showed that, for synthetic data, the CNN-LSTM model had the highest overall accuracy of 98.4% (Mohan et al., 2017). Only the CNN-LSTM model was tested on real-time data, and it achieved an accuracy of 91.9% (Mohan et al., 2017). These results suggest a CNN-LSTM model can perform accurate classification of both synthetic and real-time PQDs. A batch size of 32 and 1000 epochs were proposed (Mohan et al., 2017).
The performance of a CNN-LSTM model when classifying electrocardiogram (ECG) signals into five different classes for automatic arrhythmia diagnosis was studied in (Oh, Ng, Tan & Acharya, 2018). Results showed that the hybrid model performed with 98.1% accuracy (Oh et al., 2018). This result supports the claim that adoption of a CNN-LSTM model can result in accurate classification performance. Although ECG signals differ from PQ signals, ECG signals are still voltage measurements, albeit taken at the heart, and both data types are periodic. The architecture of the CNN-LSTM proposed in (Oh et al., 2018) is described in good detail. A batch size of ten was chosen and the model was trained for 150 epochs.
In (Garcia et al., 2020), a CNN-LSTM model was
used to classify five different PQDs using voltage
waveforms as training and testing data and achieved
a maximum accuracy of 84.76%.
Based on the literature review, the following networks have been identified as promising for the identification of PQDs: a CNN autoencoder with LSTM and a CNN with an LSTM autoencoder. These networks were therefore adopted for testing with PQD signals. In the following sections, tests carried out using CNN and LSTM elements are described. Additional networks are currently under development and will be presented in future work.
3 METHODOLOGY
The proposed approach comprises two steps. Step 1 is the collection and pre-processing of data. Step 2 is the development of two models using Design of Experiments (DOE) and suggestions from the literature.
3.1 Step 1: Data Collection &
Pre-processing
This work uses real data recorded by PQube power
quality monitors. The data can be accessed online and
is openly available (Power Standards Lab, 2019).
Each sample contains three-phase voltage data (L1-
N, L2-N and L3-N) for varying numbers of time-
steps, accompanied by the label for the type of PQD
present. The source website contains PQD data from
numerous PQube meters located around the world.
Different meters had different software versions
and were recording data for different types of
electrical systems, which meant each meter had a
different range of disturbances. In addition, some
meters recorded voltages in L-N format, some in L-L
format, and some only recorded voltages in one or
two phases rather than three. The dataset used in this work was retrieved from the PQube map website (Power Standards Lab, 2019) and is summarised in Table 3.
Table 3: Number of samples for each class.

Class             | Number of Samples
Snapshot (Normal) | 976
Voltage Sag/Dip   | 315
Voltage Swell     | 275
Interruption      | 123
Total             | 1689
After the data was imported, it was split into training and testing sets using the Python ‘train_test_split’ function (from the scikit-learn library). A 65:35 train-test split was adopted, as ratios of 80:20, 75:25 and 70:30 were also tried and gave poorer results. This was likely because 65% of the data was enough for the model to learn to predict the classes, whereas any higher training percentage resulted in the model learning the training data too well and therefore performing poorly on the test data (overfitting).
As shown in Table 3, there is a significant imbalance between the number of samples of each class, with the snapshot class being a large majority: of the 1689 samples, 976 (approx. 57.8%) are snapshots. If the model was trained and tested with this imbalance, performance could suffer, because a model could achieve approximately 57.8% accuracy simply by classifying every sample as a snapshot, which is undesirable (Towards Data Science, 2019). Therefore, to solve this problem, oversampling was performed to balance the number of samples of each class. Several oversampling options exist and multiple methods were attempted; the RandomOverSampler function (from the imbalanced-learn library) was chosen, which works by choosing random samples of the minority class or classes and duplicating them until the classes are balanced (Towards Data Science, 2019). A sketch of these two steps follows.
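A minimal sketch of the split and balancing steps, assuming the samples have been flattened into a feature matrix X with a label vector y (placeholders below); stratification of the split is an assumption:

import numpy as np
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import RandomOverSampler

X = np.random.randn(1689, 600)           # placeholder for the PQube samples
y = np.random.randint(0, 4, size=1689)   # placeholder labels, 4 classes

# 65:35 train-test split (stratification is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.35, stratify=y, random_state=0)

# Duplicate random minority-class samples until all classes are balanced.
ros = RandomOverSampler(random_state=0)
X_train, y_train = ros.fit_resample(X_train, y_train)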
3.2 Step 2: Model Training & DOE
Optimisation
Initially, elements from the models proposed in
(Mohan et al., 2017), (Seyfioğlu et al., 2018) and
(Balouji & Salor, 2017) were combined to produce a
convolutional autoencoder with LSTM model named
CNN(AE)-LSTM. This achieved an average accuracy
of 93.8% over ten runs. As suggested by (Mehdiyev
et al., 2017), (Cho et al., 2014) and (Bengio et al.,
2015), the next model developed replaced the convolutional autoencoder element of the CNN(AE)-LSTM model with a plain CNN element, and replaced the LSTM element with an LSTM autoencoder. This model was named CNN-LSTM(AE) and achieved an average accuracy of 96.6% over ten runs. A sketch of this model follows.
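A minimal Keras sketch of the CNN-LSTM(AE) idea, using the best settings reported later in Table 7 and the training configuration from the abstract (Adam, learning rate 0.001, ten epochs, batch size 64); the exact layer arrangement and input shape are assumptions, as the full architecture is not listed in this paper:

# Sketch of the CNN-LSTM(AE) idea: a CNN front end followed by an
# LSTM-autoencoder-style 27-15-8-15-27 stack and a softmax classifier.
# Layer arrangement and input shape are illustrative assumptions.
from tensorflow.keras import layers, models, optimizers

model = models.Sequential([
    layers.Input(shape=(512, 3)),
    layers.Conv1D(8, 3, strides=1, padding="same", activation="relu"),
    layers.MaxPooling1D(2),
    layers.Dropout(0.3),
    layers.LSTM(27, return_sequences=True),
    layers.LSTM(15, return_sequences=True),
    layers.LSTM(8, return_sequences=True),
    layers.LSTM(15, return_sequences=True),
    layers.LSTM(27),
    layers.Dense(4, activation="softmax"),  # snapshot, sag, swell, interruption
])
model.compile(optimizer=optimizers.Adam(learning_rate=0.001),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=10, batch_size=64)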
The second aim of Step 2 was to find the optimal model parameters, which was achieved using orthogonal arrays. Five factors were chosen for optimisation, namely: number of convolutional filters, convolutional and max-pooling strides, dropout rate, number of LSTM memory blocks and max-pooling filter size.
Each factor had three different levels, mostly taken from the literature. The exceptions were a CNN filter combination of 16, 8, 4, 8, 16, a stride of two, a dropout rate of 0.7 and a max-pooling filter size of four; these were chosen experimentally for convenience and are not informed by the literature. All settings are summarised in Table 4 and Table 5, and a sketch of the resulting experiment loop follows Table 5. Note that the convolutional filter sizes were not changed, as previous work (Balouji & Salor, 2017), (Mohan et al., 2017) and (Seyfioğlu et al., 2018) agreed that three was best. The orthogonal array applied was L₂₇(3⁵). For the two optimised models, the time per epoch was compared.
Table 4: Parameters and levels chosen for the orthogonal array for the CNN(AE)-LSTM model.

Parameter               | Setting 1     | Setting 2       | Setting 3
No. of Conv Filters     | 8, 4, 2, 4, 8 | 16, 8, 4, 8, 16 | 32, 16, 8, 16, 32
Conv & Pooling Strides  | 1             | 2               | 3
Dropout Rate            | 0.3           | 0.5             | 0.7
LSTM Memory Blocks      | 20            | 50              | 128
Max-Pooling Filter Size | 2             | 3               | 4
Table 5: Parameters and levels chosen for the orthogonal array for the CNN-LSTM(AE) model.

Parameter               | Setting 1         | Setting 2         | Setting 3
No. of Conv Filters     | 8                 | 16                | 32
Conv & Pooling Strides  | 1                 | 2                 | 3
Dropout Rate            | 0.3               | 0.5               | 0.7
LSTM Memory Blocks      | 27, 15, 8, 15, 27 | 62, 32, 8, 32, 62 | 128, 64, 32, 64, 128
Max-Pooling Filter Size | 2                 | 3                 | 4
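A minimal sketch of the orthogonal-array experiment loop over the CNN-LSTM(AE) settings in Table 5; a full L₂₇(3⁵) array has 27 rows, of which only the first three are shown, and build_model is a hypothetical constructor:

# Sketch of the orthogonal-array search over the Table 5 settings.
# Only the first rows of the L27(3^5) array are shown; build_model is
# a hypothetical constructor for a model with the given settings.
SETTINGS = {
    "conv_filters": [8, 16, 32],
    "strides": [1, 2, 3],
    "dropout": [0.3, 0.5, 0.7],
    "lstm_blocks": [[27, 15, 8, 15, 27], [62, 32, 8, 32, 62],
                    [128, 64, 32, 64, 128]],
    "pool_size": [2, 3, 4],
}
L27_ROWS = [(0, 0, 0, 0, 0), (0, 1, 1, 1, 1), (0, 2, 2, 2, 2)]  # first 3 of 27

for row in L27_ROWS:
    config = {name: levels[i]
              for (name, levels), i in zip(SETTINGS.items(), row)}
    # model = build_model(**config)  # hypothetical constructor
    # model.fit(X_train, y_train, epochs=10, batch_size=64)
    print(config)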
4 RESULTS & DISCUSSION
All models were run ten times and the overall accuracies are reported in Table 6, which shows slightly better accuracies overall for the CNN-LSTM(AE) model than for the CNN(AE)-LSTM model. In addition, no significant drops in accuracy exist for the CNN-LSTM(AE) model, likely because the feature extraction of the LSTM autoencoder was more effective than that of the plain LSTM.
Table 6: Testing accuracies achieved for every run of every model that initially had high performance during Step 2.

          |      Accuracy (%)
Test No.  | CNN(AE)-LSTM | CNN-LSTM(AE)
1         | 95.6         | 96.7
2         | 93.5         | 96.1
3         | 96.5         | 98.2
4         | 95.8         | 96.8
5         | 97.2         | 97.1
6         | 83.2         | 97.4
7         | 96.0         | 94.8
8         | 93.7         | 96.9
9         | 92.3         | 95.2
10        | 94.8         | 96.9
Avg.      | 93.9         | 96.6
Std. Dev. | 4.0          | 1.0
Table 7 shows that, for the CNN-LSTM(AE) model, smaller convolutional and pooling strides resulted in better performance. At first glance, this is not supported by the CNN(AE)-LSTM results. However, the CAE element of the CNN(AE)-LSTM model uses four convolutional layers and two max-pooling layers, meaning that a smaller stride was much more computationally demanding, causing memory failures at a stride of one. A stride of two worked, but it did not emerge as the optimal setting because a stride of three happened to be tested in combination with better settings of the other parameters.
No trends were found regarding the number of LSTM memory blocks used, which appeared to have less impact on testing accuracy than the CNN-based parameters. Setting the max-pooling and up-sampling filter sizes to two was consistently the best option for this application, in line with suggestions in the literature (Mohan et al., 2017), (Seyfioğlu et al., 2018), (Liu et al., 2017). In addition, smaller max-pooling filter sizes mean less smoothing, so more data is preserved. The up-sampling size is the factor by which each sample is magnified (Medium, 2018).
The LSTM autoencoder model surpassed the model using a normal LSTM layer, as LSTM autoencoders learn representations of the data more thoroughly than plain LSTM elements.
Table 7: Summary of the settings leading to the best experimental results for each model, with average accuracies achieved and standard deviation values.

Parameter                             | CNN(AE)-LSTM    | CNN-LSTM(AE)
No. of Conv Filters                   | 16, 8, 4, 8, 16 | 8
Conv & Pooling Strides                | 3               | 1
Dropout Rate                          | 0.3             | 0.3
LSTM Memory Blocks                    | 50              | 27, 15, 8, 15, 27
Max-Pooling & Up-Sampling Filter Size | 2               | 2
Average Accuracy                      | 0.952           | 0.979
Standard Deviation                    | 0.012           | 0.006
Table 8 shows the time taken for each model to run one epoch. Three epoch-time trials were run for each model and an average was taken to minimise error, as there was some variation between executions. The results show that the model employing an LSTM autoencoder required four to five times the time per epoch of the model without one. The CNN(AE)-LSTM was fast but less accurate than the CNN-LSTM(AE) model, which was quite slow but more accurate.
Table 8: Average times per epoch achieved by each model's best experimental set-up, in seconds.

        | CNN(AE)-LSTM | CNN-LSTM(AE)
Trial 1 | 8.003        | 35.014
Trial 2 | 5.002        | 32.013
Trial 3 | 5.002        | 32.012
Average | 6.002        | 33.013
5 CONCLUSIONS
In this paper, two deep learning models for predicting PQDs have been proposed and tested, namely CNN(AE)-LSTM and CNN-LSTM(AE). These models achieved average accuracies of 95.153% and 97.894%, with standard deviations of 0.012 and 0.006, respectively.
The CNN-LSTM(AE) achieved the higher accuracy but was relatively slow, whilst the CNN(AE)-LSTM achieved poorer accuracy but was much quicker per epoch.
In the model optimisation step, it was found that a stride of one was more accurate but more computationally demanding, affecting memory usage the most. Larger filter sizes and strides caused lower accuracy due to the lower resolution of the data captured by the filters, whilst more convolutional filters resulted in higher accuracies.
Generally, a dropout rate of 0.3 was best. CNN layers appeared to be more computationally demanding and more effective than LSTM layers, possibly because CNN layers are generally used with images and filter pixel values arranged in a matrix (similar to the PQube data), unlike LSTM layers, which are generally used with sequences. Accuracy showed no relationship with the number of LSTM memory blocks or the decomposition level.
The CNN-LSTM(AE) exceeded the performance of models in the literature (Bagheri et al., 2018), (Balouji et al., 2018), (Garcia et al., 2020), (Uyar et al., 2008), (Abdel-Galil et al., 2004), of which some worked with synthetic data and others with real data, whilst the CNN(AE)-LSTM exceeded some of these (Balouji et al., 2018), (Garcia et al., 2020), (Uyar et al., 2008), (Abdel-Galil et al., 2004). The next steps of this research will consist of further development of the proposed models and of testing their accuracy in detecting other PQDs.
6 COPYRIGHT FORM
For this paper, the authors provide SCITEPRESS
Consent to Publish and Transfer of Copyright.
ACKNOWLEDGEMENTS
The authors would like to acknowledge the M2A
funding from the European Social Fund via the Welsh
Government (c80816) and the Engineering and
Physical Sciences Research Council (G. Todeschini:
Project EP/T013206/1; Dr Giannetti: Project
EP/S001387/1).
REFERENCES
Abdel-Galil, T.K., Kamel, M., Youssef, A.M., El-Saadany, E.F. & Salama, M.M.A. (2004). Power Quality Disturbance Classification Using the Inductive Inference Approach. IEEE.
Amiriparian, S., Freitag, M., Cummins, N. & Schuller, B.
(2017). Sequence to Sequence Autoencoders for
Unsupervised Representation Learning from Audio.
ResearchGate.
Baccouche, M., Mamalet, F., Wolf, C., Garcia, C. &
Baskurt, A. (2011). Sequential Deep Learning for
Human Action Recognition. Springer.
Baggini, A. (2008). Handbook of Power Quality. Wiley.
Bagheri, A., Gu, I.Y.H. & Bollen, M.H.J. (2018). A Robust
Transform-Domain Deep Convolutional Network for
Voltage Dip Classification. IEEE.
Balouji, E. & Salor, O. (2017). Classification of Power
Quality Events Using Deep Learning on Event Images.
IEEE.
Balouji, E., Gu, I.Y.H., Bollen, M.H.J., Bagheri, A., Nazari,
M. (2018). A LSTM-based Deep Learning Method with
Application to Voltage Dip Classification. IEEE.
Bengio, S., Vinyals, O., Jaitly, N. & Shazeer, N. (2015).
Scheduled Sampling for Sequence Prediction with
Recurrent Neural Networks. arXiv.
Bollen, M.H.J. (2003). What is power quality? Elsevier.
Bravo-Rodriquez, J.C., Torres, F.J. & Borrás, M.D. (2020).
Hybrid Machine Learning Models for Classifying
Power Quality Disturbances: A Comparative Study.
MDPI.
Cho, K., Merrienboer, B.V., Gulcehre, C. & Bougares, F.
(2014). Learning Phrase Representations using RNN
Encoder – Decoder for Statistical Machine Translation.
arXiv.
D2L. (n.d.). Padding and Stride.
Demirci, T., Kalaycioglu, A., Kucuk, D., Salor, O., Guder,
M., Pakhuylu, S. et al. (2011). Nationwide real-time
monitoring system for electrical quantities and power
quality of the electricity transmission system. IEEE.
Garcia, C.I., Grasso, F., Luchetta, A., Piccirilli, M.C.,
Paolucci, L. & Talluri, G. (2020). A Comparison of
Power Quality Disturbance Detection and
Classification Methods Using CNN, LSTM and CNN-
LSTM. MDPI.
Geng, J., Fan, J., Wang, H., Ma, X., Li, B. & Chen, F.
(2015). High-Resolution SAR Image Classification via
Deep Convolutional Autoencoders. IEEE.
Goodfellow, I. & Bengio, Y. (2015). Deep learning. MIT
Press.
IEEE Standards Association. (2019). IEEE 1159-2019 -
IEEE Recommended Practice for Monitoring Electric
Power Quality. IEEE.
Jordan, M.I. & Mitchell, T.M. (2015). Machine learning:
Trends, perspectives, and prospects. ScienceMag.
Kalchbrenner, N., Grefenstette, E. & Blunsom, P. (2014).
A Convolutional Neural Network for Modelling
Sentences. arXiv.
Katić, V.A. & Stanisavljević, A.M. (2018). Smart Detection
of Voltage Dips Using Voltage Harmonics Footprint.
IEEE.
Liu, Q., Zhou, F., Hang, R. & Yuan, X. (2017). Bidirectional-Convolutional LSTM Based Spectral-Spatial Feature Learning for Hyperspectral Image Classification. arXiv.
Medium. (2018). Basic Overview of Convolutional Neural
Network (CNN).
Mehdiyev, N., Lahann, J., Emrich, A., Enke, D., Fettke, P.
& Loos, P. (2017). Time Series Classification using
Deep Learning for Process Planning: A Case from the
Process Industry. ScienceDirect.
Mohan, N., Soman, K.P. & Vinayakumar, R. (2017). Deep
Power: Deep Learning Architectures for Power Quality
Disturbances Classification. IEEE.
O’Shea, K. & Nash, R. (2015). An Introduction to
Convolutional Neural Networks. arXiv.
Oh, S.L., Ng, E.Y.K., Tan, R.S. & Acharya, U.R. (2018).
Automated diagnosis of arrhythmia using combination
of CNN and LSTM techniques with variable length
heart beats. ScienceDirect.
Palaz, D., Collobert, R. & Magimai-Doss M. (2013).
Estimating Phoneme Class Conditional Probabilities
from Raw Speech Signal using Convolutional Neural
Networks. arXiv.
Patilkulkarni, S. & Lakshmi H.C.V. (2013). Vanishing
Moments of a Wavelet System and Feature Set in Face
Detection Problem for Color Images. ResearchGate.
Power Standards Lab. (2019). PQube - Live World Map of
Power Quality.
Seyfioğlu, M.S., Özbayoğlu, A.M. & Gürbüz S.Z. (2018).
Deep convolutional autoencoder for radar-based
classification of similar aided and unaided human
activities. IEEE.
Towards Data Science. (2019). A Deep Dive Into
Imbalanced Data: Over-Sampling.
Uyar, M., Yildirim, S. & Gencoglu, M.T. (2008). An
effective wavelet-based feature extraction method for
classification of power quality disturbance signals.
ScienceDirect.
Wang, S. & Chen, H. (2019). A novel deep learning method
for the classification of power quality disturbances
using deep convolutional neural network. Elsevier.