Image Edge Enhancement for Effective Image Classification
Bu Tianhao¹, Michalis Lazarou² and Tania Stathaki²
¹ Glory Engineering & Tech Co., LTD, China
² Imperial College London, U.K.
Keywords:
Data Augmentation, Image Classification, High Boost Filtering, Edge Enhancement.
Abstract:
Image classification has been a popular task due to its feasibility in real-world applications. Training neural networks on RGB images has demonstrated considerable success on this task. Nevertheless, improving the classification accuracy and computational efficiency of this process continues to present challenges that researchers are actively addressing. A widely embraced method to improve the classification performance of
neural networks is to incorporate data augmentations during the training process. Data augmentations are
simple transformations that create slightly modified versions of the training data, and can be very effective
in training neural networks to mitigate overfitting and improve their accuracy performance. In this study, we
draw inspiration from high-boost image filtering and propose an edge enhancement-based method as means to
enhance both accuracy and training speed of neural networks. Specifically, our approach involves extracting
high frequency features, such as edges, from images within the available dataset and fusing them with the
original images, to generate new, enriched images. Our comprehensive experiments, conducted on two distinct datasets (CIFAR10 and CALTECH101) and three different network architectures (ResNet-18, LeNet-5 and CNN-9), demonstrate the effectiveness of our proposed method.
1 INTRODUCTION
Deep learning has undeniably been at the forefront of
technological advancement over the past decade, with
a transformative influence, particularly in the field of
computer vision (He et al., 2016). However, training
neural networks effectively for computer vision tasks
remains challenging because of the reliance on substantial amounts of data to avoid overfitting (Zhang et al., 2019).
Data augmentation is a widely recognized tech-
nique for mitigating model overfitting and enhancing
generalization, effectively addressing the issue of in-
sufficient sample diversity (Shao et al., 2023). A com-
mon practice within the realm of data augmentation
is to apply colour space and/or other geometric trans-
formations to the original images in order to increase
the size and diversity of the training dataset. Some
notable works of data augmentation techniques in-
clude image mix-up approaches, such as the creation
of new images by directly merging two images from
distinct classes (Zhang et al., 2017), replacing specific
regions of an image with patches from other images
(Yun et al., 2019) and various other inventive varia-
tions (Liu et al., 2022), (Yin et al., 2021), (Kim et al.,
2020). Other approaches utilize the frequency domain
information (Chen and Wang, 2021), (Mukai et al.,
2022) with a focus on generating new sets of images
from diverse sources.
In our work, we present a novel data augmentation method that transforms the original images by merging them with their corresponding extracted edges; we refer to this method as Edge Enhancement (E²). It operates on the same principle as high-boost image filtering, a technique that accentuates the edges of an image without completely eliminating the background. The proposed E² method extracts the edges (high frequency components) from each of the RGB channels of the original images and subsequently integrates these extracted features with the original image to yield the edge-enhanced images. E² adheres to the fundamental concept of introducing subtle modifications by utilizing these extracted features to generate new images, all without relying on complex algorithms, thus ensuring efficient preprocessing.
The main contributions of our work are summa-
rized as follows:
1. We propose a novel data augmentation method
that enhances the semantic information of every
image in the training dataset which in turn im-
proves the training of neural networks.
2. Our method has undergone rigorous evaluation on two distinct datasets, CIFAR10 and CALTECH101, employing neural network models including CNN-9, LeNet-5 and ResNet-18. These experiments have consistently showcased substantial improvements in classification accuracy, accompanied by a reduced number of training epochs required to reach the optimal classification accuracy.

Figure 1: Edge enhancement process. Leftmost: original image. Mid left: corresponding R, G, B channels of the original labelled data. Mid right: corresponding extracted edges. Rightmost: edge-enhanced image.
2 BACKGROUND
2.1 Data Augmentation
Data augmentation is a technique that expands dataset
sizes by introducing subtle alterations.
The simplest data augmentation strategies apply geometric transformations such as rotation and flipping (Chlap et al., 2021). Our research
aligns with this principle of generating new labeled
data by modifying the original dataset. However, it
deviates from the conventional methods of rotation
and flipping. Instead, we propose a novel approach
that prioritizes the enhancement of semantic informa-
tion. This innovative method draws inspiration from
the concept of high-boost filtering, which will be elab-
orated upon in the subsequent section.
2.2 High Boost Filtering
High-boost filtering is a well-established image pro-
cessing technique, representing a variation of high-
pass filtering. In contrast to high-pass filtering, high-
boost filtering aims to accentuate an image’s high-
frequency information while preserving the back-
ground. Let’s denote an image undergoing this pro-
cess as I(u,v) (Srivastava et al., 2009), where u,v cor-
respond to spatial coordinates. High boost filtering
can be formally defined as follows:

    I_boost(u,v) = A · I(u,v) − I_low(u,v)                          (1)

where I_boost(u,v) is the high-boosted image pixel at location (u,v), A is a scaling factor greater than 1 (common choices are 1.15 and 1.2) and I_low(u,v) is the low-pass filtered image pixel. Respectively, high-pass filtering can be presented in a similar way as

    I_high(u,v) = I(u,v) − I_low(u,v)                               (2)

where I_high(u,v) is the high-pass filtered image pixel at location (u,v). Our method employs the concept of high-boost filtering, where we combine the high-pass filtered image with the original version, with A = 2. This selection of A can be viewed as a particular instance of high-boost filtering.
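For illustration, the following is a minimal sketch of high-boost filtering in Python; it assumes a single-channel float image in [0, 1] and uses SciPy's Gaussian filter as the low-pass filter, neither of which is prescribed by the formulation above, and the function name is ours.

import numpy as np
from scipy.ndimage import gaussian_filter

def high_boost(image, A=2.0, sigma=1.0):
    # High-boost filtering as in Eq. (1): amplify the image and subtract
    # a low-pass (Gaussian-smoothed) version of it.
    low = gaussian_filter(image, sigma=sigma)   # I_low(u, v)
    boosted = A * image - low                   # A * I(u, v) - I_low(u, v)
    return np.clip(boosted, 0.0, 1.0)           # keep intensities in range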
2.3 Canny Edge Detection
The Canny edge detection algorithm was invented by John F. Canny in 1986 (Canny, 1986). The procedure of Canny edge detection is as follows: 1)
Noise suppression within the image is accomplished
by employing a Gaussian filter (a low-pass filter)
(Deng and Cahill, 1993). The kernel for the 2-D Gaussian filter can be represented as:

    G(u,v) = 1/(2πσ²) · exp(−(u² + v²)/(2σ²))                       (3)

The denoised result is achieved by convolving the kernel with the image, where σ² represents the filter variance. 2) Determine the edge strength G and the four orientations θ (horizontal, vertical and the two diagonals), expressed as

    G = (G_u² + G_v²)^0.5                                           (4)

    θ = tan⁻¹(G_v / G_u)                                            (5)

where G_u, G_v are the first order derivatives (gradients) in the horizontal and vertical directions. 3)
Non-maximum suppression, which utilizes the edge
strength to suppress non-edge pixels. 4) Double
thresholding, a method which involves establishing
both high and low edge strength thresholds, which are employed to categorize the strength of edge pixels into three levels: strong, weak, and non-edges. Non-edge pixels are subsequently suppressed. 5) Hysteresis edge tracking represents the final step, employing blob analysis to further handle weak edges. Weak edges that lack a connection to a strong edge neighbour are identified as edges induced by noise or colour variation and are subsequently removed. In our work we use Canny edge detection as a core technique to extract the edge information in each colour channel of the dataset of interest.

Figure 2: A batch of 4 images. Top row: original data. Mid row: corresponding extracted edges. Bottom row: corresponding edge-enhanced data.
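As an illustration of steps 1) and 2) only, the sketch below computes the smoothed gradients with SciPy; the function name and the use of Sobel operators for the derivatives are our own assumptions, and the remaining Canny steps (non-maximum suppression, double thresholding, hysteresis) are omitted.

import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def gradient_strength_and_orientation(image, sigma=1.0):
    # Step 1: Gaussian smoothing (Eq. (3)) to suppress noise.
    smoothed = gaussian_filter(image, sigma=sigma)
    # Step 2: first-order derivatives in the horizontal and vertical
    # directions, then edge strength (Eq. (4)) and orientation (Eq. (5)).
    g_u = sobel(smoothed, axis=1)       # horizontal gradient G_u
    g_v = sobel(smoothed, axis=0)       # vertical gradient G_v
    strength = np.hypot(g_u, g_v)       # (G_u^2 + G_v^2)^0.5
    theta = np.arctan2(g_v, g_u)        # tan^-1(G_v / G_u)
    return strength, theta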
3 METHODOLOGY
3.1 Problem Definition
Let us define a labeled image dataset D = {(x_i, y_i)}, where x_i represents the i-th image and y_i represents the corresponding class label of image x_i. Each image in this dataset contains three RGB channels. The dataset is partitioned into three splits: the training split D_train = {(x_i, y_i)}, the validation split D_val = {(x_i, y_i)} and the testing split D_test = {(x_i, y_i)}. We use D_train to train a neural network that consists of a backbone f_θ and a classifier g_φ (the last layer of the network). Additionally, we denote the edge features extracted from the training data as v_i; these are merged with x_i to create the edge-enhanced data e_i.

The validation set D_val is used in order to save the model with the highest validation accuracy. Finally, we use D_test to calculate the test set classification accuracy.
3.2 Edge Enhancement
Our E² pre-processing is illustrated in Figure 1, where the leftmost image is the original data, the extracted edges obtained by Canny filtering are shown in the middle, and the rightmost image is the edge-enhanced result. We first apply normalization to the sample images, and then resize the images to ensure they all have exactly the same resolution, maintaining a consistent image size across each dataset. A Canny filter is applied to each of the RGB channels of every image to obtain the extracted edge information. The extracted edge information is then combined with the respective original image, resulting in a transformed image set. Each image in this set is labeled according to the original class labels.
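A minimal sketch of this pre-processing step is shown below. It uses the skimage Canny detector mentioned in Section 4.1 and assumes images are float arrays in [0, 1] with shape (H, W, 3); the function name and the clipping of the result are our own choices rather than part of the original implementation.

import numpy as np
from skimage.feature import canny

def edge_enhance(image, sigma=3.0):
    # Extract Canny edges from each RGB channel separately.
    edges = np.stack(
        [canny(image[..., c], sigma=sigma).astype(image.dtype)
         for c in range(image.shape[-1])],
        axis=-1,
    )
    # Merge the edge maps with the original image (Eq. (6)) and clip back
    # to the valid intensity range.
    return np.clip(image + edges, 0.0, 1.0)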
3.3 Training Phase
In each batch, we apply edge enhancement to a set of training images, as depicted in Figure 2. In this representation, the top four images represent the original data, while the bottom four belong to the edge-enhanced images, corresponding to the edge-enhanced dataset D̃ = {(x̃, ỹ)}, where x̃ is the edge-enhanced image and ỹ is the corresponding class label of image x̃. These edge-enhanced images are combined with the original training set images to create the input data for the network model. Consequently, each batch sample comprises a total of eight images, contributing to the generation of classification results.
During the forward propagation, we extract the features v_i by inputting every image x_i into the Canny edge detection function, denoted as Canny(·). These features can be expressed as follows:

    x̃_i := Canny(x_i) + x_i                                         (6)

Figure 3: The flowchart of the edge enhancement method for model training. i) Original data is edge enhanced to obtain the transformed data. ii) The original and transformed data are concatenated and then fed into the network. iii) Model training. iv) Classification results obtained on the test dataset.
During the training phase, the edge-enhanced and original images are both used to train the network by concatenating them together before feeding them to the network. Therefore, every batch now contains both the edge-enhanced images, {x̃_i}_{i=1}^{b}, and the original images, {x_i}_{i=1}^{b}, where b represents the batch size. After the concatenation we have {u_i}_{i=1}^{2b}, because the size of the new batch is 2b. We feed every image u_i from the new batch to the network f_θ to obtain the feature vector z_i that corresponds to image u_i, expressed as

    z_i := f_θ(u_i)                                                 (7)
We pass z_i to g_φ so that

    c_i := g_φ(z_i)                                                 (8)

where c_i is the vector of logits of image i. We obtain the probability of each class, p̂_i, by using the softmax activation function

    p̂_i := exp(c_i) / Σ_{j=1}^{N} exp(c_{i,j}),                      (9)

where N is the total number of classes. With the aid of equation (9), we calculate the cross-entropy loss L_ce as

    L_ce := − Σ_{i=1}^{N} y_i log(p̂_i),                              (10)

where y_i represents the one-hot encoded version of the class label y_i. Back propagation is then performed to calculate the gradient of L_ce with respect to the network weights.
Algorithm 1 summarizes the training procedure of E², and the corresponding flowchart is illustrated in Figure 3.
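A sketch of one training step of this procedure in PyTorch is given below. It assumes that model combines the backbone f_θ and the classifier g_φ and returns logits, and that edge_enhance is a per-image transform such as the sketch in Section 3.2 (the tensor/NumPy conversion around the Canny call is omitted); the helper name train_one_batch is ours.

import torch
import torch.nn.functional as F

def train_one_batch(model, optimizer, images, labels, edge_enhance):
    # images: float tensor of shape (b, 3, H, W); labels: integer class ids.
    enhanced = torch.stack([edge_enhance(img) for img in images])
    # Concatenate original and edge-enhanced images: effective batch is 2b.
    inputs = torch.cat([images, enhanced], dim=0)
    targets = torch.cat([labels, labels], dim=0)    # labels are unchanged

    logits = model(inputs)                   # Eqs. (7)-(8)
    loss = F.cross_entropy(logits, targets)  # softmax + cross-entropy, Eqs. (9)-(10)

    optimizer.zero_grad()
    loss.backward()                          # back propagation
    optimizer.step()
    return loss.item()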
3.4 Inference Stage
During the inference stage, we calculate the probability vector p̂_i for each image x_i in the test dataset D_test by embedding image x_i using the functions f_θ, g_φ and equation (9). To predict the class label ŷ_i of image x_i we employ the following formula:

    ŷ_i := arg max_{j ∈ [N]} p̂_{ij}                                  (11)

where N is the number of possible classes and the result corresponds to the position of the element with the highest class probability in the class probability vector p̂_i.

We calculate the test dataset accuracy using the following formula:

    S_c := Acc(y_i, ŷ_i)                                            (12)

where S_c is the overall classification accuracy and Acc is the accuracy function that sums the number of correct comparisons between the predicted and true class labels; we denote this sum as N_correct. To express the result as a percentage, it is divided by the total number of images and multiplied by one hundred, i.e. (N_correct / N_total) × 100.
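The inference stage can be sketched as follows in PyTorch, assuming a data loader yielding (images, labels) batches; the helper name evaluate is ours.

import torch

@torch.no_grad()
def evaluate(model, test_loader, device="cpu"):
    n_correct, n_total = 0, 0
    for images, labels in test_loader:
        logits = model(images.to(device))
        preds = logits.argmax(dim=1)        # Eq. (11): most probable class
        n_correct += (preds.cpu() == labels).sum().item()
        n_total += labels.size(0)
    # Eq. (12): correct predictions as a percentage of all test images.
    return 100.0 * n_correct / n_total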
4 EXPERIMENTS
4.1 Setup
Datasets. In our experiments two datasets are used:
1) CIFAR10, which is one of the most popular datasets in image classification research (Recht et al.,
2018). It contains a total of 60000 color images of
10 classes, each with dimensions of 32 by 32 pixels.
In our experiment we use 45000 images for training,
5000 images for validation and 10000 for testing.
2) CALTECH101: considered one of the more challenging datasets (Bansal et al., 2021), CALTECH101 features higher-resolution images than CIFAR10, as depicted in Figure 1, Figure 2, and Figure 3. This dataset contains 101 classes and includes
8,677 images. For consistency in feeding images into
the networks, we resized each image to 256 by 256
pixels and then center-cropped them to 224 by 224
pixels. Out of these images, 5,205 were used in the training set, 579 for validation, and 2,893 for testing.

Algorithm 1: Edge enhancement training procedure.
    input : training dataset D_train = {(x_i, y_i)}
    output: trained neural network model
    1:  batch size: 4
    2:  if D_train is CALTECH101 then
    3:      D_train = Resize(D_train, 256 x 256)
    4:      D_train = CenterCrop(D_train, 224 x 224)
    5:  D_train = Normalization(D_train)
    6:  while epoch not finished do
    7:      for i <= D_train.size() do
    8:          feature = Canny(x_i, sigma = 3)
    9:          x̃_i = add(x_i, feature)
    10:         Inputs = concat(x_i, x̃_i)
    11:         Model = Model.train(Inputs, D_val)
Implementation Details. Our implementation is written in Python. We utilized the skimage package
for the Canny edge detector and Pytorch for training
our neural networks.
Networks. In our work, we used three differ-
ent neural network architectures tailored to the two
datasets to investigate the robustness of our method
with different networks and datasets. 1) LeNet-5: This convolutional neural network (CNN), originally proposed by LeCun in 1998 (LeCun et al., 1998), was initially designed for handwritten character classification. Given its compatibility with 32 by 32 pixel images and its suitability for datasets with a small number of classes (10 in CIFAR10), we selected this network for the CIFAR10 dataset. 2) CNN-9: A standard neural network
model with six convolutional layers and three linear
layers, again suitable for 32 by 32 images. 3) ResNet-18: This network, proposed by (He et al., 2016), is designed for larger images of size 224 by 224 pixels. We used it for the higher-resolution CALTECH101 dataset after preprocessing the data to meet the network's input requirements.
Hyperparameters. In the experiment, the batch
size is fixed at 4, meaning 4 labeled images were en-
hanced with edge information simultaneously. We set
the standard deviation σ of the Canny edge detector to
3 and trained LeNet-5 and CNN-9 on the CIFAR10 dataset for 50 epochs. For the CALTECH101 dataset, we trained ResNet-18 for 100 epochs. The loss function used is the cross-entropy loss due to its strong performance in multi-class classification tasks (Zhou et al., 2019).

Figure 4: CIFAR10 through CNN-9. Validation accuracy per epoch for the edge-enhancement and original methods.

For weight optimization during network
training, we used Stochastic Gradient Descent (SGD)
with a momentum of 0.9, weight decay of 0.0005, and
learning rate of 0.01. We use the validation set to cal-
culate the validation accuracy and save the model with
the highest validation accuracy through comparisons
at each epoch.
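For concreteness, the optimizer and loss configuration described above can be written in PyTorch as follows; the model placeholder stands for whichever of LeNet-5, CNN-9 or ResNet-18 is being trained.

import torch

model = torch.nn.Linear(10, 10)  # placeholder; LeNet-5 / CNN-9 / ResNet-18 in the experiments
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.01,              # learning rate
    momentum=0.9,
    weight_decay=0.0005,
)
criterion = torch.nn.CrossEntropyLoss()   # cross-entropy loss, Eq. (10)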
4.2 Ablation Study
In our experimental setup, for each training dataset
and network architecture, we compared our edge en-
hancement method with the baseline method, which
involved training on the original data without any
modifications. We assessed this comparison through
two aspects: 1) Validation Accuracy Variation: We
monitored the validation accuracies for both meth-
ods throughout the training process, recording them
at each epoch and presenting the results in graph-
ical form. 2) Testing Accuracy: After identifying
the models that achieved the best validation results,
we tested them on the respective testing datasets to
obtain optimal classification accuracies, which were
then presented in a tabular format.
1) CIFAR10: Figure 4 and Figure 5 display the results
obtained from both LeNet-5 and CNN-9 networks.
In these figures, the edge enhancement method is
represented by the blue lines, and it consistently
outperforms the original method indicated by the
red lines. The validation accuracy of the edge en-
hancement method surpasses that of the original at
nearly every epoch. Under LeNet-5, the edge en-
hancement method achieves its highest accuracy at
55.74%, differing from its lowest accuracy (40.05%)
by a substantial 15.69%. In contrast, the original
method exhibits a maximum-minimum difference of
15.24%. Similarly, with CNN-9, the edge enhance-
ment method shows a maximum-minimum differ-
ence of 18.86%, compared to the original method’s
18.62%. These results clearly indicate that the edge
enhancement method consistently provides greater
accuracy improvement during the training process.
Furthermore, our results reveal a noteworthy trend:
for both network architectures within the CIFAR10
dataset, the validation accuracy variations are notably
more stable and compact when using the edge en-
hancement method. CALTECH101: In Figure 6,
we present the results obtained through ResNet-18.
Here, the edge enhancement method demonstrates a
maximum-minimum difference of 44.73%, while the
original method yields a value of 40.61%. Once
again, these findings underscore the effectiveness of
our method in achieving more substantial accuracy
improvements.
2) In Table 1, we present the testing accuracy results, where E² denotes the edge enhancement method, and 'epoch' indicates the epoch at which the optimal
model was achieved. CIFAR10: For the CIFAR10
dataset employing LeNet-5, our method stands out
with an impressive 5.7% increase in accuracy over the
original approach. Moreover, it achieves this remark-
able performance while converging 12 epochs faster
to reach an optimal model. In the case of CNN-9 ap-
plied to CIFAR10, our method still shines, delivering
a substantial 4.03% accuracy improvement. Notably,
it accomplishes this while requiring 18 fewer epochs
to converge. These outcomes underscore the signif-
icant enhancements our method brings to both net-
work architectures when working with the CIFAR10
dataset.
CALTECH101: Through ResNet-18, our method demonstrated a 1.56% improvement over the original method, requiring 30 fewer epochs to converge. To conclude, the experimental results are sufficient to show that the proposed edge enhancement method for neural network training provides a noticeable classification accuracy improvement and is more computationally efficient.
4.3 Comparisons with Transformation
Methods
To further investigate the effectiveness of our edge en-
hancement method, we conducted a comprehensive
evaluation by integrating it with various geometric
transformation techniques. Specifically, we assessed
its performance in combination with transformations
like random horizontal flipping and random image
size cropping. Our objective was to compare the out-
comes with a baseline approach that underwent the
same corresponding transformations. We refer to the
use of Random Cropping as C and Random Flipping
as F.
In this experiment, we use the PyTorch built-in transform functions and we split the comparisons within each dataset into three types: 1) exclusively applying random horizontal flipping; 2) solely implementing random image cropping; 3) simultaneously employing both random flipping and random cropping. The image classification accuracy results for each dataset are compared, along with the results obtained without any geometric transformations, as shown in Table 1.
Random Horizontal Flip. The "random horizontal
flip" operation involves flipping the input data hori-
zontally with a specified probability. In our experi-
ment, we maintained a default probability of 50 per-
cent for applying this random flipping to the input
data.
Random Image Crop. The "random image crop"
operation involves selecting a random location within
the given input data and then resizing the cropped por-
tion to a specified size. In our experiment, we adapted
the resizing process based on the dataset being used:
for CIFAR10, we resize the cropped data to 32 by 32, matching the original data size and suitable for LeNet-5 and CNN-9; for CALTECH101, since we use ResNet-18 for this dataset, we resize the cropped data to 224 by 224 to fit the network's input requirement.
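The transformation pipelines described above can be sketched with torchvision as follows; the paper states that PyTorch built-in transforms are used, and RandomHorizontalFlip and RandomResizedCrop are plausible realizations of F and C, though the exact transform classes are our assumption.

from torchvision import transforms

# F + C pipeline for CIFAR10 (LeNet-5 / CNN-9, 32 x 32 inputs).
cifar10_fc = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),   # F: flip with 50% probability
    transforms.RandomResizedCrop(32),         # C: random crop resized to 32 x 32
    transforms.ToTensor(),
])

# F + C pipeline for CALTECH101 (ResNet-18, 224 x 224 inputs).
caltech101_fc = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomResizedCrop(224),        # C: random crop resized to 224 x 224
    transforms.ToTensor(),
])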
As shown in Table 2, when applied to the CIFAR10 dataset with LeNet-5, our
edge enhancement method provides significant ac-
curacy improvements over the baseline under every
transformation scenario. These improvements range
from 3.16% to 3.97% when compared to the base-
line. Notably, even when combined with random flip-
ping, our method performs just 0.32% worse than
our method without any transformations, as detailed
in Table 1. Turning our attention to the CALTECH101
dataset with ResNet-18, Table 3 showcases a sim-
ilar trend. Our edge enhancement method consis-
tently outperforms the original approach for all se-
lected transformation combinations, with improve-
ments ranging from 0.96% to 1.97%. In this case,
both edge enhancement with random cropping and
edge enhancement with both cropping and flipping
surpass the unmodified method (with no transforma-
tions) from Table 1 in terms of performance.

Figure 5: CIFAR10 through LeNet-5. Validation accuracy per epoch for the edge-enhancement and original methods.

Figure 6: CALTECH101 through ResNet-18. Validation accuracy per epoch for the edge-enhancement and original methods.

Furthermore, when considering the impact of these transformation techniques, our results affirm the claim that
edge enhancement substantially enhances image clas-
sification accuracy. Intriguingly, for both datasets us-
ing LeNet-5 and ResNet-18, the original method with
random cropping and flipping demonstrates a faster
convergence to optimal performance compared to all
other results.
Table 1: Testing accuracy results.

CIFAR10
  METHOD     backbone    accuracy%   epoch
  Original   LeNet-5     49.26       45
  E²         LeNet-5     54.96       33
  Original   CNN-9       72.61       43
  E²         CNN-9       76.64       25

CALTECH101
  METHOD     backbone    accuracy%   epoch
  Original   ResNet-18   72.83       86
  E²         ResNet-18   74.39       56
Table 2: CIFAR10 transformation testing accuracy results.

  METHOD           backbone   accuracy%   epoch
  Original + F     LeNet-5    50.67       45
  E² + F           LeNet-5    54.64       32
  Original + C     LeNet-5    46.74       29
  E² + C           LeNet-5    49.90       17
  Original + F+C   LeNet-5    44.59       16
  E² + F+C         LeNet-5    48.54       49
Table 3: CALTECH101 transformation testing accuracy results.

  METHOD           backbone    accuracy%   epoch
  Original + F     ResNet-18   71.00       80
  E² + F           ResNet-18   71.96       53
  Original + C     ResNet-18   79.23       88
  E² + C           ResNet-18   81.20       58
  Original + C+F   ResNet-18   78.40       89
  E² + F+C         ResNet-18   79.81       92
5 CONCLUSION
In our research, we introduced a novel approach for
enhancing image classification accuracy and com-
putational efficiency. Our method, based on the
high boost filtering principle, focuses on edge en-
hancement. This technique leverages the edge infor-
mation from labeled data to transform the original
dataset, thereby increasing both the size and diver-
sity of the training samples. Through extensive ex-
periments on popular datasets such as CIFAR-10 and
CALTECH-101, using well-known neural network ar-
chitectures including LeNet-5, CNN-9, and ResNet-
18, we have demonstrated the effectiveness of our
proposed method. Our results substantiate the validity
of our hypothesis. Regarding our future research di-
rections, we are interested in exploring the impact of
data augmentation techniques in conjunction with our
current state-of-the-art approach; additionally, scal-
ing our method to larger datasets such as ImageNet
and other domains such as semi-supervised learning would
be another potential research direction. Our goal is
to further enhance classification accuracy while at the
same time improving the training efficiency, pushing
the boundaries of what is achievable in image classi-
fication.
ACKNOWLEDGEMENTS
In the loving memory of TianHao Bu’s father
BingXin Bu: 10/01/1965 - 13/09/2023.
REFERENCES
LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324.
Bansal, M., Kumar, M., Sachdeva, M., and Mittal, A.
(2021). Transfer learning for image classification us-
ing vgg19: Caltech-101 image data set. Journal of
ambient intelligence and humanized computing, pages
1–12.
Canny, J. (1986). A computational approach to edge de-
tection. IEEE Transactions on pattern analysis and
machine intelligence, (6):679–698.
Chen, X. and Wang, G. (2021). Few-shot learning by inte-
grating spatial and frequency representation. In 2021
18th Conference on Robots and Vision (CRV), pages
49–56. IEEE.
Chlap, P., Min, H., Vandenberg, N., Dowling, J., Holloway,
L., and Haworth, A. (2021). A review of medical im-
age data augmentation techniques for deep learning
applications. Journal of Medical Imaging and Radia-
tion Oncology, 65(5):545–563.
Deng, G. and Cahill, L. (1993). An adaptive gaussian fil-
ter for noise reduction and edge detection. In 1993
IEEE conference record nuclear science symposium
and medical imaging conference, pages 1615–1619.
IEEE.
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep resid-
ual learning for image recognition. In Proceedings of
the IEEE conference on computer vision and pattern
recognition, pages 770–778.
Kim, J.-H., Choo, W., and Song, H. O. (2020). Puzzle
mix: Exploiting saliency and local statistics for op-
timal mixup. In International Conference on Machine
Learning, pages 5275–5285. PMLR.
Liu, Z., Li, S., Wu, D., Liu, Z., Chen, Z., Wu, L., and Li,
S. Z. (2022). Automix: Unveiling the power of mixup
for stronger classifiers. In European Conference on
Computer Vision, pages 441–458. Springer.
Mukai, K., Kumano, S., and Yamasaki, T. (2022). Im-
proving robustness to out-of-distribution data by
frequency-based augmentation. In 2022 IEEE In-
ternational Conference on Image Processing (ICIP),
pages 3116–3120. IEEE.
Recht, B., Roelofs, R., Schmidt, L., and Shankar, V. (2018).
Do cifar-10 classifiers generalize to cifar-10? arXiv
preprint arXiv:1806.00451.
Shao, S., Wang, Y., Liu, B., Liu, W., Wang, Y., and Liu,
B. (2023). Fads: Fourier-augmentation based data-
shunting for few-shot classification. IEEE Transac-
tions on Circuits and Systems for Video Technology.
Srivastava, R., Gupta, J., Parthasarthy, H., and Srivastava,
S. (2009). Pde based unsharp masking, crispening
and high boost filtering of digital images. In Contem-
porary Computing: Second International Conference,
IC3 2009, Noida, India, August 17-19, 2009. Proceed-
ings 2, pages 8–13. Springer.
Yin, W., Wang, H., Qu, J., and Xiong, C. (2021). Batch-
mixup: Improving training by interpolating hidden
states of the entire mini-batch. In Findings of the Asso-
ciation for Computational Linguistics: ACL-IJCNLP
2021, pages 4908–4912.
Yun, S., Han, D., Oh, S. J., Chun, S., Choe, J., and Yoo,
Y. (2019). Cutmix: Regularization strategy to train
strong classifiers with localizable features. In Pro-
ceedings of the IEEE/CVF international conference
on computer vision, pages 6023–6032.
Zhang, H., Cisse, M., Dauphin, Y. N., and Lopez-Paz, D.
(2017). mixup: Beyond empirical risk minimization.
arXiv preprint arXiv:1710.09412.
Zhang, H., Zhang, L., and Jiang, Y. (2019). Overfitting
and underfitting analysis for deep learning based end-
to-end communication systems. In 2019 11th inter-
national conference on wireless communications and
signal processing (WCSP), pages 1–6. IEEE.
Zhou, Y., Wang, X., Zhang, M., Zhu, J., Zheng, R., and Wu,
Q. (2019). Mpce: a maximum probability based cross
entropy loss function for neural network classification.
IEEE Access, 7:146331–146341.