Investigation of Deep Neural Network Compression Based on Tucker

Decomposition for the Classiﬁcation of Lesions in Cavity Oral

Vitor B. L. Fernandes (1, a), Adriano B. Silva (1, b), Danilo C. Pereira (1, c), Sérgio V. Cardoso (2, d), Paulo R. de Faria (3, e), Adriano M. Loyola (2, f), Thaína A. A. Tosta (4, g), Leandro A. Neves (5, h) and Marcelo Z. do Nascimento (1, i)

1 Faculty of Computer Science, Federal University of Uberlândia, Brazil

2 Area of Oral Pathology, School of Dentistry, Federal University of Uberlândia, Brazil

3 Department of Histology and Morphology, Institute of Biomedical Science, Federal University of Uberlândia, Brazil

4 Science and Technology Institute, Federal University of São Paulo, Brazil

5 Department of Computer Science and Statistics (DCCE), São Paulo State University, Brazil

Keywords:

Oral Epithelial Dysplasia, Convolutional Neural Network, Tensors, Histological Image, Classiﬁer, Tucker

Decomposition.

Abstract:

Cancer of the oral cavity is one of the most common cancers, making it necessary to investigate lesions that could develop into cancer. Initial-stage lesions, called dysplasia, can develop into more severe stages of the disease and are characterized by variations in the shape and size of the nuclei of epithelial tissue cells. Due to advances in digital image processing and artificial intelligence, computer-aided diagnosis (CAD) systems have become a tool to help reduce the difficulties of analyzing and classifying lesions. This paper presents

an investigation of the Tucker decomposition in tensors for different CNN models to classify dysplasia in

histological images of the oral cavity. In addition to the Tucker decomposition, this study investigates the

normalization of H&E dyes on the optimized CNN models to evaluate the behavior of the architectures in the

classiﬁcation stage of dysplasia lesions. The results show that for some of the optimized models, the use of

normalization contributed to the performance of the CNNs for classifying dysplasia lesions. However, when

the features obtained from the ﬁnal layers of the CNNs associated with the machine learning algorithms were

analyzed, it was noted that the normalization process affected performance during classiﬁcation.

1 INTRODUCTION

Oral cavity cancer is a common type of cancer, accounting for almost 50% of cases in the head and neck region (Wild et al., 2020). This highlights the importance of investigating lesions that may develop into cancer. One such lesion, known as dysplasia, is characterized by changes in the shape and size of the nuclei of epithelial cells (Kumar et al., 2009).

a https://orcid.org/0009-0007-8230-8779

b https://orcid.org/0000-0001-8999-1135

c https://orcid.org/0000-0002-2694-4865

d https://orcid.org/0000-0003-1809-0617

e https://orcid.org/0000-0003-2650-3960

f https://orcid.org/0000-0001-9707-9365

g https://orcid.org/0000-0002-9291-8892

h https://orcid.org/0000-0001-8580-7054

i https://orcid.org/0000-0003-3537-0178

With advances in digital image processing and ar-

tiﬁcial intelligence, computer-aided diagnosis (CAD)

systems have become increasingly popular and have

reduced the challenges faced by healthcare profes-

sionals during tissue classiﬁcation (Belsare, 2012).

CAD systems encompass the stages of image en-

hancement, segmentation, feature extraction, and

classiﬁcation. In (Ferro et al., 2022), the authors

present the machine learning methods addressed for

the implementation of automated detection of poten-

tially malignant and malignant diseases of the oral

cavity. In recent years, these systems have adopted deep learning-based strategies, such as convolutional neural networks, to improve these stages. Despite their relevant contributions, these systems are often affected by over-parameterization. This high number of parameters can be reduced using tensor decomposition techniques applied to the kernels of the convolutional layers, aiming to reduce the total number of parameters in the network (Kim et al., 2016). The authors in (Liu and Ng, 2022) pointed out that further research is needed on convolutional neural network (CNN) model compression, in order to reduce the number of parameters while maintaining the same accuracy as the original model.

Fernandes, V., Silva, A., Pereira, D., Cardoso, S., R. de Faria, P., Loyola, A., Tosta, T., Neves, L. and Z. do Nascimento, M. Investigation of Deep Neural Network Compression Based on Tucker Decomposition for the Classification of Lesions in Cavity Oral. DOI: 10.5220/0012388700003660. Paper published under CC license (CC BY-NC-ND 4.0). In Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2024) - Volume 3: VISAPP, pages 516-523. ISBN: 978-989-758-679-8; ISSN: 2184-4321

Another factor that can reduce the performance

of classiﬁcation methods used in CAD systems is re-

lated to the pre-processing stage (Ribeiro et al., 2018).

During the process of obtaining digital histological

images, slides are stained with hematoxylin–eosin

(H&E), in which the hematoxylin dye stains acid

structures in purple and the eosin dye stains the basic

ones in pink (Celis and Romero, 2015a). This process

can present non-uniformity in the distribution of dyes

along the tissue. The use of different ﬁxatives, digiti-

zation equipment, and differences in the slide storage

are examples of factors that lead to color variations

in these images (Tosta et al., 2019a). Therefore, exploring the impact of H&E stain normalization on the classification of dysplasia in histological images by optimized CNN models remains an open challenge.

This paper presents an investigation of Tucker de-

composition in tensors of ResNet-18 and ResNet-50

CNN architectures for the classiﬁcation of dysplasias

in histological images of the oral cavity. Moreover,

this study investigates the normalization of H&E dyes

on optimized models to evaluate their behavior in the

classification stage. Finally, features were extracted from non-normalized and color-normalized images at the layer preceding the fully connected layer, to assess the impact of color normalization on classification with machine learning algorithms. Thus, the main contributions are:

• Study of the Tucker decomposition technique for

use with ResNet architecture tensors for evalua-

tion in the classiﬁcation of dysplasia lesions;

• Investigation of the impact of color normalization on dysplasia classification using ResNet models;

• Analysis of the features from the global average pooling layer, before the fully connected layer of the ResNet models, for classifying dysplasia lesions using machine learning (ML) algorithms.

2 METHODOLOGY

Figure 1 shows the sequence of steps performed to

classify dysplasia lesions on the models investigated.

All the experiments carried out in this work were con-

ducted on a machine with an AMD Ryzen 5 3600XT

processor, GeForce RTX 2070 SUPER graphics card,

and 64GB of RAM.

2.1 Image Dataset

The dataset consists of 30 H&E-stained mouse tongue tissue sections previously exposed to a carcinogen during two experiments carried out in 2009 and 2010. These experiments were approved by the Ethics Committee on the Use of Animals under protocol number 038/09 at the Federal University of Uberlândia, Brazil.

The histological slides were digitized using the

Leica DM500 optical microscope with 400× magniﬁ-

cation. A total of 66 images were obtained and stored

in the TIFF format using the RGB color model with

a resolution of 2048×1536 pixels. Using the methodology described by (Lumerman et al., 1995), the images were classified as healthy or severe oral epithelial dysplasia (OED). From the images, 74 regions of interest (ROIs) of size 450 × 250 pixels were obtained for each class. Examples

of these ROIs can be seen in Figure 2.

2.2 H&E Stain Normalization

Color normalization is a process applied at the stage

of image processing aiming to reduce possible color

variations between samples that may arise during the

digitization and staining stages (Sha et al., 2017). In

literature, several techniques are described for color

normalization, such as those proposed by (Vahadane

et al., 2015) and (Tosta et al., 2019b).

In this work, the technique proposed by (Tosta

et al., 2019b) was employed. This technique was de-

veloped speciﬁcally to normalize H&E dyed histolog-

ical images. Hematoxylin stains the nuclei with pur-

ple color and eosin colors the cytoplasm and other

extracellular structures as pink (Celis and Romero,

2015b). However, the color obtained in the images

can undergo variations depending on other factors,

such as the way the image was digitized and how the

preparation was performed (Khan et al., 2014; Sethi

et al., 2016).

The adopted approach normalizes the image col-

ors while maintaining the histological structures and

ensuring that no artifacts are introduced (Tosta et al.,

2019b). The method achieves this result with an unsu-

pervised estimate of the sparsity parameter and stain

representation.

2.3 ResNet Architecture

Figure 1: Box diagram of the stages employed for the classification of oral dysplasia tissue images.

Figure 2: Examples of oral histological tissues: (a) healthy tissue; (b) severe dysplasia.

The CNN architectures used in the experiments were ResNet models. This architecture was proposed in (He et al., 2015) and is a deep convolutional neural network model based on so-called residual learning. Residual learning consists of letting the input skip one or more layers, keeping its information intact, and then adding it to the output of the subsequent layers. This approach helps to improve the performance of networks with many layers by avoiding the degradation problem that can occur with other architectures.
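As a compact illustration of the skip connection described above, the residual block of (He et al., 2015) can be written as follows. The symbols are not used elsewhere in this paper and are introduced here only for illustration: $x$ is the block input, $\mathcal{F}$ the stacked layers being skipped, $\{W_i\}$ their weights, and $y$ the block output.

```latex
y = \mathcal{F}(x, \{W_i\}) + x
```

The layers therefore only need to learn the residual $\mathcal{F}$ with respect to the identity mapping, which is what the term residual learning refers to.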

The ResNet architecture is available in different

versions, with different numbers of layers. For this

study’s experiments, ResNet-18 and ResNet-50 mod-

els were chosen, which have 18 and 50 layers, respec-

tively. The choice of this architecture was motivated

by two reasons. The ﬁrst is that it has been widely

used in other studies of histological images, such

as the classiﬁcation of images from different body

parts (Talo, 2019), the evaluation of both ResNet-18 and ResNet-50 in the classification of colorectal cancer images (Sarwinda et al., 2021), and the use of ResNet-50 for the classification of breast cancer images (Al-Haija and Adebanjo, 2020). The second reason is its structure, which is mostly composed of convolutional layers. As decomposition techniques are applied only to convolutional layer tensors, using a network that has relatively few parameters in densely connected layers ensures that the decomposition has a larger effect on the total number of parameters.

2.4 Tucker Decomposition

A tensor is a concept used primarily in computing as a generalization of vectors and matrices to an arbitrary number of dimensions. A tensor with one dimension is typically called a vector, and a tensor with two dimensions is called a matrix. From three dimensions onward, the term used is simply tensor, and it can refer to an Nth-order tensor, with N being the number of dimensions of that tensor (Kolda and Bader, 2009). Tensors play an important role in several areas of computing, being mainly used in signal processing techniques, ML, clustering and dimensionality reduction algorithms, and data mining (Sidiropoulos et al., 2017).

A tensor of many dimensions can undergo a series

of mathematical transformations to rearrange its for-

mat and result in more than one tensor while main-

taining an approximation of the information con-

tained in the original tensor. This operation is called

low-rank approximation (Kolda and Bader, 2009).

After applying this operation, the resulting tensors

have fewer parameters than the original one, resulting

in a reduction in dimensionality and the total num-

ber of parameters. An approximation of the original

tensor can be obtained from operations applied to the

resulting tensors (Cichocki et al., 2017). The quality

of these approximations is dependent on the rank val-

ues chosen at the decomposition time and, depending

on the situation, lower than ideal values can be used

to achieve greater compression, in which the approx-

imation does not need to be too precise (Kim et al.,

2016).

There are several techniques for tensor de-

composition, with the most popular being CP-

Decomposition and Tucker Decomposition (Kolda

and Bader, 2009). The method chosen for this study was Tucker decomposition, which decomposes an n-dimensional tensor into n factor matrices and one core tensor, referred to here as the nucleus. The sizes of these factors are determined by the dimension sizes of the original tensor and by the values defined as the decomposition rank (Kim et al., 2016).

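Equation 1 defines the Tucker decomposition for a three-dimensional tensor $\mathcal{X}$, whose dimensions have sizes $I_1$, $I_2$ and $I_3$, so that $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$, with chosen rank values $R_1$, $R_2$ and $R_3$:

$$\mathcal{X} \approx \mathcal{G} \times_1 A \times_2 B \times_3 C \qquad (1)$$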

VISAPP 2024 - 19th International Conference on Computer Vision Theory and Applications


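After the decomposition, three matrices are obtained, $A \in \mathbb{R}^{I_1 \times R_1}$, $B \in \mathbb{R}^{I_2 \times R_2}$ and $C \in \mathbb{R}^{I_3 \times R_3}$, together with a three-dimensional nucleus (core tensor) $\mathcal{G} \in \mathbb{R}^{R_1 \times R_2 \times R_3}$.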

In our application, the reconstruction is not applied, since the goal is to maintain the approximation using a smaller number of parameters than the original tensor, and the reconstruction would restore the original dimensions. Even without reconstruction, performing convolutions consecutively with the new tensors gives results similar to those obtained with the original tensor. The rank values used in the experiments were defined empirically as proportions of 0.50, 0.40, 0.30, 0.20, and 0.10 of the original dimension sizes of the tensor to be decomposed. These proportions allow the evaluation of different network compression levels and of the consequences of using smaller and larger rank values.
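To make the decomposition in Equation 1 concrete, the following is a minimal NumPy sketch of a Tucker decomposition computed with a truncated higher-order SVD (HOSVD). The text does not state which algorithm or library was used to compute the decomposition, so HOSVD is an assumption made here for illustration, and the helper names (`unfold`, `mode_product`, `tucker_hosvd`, `reconstruct`) are hypothetical. The example tensor is synthetic and built to have low multilinear rank, so the approximation is essentially exact.

```python
import numpy as np


def unfold(tensor, mode):
    """Matricize `tensor` so that axis `mode` indexes the rows."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Mode-n product: multiply `matrix` along axis `mode` of `tensor`."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)


def tucker_hosvd(tensor, ranks):
    """Truncated HOSVD (assumed algorithm): returns (core, factors)."""
    factors = []
    for mode, rank in enumerate(ranks):
        # Leading left singular vectors of each unfolding give one factor.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, factor in enumerate(factors):
        core = mode_product(core, factor.T, mode)
    return core, factors


def reconstruct(core, factors):
    """Evaluate G x_1 A x_2 B x_3 C, the right-hand side of Equation 1."""
    out = core
    for mode, factor in enumerate(factors):
        out = mode_product(out, factor, mode)
    return out


# Synthetic 8 x 6 x 4 tensor with multilinear rank (4, 3, 2).
rng = np.random.default_rng(0)
true_core = rng.standard_normal((4, 3, 2))
true_factors = [rng.standard_normal(shape) for shape in [(8, 4), (6, 3), (4, 2)]]
X = reconstruct(true_core, true_factors)

ranks = (4, 3, 2)  # a rank proportion of 0.50 of each dimension
core, factors = tucker_hosvd(X, ranks)
X_hat = reconstruct(core, factors)

n_original = X.size
n_compressed = core.size + sum(f.size for f in factors)
print(core.shape, n_original, n_compressed)
```

In this toy case the core and the three factor matrices together hold fewer values than the original tensor. For real convolution kernels the approximation is not exact, which is why fine-tuning follows the decomposition.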

2.5 Machine Learning Algorithms

In these experiments, the data was extracted from the

global average pooling layer before the fully con-

nected layer and stored in feature vectors. Then,

the ML algorithms were used to evaluate the classi-

ﬁcation of feature vectors (Wright et al., 2016). At

this stage, the algorithms were implemented using the

scikit-learn machine learning library.

The decision tree (DT) algorithm is a machine

learning method that utilizes classiﬁcation rules and

analyzes data through a tree-like data structure.

This structure is often represented as a tree dia-

gram, as originally introduced by Quinlan (Quinlan,

1986). The random forest (RF) is an approach that

constructs extensive collections of random decision

trees to make predictions, as originally proposed by

Breiman (Breiman, 2001). This method involves

creating regression trees using bootstrapped samples

from a training dataset, with the additional twist of se-

lecting random features during the tree creation pro-

cess. The support vector machine (SVM) algorithm is

a machine learning model commonly employed in bi-

nary classiﬁcation tasks. It operates by mapping input

features into a multidimensional space, where it con-

structs a decision surface. Cortes and Vapnik (Cortes

and Vapnik, 1995) presented an implementation in

which it was possible to classify non-linear classes

using a larger dimensional space for data classiﬁca-

tion. Thus, the tests were performed with the SVM

using the polynomial kernel. The Naive Bayes (NB)

method is a machine learning classiﬁcation method

known for its simplicity and effectiveness. It lever-

ages Bayes’ theorem, which calculates the probability

of an event by considering prior knowledge of relevant conditions (Mitchell, 1997). In the classification context, NB predicts the likelihood that a data point belongs to a specific class based on its available features and attributes.
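The four classifiers described above can be sketched with scikit-learn as follows. This is only an illustration under stated assumptions: the feature matrix is synthetic random data standing in for the pooled CNN features, the 512-dimensional size corresponds to the ResNet-18 pooling output, the Gaussian variant of Naive Bayes is assumed because the text does not name one, and all hyperparameters other than the polynomial SVM kernel are library defaults rather than the authors' settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the feature vectors: 74 samples per class.
rng = np.random.default_rng(0)
n_per_class, n_features = 74, 512
healthy = rng.normal(0.0, 1.0, size=(n_per_class, n_features))
dysplasia = rng.normal(0.5, 1.0, size=(n_per_class, n_features))
features = np.vstack([healthy, dysplasia])
labels = np.array([0] * n_per_class + [1] * n_per_class)

classifiers = {
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(kernel="poly"),  # polynomial kernel, as stated in the text
    "NB": GaussianNB(),         # assumed variant
}

# Mean accuracy over a 10-fold cross-validation for each classifier.
mean_accuracy = {
    name: cross_val_score(clf, features, labels, cv=10, scoring="accuracy").mean()
    for name, clf in classifiers.items()
}
for name, acc in mean_accuracy.items():
    print(f"{name}: {acc:.4f}")
```

With real data, `features` would be replaced by the vectors extracted from the global average pooling layer of each network, and the accuracies printed here have no relation to the values reported in this paper.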

2.6 Experimental Evaluation

For the execution of the experiments, the image

dataset was classiﬁed in a binary way, using only im-

ages of healthy tissues and images of severe dysplasia.

The dataset was evaluated with and without normal-

ization techniques, to verify its impact on the network

accuracy after the decomposition and ﬁne-tuning pro-

cesses. Furthermore, for the classiﬁcation process,

the k-fold cross-validation was applied with k = 10.

Both the networks were trained for 500 epochs for

the two datasets (normalized and non-normalized).

The stochastic gradient descent (SGD) method was

used as an optimizer and the loss function employed

was the cross-entropy. The learning rate used in the

training stage was 0.001 for both models and datasets.

In this work, data augmentation techniques were used to contribute to the generalization of the networks. In the non-normalized image sets, the following were applied: i) random horizontal flip; ii) random vertical flip; iii) rotation (max. 40°); iv) random resized crop (0.80 to 0.90); v) auto contrast; vi) sharpness; vii) ColorJitter, adjusting brightness, contrast, and saturation (0.70 to 1.30). In the normalized dataset, the same operations and settings were applied, except ColorJitter, which was omitted so that the effect of the color normalization on the images could be evaluated.

After training the original networks, the Tucker

decomposition was employed. The decomposition

operations were applied only to the convolutional lay-

ers. Since each layer has different tensor sizes, the

choice of the rank value was made proportionally,

deﬁning a value between 0 and 1, which was multi-

plied by the original size values of each tensor dimen-

sion. If the result is not an integer, it is rounded up.

To carry out the experiments, the following rank proportions were defined: 0.50, 0.40, 0.30, 0.20, and 0.10 of the original size of each dimension in each tensor. These values were used in the ResNet-18 and ResNet-50 models.
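The proportional rank rule described above (multiply each dimension by the proportion and round non-integers up) can be sketched as follows. The function names are hypothetical. The parameter count assumes a channel-only factorization of a convolution layer into a 1×1 convolution, a k×k convolution and another 1×1 convolution, in the style of (Kim et al., 2016); the text does not state whether the spatial dimensions were also decomposed, so this layout is an assumption.

```python
import math
from fractions import Fraction


def scaled_ranks(dims, ratio):
    """Scale each dimension size by `ratio`, rounding non-integers up."""
    exact = Fraction(str(ratio))  # exact decimal, avoids float round-off
    return tuple(math.ceil(size * exact) for size in dims)


def tucker2_conv_params(c_in, c_out, k, ratio):
    """Weights of a k x k convolution after a channel-only split (assumed layout):
    1x1 conv (c_in -> r_in), k x k conv (r_in -> r_out), 1x1 conv (r_out -> c_out)."""
    r_in, r_out = scaled_ranks((c_in, c_out), ratio)
    return c_in * r_in + r_in * r_out * k * k + r_out * c_out


ranks_010 = scaled_ranks((64, 128), 0.10)  # 6.4 and 12.8 are rounded up

dense_3x3 = 64 * 64 * 3 * 3                # ordinary 3x3 layer, 64 -> 64 channels
split_3x3 = tucker2_conv_params(64, 64, 3, 0.50)

dense_1x1 = 256 * 64                       # ordinary 1x1 layer, 256 -> 64 channels
split_1x1 = tucker2_conv_params(256, 64, 1, 0.50)

print(ranks_010, dense_3x3, split_3x3, dense_1x1, split_1x1)
```

Under these assumptions the 3×3 layer shrinks while the 1×1 layer grows, so a factorized layer is not guaranteed to be smaller than the original one. Whether this effect occurs in the networks evaluated here depends on implementation details that the text does not give.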

In addition to evaluating image classification with the fully connected layers of the investigated models, the image dataset was also classified using ML algorithms, to evaluate the effects of color normalization and of compressing the convolutional layers. The models chosen for both architectures (ResNet-18 and ResNet-50) were those with a rank ratio of 0.10 of the original tensor.

After choosing the models, the histological im-

ages were inputted into the convolutional layers of


both models, and the features were extracted without

passing them through the fully connected layers and

the Softmax function. After being extracted, these

features were used to train and test the four chosen

ML algorithms: DT, RF, SVM, and NB.

For the analysis of the classification results, accuracy was used to indicate the global performance of the model, that is, the proportion of all classifications that the model got right (Martinez et al., 2003).
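For a binary problem such as the one studied here, the accuracy described above can be written in terms of the confusion matrix. The symbols below are standard but do not appear elsewhere in this paper: TP and TN are the numbers of correctly classified positive and negative samples, and FP and FN are the numbers of incorrectly classified ones.

```latex
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
```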

In the process of evaluating the optimized mod-

els, in addition to the accuracy metric, the network

weights, the total number of parameters, the number

of parameters in the convolutional layers, and the time

spent on decomposing and ﬁne-tuning the network

were also evaluated.

3 RESULTS

3.1 Evaluation of the Classiﬁcation with

the CNN Models

Tables 1 and 2 present the results obtained with the non-normalized and normalized dysplasia datasets, respectively. In Table 1, the original ResNet-18 backbone had around 11 million parameters. After the decomposition step with a rank ratio of 0.50, the number of parameters decreased to approximately 4.2 million, a reduction of around 2.66 times. With a rank ratio of 0.40, the number of parameters was approximately 3 million, equivalent to a reduction of 3.75 times. With a rank ratio of 0.30, the resulting network had under 2 million parameters, a compression rate of 5.72. With a rank ratio of 0.20, the number of parameters fell to approximately 1.15 million, which is equivalent to a network 9.70 times smaller than the original. With a rank ratio of 0.10, the resulting model had only about 566,000 parameters, 19.73 times fewer than the original network. The decomposition with a rank ratio of 0.50 maintained the accuracy of 100% on the image set. A rank ratio of 0.40 resulted in an accuracy of 85.71%. With rank ratios of 0.30 and 0.20, the accuracy was 92.85%. Finally, the Tucker decomposition with a rank ratio of 0.10 returned an accuracy of 100%.

For the ResNet-50 model (see Table 1), the initial number of parameters was approximately 23.5 million. After decomposition with a rank ratio of 0.50, the number of parameters increased to approximately 28.6 million. With a rank ratio of 0.40, the number of parameters was around 22.6 million, which means that the parameters were compressed relative to the original model. Compression was also observed at the remaining, smaller rank ratios. The original network achieved an accuracy of 100% on the set of images without normalization. This value was maintained for the rank proportions 0.50, 0.40, 0.30 and 0.10. Only for the model with a rank ratio of 0.20 was this value reduced to 85.71%.
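The compression rates quoted above follow from dividing the parameter count of the original network by that of the decomposed network. A short sketch of this arithmetic, using the parameter counts listed in Table 1 (the variable and function names are illustrative only):

```python
# Parameter counts taken from Table 1.
original = {"ResNet-18": 11_177_538, "ResNet-50": 23_512_130}
compressed = {
    "ResNet-18": {0.50: 4_205_890, 0.40: 2_978_663, 0.30: 1_953_645,
                  0.20: 1_152_867, 0.10: 566_482},
    "ResNet-50": {0.50: 28_566_722, 0.40: 22_638_820, 0.30: 17_078_229,
                  0.20: 11_933_648, 0.10: 7_200_064},
}


def compression_rate(n_original, n_compressed):
    """Ratio of original to compressed parameter counts (>1 means smaller)."""
    return n_original / n_compressed


rates = {
    model: {rank: round(compression_rate(original[model], n), 2)
            for rank, n in by_rank.items()}
    for model, by_rank in compressed.items()
}
for model, by_rank in rates.items():
    print(model, by_rank)
```

A rate below 1, as for ResNet-50 at a rank proportion of 0.50, indicates that the decomposed network has more parameters than the original one.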

Using the dataset of normalized images (see Table 2), the CNN models were decomposed with the same rank proportions defined in the Experimental Evaluation section. Regardless of the image set, the number of parameters in each compressed model was the same. In terms of accuracy, the original ResNet-18 model achieved 100% on the set of normalized images. A relevant point is that the normalized images allowed the compressed models to improve their performance at other rank values relative to the original images, reaching an accuracy of 100%. The only exception was ResNet-50 with a rank ratio of 0.40, whose accuracy (92.85%) degraded relative to the network's performance with the original images.

3.2 Investigation of CNN Features with

ML Algorithms

The compressed ResNet-50 and ResNet-18 models

with a rank of 0.10 provided relevant results with the

sets of images investigated with the smallest number

of parameters and the shortest processing time. Thus,

the features obtained from these models were evalu-

ated with ML algorithms.

Figures 3 (a) and 3 (b) show the accuracy values obtained with the ML algorithms using features from the original and compressed ResNet-18 backbones, respectively. Figure 3 (a) shows that the DT algorithm with original ResNet-18 features was not able to achieve 100% accuracy on the non-normalized dataset. However, when the images were normalized, this algorithm achieved an accuracy of 100%. For the other algorithms, performance was similar between the normalized data and the original data. Figure 3 (b) shows that some of the approaches (DT and SVM) obtained lower results when using the features from the compressed model.

Accuracy values obtained with the ML algorithms using the original ResNet-50 and the compressed ResNet-50 are presented in Figures 4 (a) and 4 (b), respectively. Likewise, some of the algorithms performed similarly when using the original CNN model (see Figure 4 (a)). However, only the NB algorithm improved its performance after the normalization process.


Table 1: Metrics achieved with convolutional neural network models for a dataset of non-normalized histological images (original image dataset).

Model      Rank      Accuracy (%)  Parameters  Compression rate  Time to decompose (s)  Time to fine-tune (s)
ResNet-18  Original  100           11,177,538  1                 -                      -
ResNet-18  0.50      100           4,205,890   2.66              7.24                   57.11
ResNet-18  0.40      85.71         2,978,663   3.75              7.01                   46.01
ResNet-18  0.30      92.85         1,953,645   5.72              6.86                   43.65
ResNet-18  0.20      92.85         1,152,867   9.70              5.98                   41.91
ResNet-18  0.10      100           566,482     19.73             3.87                   40.45
ResNet-50  Original  100           23,512,130  1                 -                      -
ResNet-50  0.50      100           28,566,722  0.82              13.08                  116.71
ResNet-50  0.40      100           22,638,820  1.04              12.41                  108.04
ResNet-50  0.30      100           17,078,229  1.38              11.66                  98.55
ResNet-50  0.20      85.71         11,933,648  1.97              10.37                  87.22
ResNet-50  0.10      100           7,200,064   3.27              6.34                   76.70

Table 2: Values obtained with the metrics for the convolutional neural network models with the dataset of normalized histological images (normalized image dataset).

Model      Rank      Accuracy (%)  Parameters  Compression rate  Time to decompose (s)  Time to fine-tune (s)
ResNet-18  Original  100           11,177,538  1                 -                      -
ResNet-18  0.50      100           4,205,890   2.66              7.19                   57.06
ResNet-18  0.40      100           2,978,663   3.75              7.29                   49.02
ResNet-18  0.30      100           1,953,645   5.72              7.11                   46.27
ResNet-18  0.20      100           1,152,867   9.70              6.01                   44.21
ResNet-18  0.10      100           566,482     19.73             3.50                   42.87
ResNet-50  Original  100           23,512,130  1                 -                      -
ResNet-50  0.50      100           28,566,722  0.82              13.18                  126.61
ResNet-50  0.40      92.85         22,638,820  1.04              12.69                  113.54
ResNet-50  0.30      100           17,078,229  1.38              11.64                  103.14
ResNet-50  0.20      100           11,933,648  1.97              10.29                  93.52
ResNet-50  0.10      100           7,200,064   3.27              6.52                   80.03

In Figure 4 (b), the features obtained from the compressed network model yielded lower results for part of the classification algorithms when the normalization process was applied to the images. The results show that, although stain normalization improves classification when applied only to CNN architectures, associating features from compressed networks with ML algorithms degraded classification performance on this dataset.

Table 3 presents a summary of the results obtained in comparison with the outcomes achieved by relevant image processing techniques developed for the examination of histopathological images of dysplasia of the oral cavity. The results show that the investigated approach contributes to the classification of histological lesions of the oral cavity by providing a reduced CNN model that can assist specialists in the diagnostic process.

Table 3: Evaluation between proposed systems and classification methods in the literature with oral tissue dataset.

Study                 Feature extraction  Classifier  Accuracy (%)
(Adel et al., 2018)   ORB                 SVM         92.6
(Silva et al., 2022)  CNN features        HOP         98.0
(Deif et al., 2022)   Learning feature    XGBoost     96.3
(Neves et al., 2023)  Learning feature    CNN         97.9
Proposed Approach     CNN feature         Softmax     100


Figure 3: Accuracy obtained with CNN features and ML algorithms: (a) features obtained with the original ResNet-18 model; (b) features extracted from the compressed ResNet-18 model.

Figure 4: Comparison of accuracy between the ML algorithms: (a) features obtained with the original ResNet-50 backbone; (b) features obtained with the compressed ResNet-50 model.

4 CONCLUSIONS

This work evaluated compressed ResNet architectures

obtained by using Tucker decomposition on convo-

lutional layer kernels. Furthermore, this work eval-

uated the impact on the accuracy of the networks

when using the color normalization method proposed

by (Tosta et al., 2019b) on the investigated dataset.

As shown in Tables 1 and 2, the networks that were

trained using our dataset achieved good accuracy,

even when decomposed using very small values for

the rank proportion, which resulted in signiﬁcant

compression of the networks, drastically reducing the

total number of the parameters. These results were es-

pecially positive when combined with color normal-

ization.

In this study, the classification was also evaluated with classic ML algorithms using the features extracted from the compressed networks. The results, shown in Figures 3 and 4, indicate that the algorithms maintain good performance in both networks when used to classify non-normalized images. However, when color normalization was applied, the algorithms showed a drop in accuracy, regardless of the architecture used for feature extraction. Future work will investigate other compression approaches and CNN model architectures to evaluate the classification of histological lesions.

ACKNOWLEDGMENT

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001. The authors gratefully acknowledge the financial support of National Council for Scientific and Technological Development CNPq (Grants #313643/2021-0, #311404/2021-9 and #307318/2022-2), the State of Minas Gerais Research Foundation - FAPEMIG (Grant #APQ-00578-18 and Grant #APQ-01129-21) and São Paulo Research Foundation - FAPESP (Grant #2022/03020-1).

REFERENCES

Adel, D., Mounir, J., El-Shafey, M., Eldin, Y. A., El Masry,

N., AbdelRaouf, A., and Abd Elhamid, I. S. (2018).

Oral epithelial dysplasia computer aided diagnostic

approach. In 2018 13th International Conference on

Computer Engineering and Systems (ICCES), pages

313–318. IEEE.

Al-Haija, Q. A. and Adebanjo, A. (2020). Breast cancer

diagnosis in histopathological images using resnet-50

convolutional neural network. In 2020 IEEE Interna-

tional IOT, Electronics and Mechatronics Conference

(IEMTRONICS), pages 1–7.

Belsare, A. (2012). Histopathological image analysis using

image processing techniques: An overview. Signal &

Image Processing : An International Journal, 3:23–

36.

Breiman, L. (2001). Random forests. Machine Learning,

45(1):5–32.

Celis, R. and Romero, E. (2015a). Unsupervised color nor-

malisation for h and e stained histopathology image

analysis. In 11th International Symposium on Medical

Information Processing and Analysis, volume 9681,

pages 16–22. SPIE.

Celis, R. and Romero, E. (2015b). Unsupervised color normalisation for h and e stained histopathology image analysis. volume 9681.


Cichocki, A., Phan, A.-H., Zhao, Q., Lee, N., Oseledets,

I., Sugiyama, M., and Mandic, D. P. (2017). Tensor

networks for dimensionality reduction and large-scale

optimization: Part 2 applications and future perspec-

tives. Foundations and Trends® in Machine Learning,

9(6):431–673.

Cortes, C. and Vapnik, V. (1995). Support-vector networks.

Machine Learning, 20(3):273–297.

Deif, M. A., Attar, H., Amer, A., Elhaty, I. A., Khosravi,

M. R., Solyman, A. A., et al. (2022). Diagnosis of oral

squamous cell carcinoma using deep neural networks

and binary particle swarm optimization on histopatho-

logical images: an aiomt approach. Computational

Intelligence and Neuroscience, 2022.

Ferro, A., Kotecha, S., and Fan, K. (2022). Machine learn-

ing in point-of-care automated classiﬁcation of oral

potentially malignant and malignant disorders: a sys-

tematic review and meta-analysis. Scientiﬁc Reports,

12(1):13797.

He, K., Zhang, X., Ren, S., and Sun, J. (2015). Deep resid-

ual learning for image recognition.

Khan, A. M., Rajpoot, N., Treanor, D., and Magee, D.

(2014). A nonlinear mapping approach to stain

normalization in digital histopathology images using

image-specific color deconvolution. IEEE Transactions on Biomedical Engineering, 61(6):1729–1738.

Kim, Y.-D., Park, E., Yoo, S., Choi, T., Yang, L., and Shin,

D. (2016). Compression of deep convolutional neural

networks for fast and low power mobile applications.

Kolda, T. G. and Bader, B. W. (2009). Tensor decompositions and applications. SIAM Review, 51(3):455–500.

Kumar, V., Abbas, A., Fausto, N., and Aster, J. (2009). Robbins & Cotran Pathologic Basis of Disease E-Book. Robbins Pathology. Elsevier Health Sciences.

Liu, Y. and Ng, M. K. (2022). Deep neural network compression by Tucker decomposition with nonlinear response. Knowledge-Based Systems, 241:108171.

Lumerman, H., Freedman, P., and Kerpel, S. (1995). Oral epithelial dysplasia and the development of invasive squamous cell carcinoma. Oral Surgery, Oral Medicine, Oral Pathology, Oral Radiology, and Endodontology, 79(3):321–329.

Martinez, E. Z., Neto, F. L., and de Bragança Pereira, B. (2003). A curva ROC para testes diagnósticos.

Mitchell, T. M. (1997). Machine learning.

Neves, L. A., Martinez, J. M. C., Longo, L. H. d. C., Roberto, G. F., Tosta, T. A. A., Faria, P. R. d., Loyola, A. M., Cardoso, S. V., Silva, A. B., Nascimento, M. Z. d., et al. (2023). Classification of H&E images via CNN models with XAI approaches, DeepDream representations and multiple classifiers. In Proceedings.

Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1):81–106.

Ribeiro, M. G., Neves, L. A., Roberto, G. F., Tosta, T. A., Martins, A. S., and Do Nascimento, M. Z. (2018). Analysis of the influence of color normalization in the classification of non-Hodgkin lymphoma images. In 2018 31st SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), pages 369–376. IEEE.

Sarwinda, D., Paradisa, R. H., Bustamam, A., and Anggia, P. (2021). Deep learning in image classification using residual network (ResNet) variants for detection of colorectal cancer. Procedia Computer Science, 179:423–431. 5th International Conference on Computer Science and Computational Intelligence 2020.

Sethi, A., Sha, L., Vahadane, A. R., Deaton, R. J., Kumar, N., Macias, V., and Gann, P. H. (2016). Empirical comparison of color normalization methods for epithelial-stromal classification in H and E images. Journal of Pathology Informatics, 7(1):17.

Sha, L., Schonfeld, D., and Sethi, A. (2017). Color normalization of histology slides using graph regularized sparse NMF. In Gurcan, M. N. and Tomaszewski, J. E., editors, Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, volume 10140, page 1014010.

Sidiropoulos, N. D., De Lathauwer, L., Fu, X., Huang, K., Papalexakis, E. E., and Faloutsos, C. (2017). Tensor decomposition for signal processing and machine learning. IEEE Transactions on Signal Processing, 65(13):3551–3582.

Silva, A. B., De Oliveira, C. I., Pereira, D. C., Tosta, T. A., Martins, A. S., Loyola, A. M., Cardoso, S. V., De Faria, P. R., Neves, L. A., and Do Nascimento, M. Z. (2022). Assessment of the association of deep features with a polynomial algorithm for automated oral epithelial dysplasia grading. In 2022 35th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), volume 1, pages 264–269. IEEE.

Talo, M. (2019). Convolutional neural networks for multi-class histopathology image classification. ArXiv, abs/1903.10035.

Tosta, T. A. A., de Faria, P. R., Neves, L. A., and do Nascimento, M. Z. (2019a). Computational normalization of H&E-stained histological images: Progress, challenges and future potential. Artificial Intelligence in Medicine, 95:118–132.

Tosta, T. A. A., de Faria, P. R., Servato, J. P. S., Neves, L. A., Roberto, G. F., Martins, A. S., and do Nascimento, M. Z. (2019b). Unsupervised method for normalization of hematoxylin-eosin stain in histological images. Computerized Medical Imaging and Graphics, 77:101646.

Vahadane, A., Peng, T., Albarqouni, S., Baust, M., Steiger, K., Schlitter, A. M., Sethi, A., Esposito, I., and Navab, N. (2015). Structure-preserved color normalization for histological images. In 2015 IEEE 12th International Symposium on Biomedical Imaging (ISBI), pages 1012–1015.

Wild, C., Stewart, B., Weiderpass, E., International Agency for Research on Cancer, and World Health Organization (2020). World Cancer Report: Cancer Research for Cancer Prevention. International Agency for Research on Cancer.

Wright, M. N., Ziegler, A., and König, I. R. (2016). Do little interactions get lost in dark random forests? BMC Bioinformatics, 17(1):145.
