Automated Defect Detection in Ceramic Tiles Using Transfer Learning Models

Shanthakumari R, Mamtha B, Mohamed Haarith J, Jaswanth J and Varadhaganapathy S

Department of Information Technology and Engineering, Kongu Engineering College, Tamilnadu, India

Keywords: Transfer Learning, Convolutional Neural Network, VGG16, MobileNetV2, AlexNet, Defect Detection and Classification, Ceramic Tiles.

Abstract: The ceramic tile industry faces significant challenges, particularly in developing countries that still rely on outdated technologies. The complex nature of the ceramic tile production process often results in surface defects in the final products. Traditionally, the classification and grading of these products rely on human inspection, which can cause errors and inconsistencies. This is especially concerning when tiles are used in heritage buildings such as the Taj Mahal, where defects could compromise the beauty and integrity of such ancient structures. It is therefore essential to implement an automated defect detection and classification system to ensure that only high-quality tiles are used. In this paper, we propose a model using well-established Convolutional Neural Networks (CNNs), which have been employed to detect and classify surface defects in ceramic tiles, achieving superior performance. Using these advanced models ensures the tiles are thoroughly inspected before use, preventing any potential damage to important heritage sites. The results demonstrate the effectiveness of this approach in surpassing existing accuracy benchmarks, offering a reliable solution for the ceramic tile industry.

1 INTRODUCTION

In manufacturing, defect detection is crucial for

ensuring product quality and maintaining efficient

manufacturing processes. Early detection allows for

corrective actions, such as replacing machine tools or

performing maintenance, to maintain process

performance and reduce material waste. Defect

detection typically precedes machine maintenance

diagnostics and determines whether a product from a

process or vendor should be accepted or rejected.

Traditionally, this relied on manual inspection, but

with increasing automation in manufacturing,

automated defect detection systems have become

essential.

One common approach involves analyzing surface images to identify defects. Extensive research has applied traditional image processing techniques such as edge detection, grayscale thresholding, and image segmentation, which work well when defect patterns are continuous and contrast with the background.

However, the ceramic tile industry, especially in developing countries, faces challenges due to outdated technologies and reliance on manual inspection. Many manufacturers struggle with quality control, leading to defective products and misclassified tiles. Worker

fatigue and subjective judgment further exacerbate

these issues. Addressing these challenges is critical to

improving quality control and ensuring accurate

defect detection, particularly in high-stakes

applications like heritage sites and legacy buildings.

2 LITERATURE REVIEW

Image processing is widely applied for defect detection and classification in production. Karimi and Asemani (Elbehiery, Hefnawy, et al., 2007) divided the approaches to defect detection and classification into four main categories: filtering methods, structural algorithms, model-based methods, and statistical methods. Within the filtering category, neural networks are commonly used (Wan, Fang, et al., 2022). Surface defect detection for ceramic tiles is an important academic topic, and numerous related studies have emerged. Zhang et al. (Lu, Lin, et al., 2022) designed and compared three detection algorithms, achieving defect segmentation for patterned ceramic tiles through threshold-based, adaptive morphology, and wavelet transform fusion methods. Zhang et al. (Hocenski, Vasilic, et al., 2006) used an improved SSR algorithm, saliency detection, and auxiliary localization to identify surface defects on ceramic tiles with complex textures. Casagrande et al. (Vasilic, Afshar, et al., 2017) compared feature extraction methods, selecting fractal texture analysis and the discrete wavelet transform, optimized the parameters with a genetic algorithm, and used a classifier for defect judgment. Hanzaei S H et al. (Karimi, Mishra, et al., 2024) used a local variance rotation-invariant measure operator for defect edge extraction and support vector machines for defect type recognition.

R, S., B, M., J, M. H., J, J. and S, V.
Automated Defect Detection in Ceramic Tiles Using Transfer Learning Models.
DOI: 10.5220/0013585400004664
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 3rd International Conference on Futuristic Technology (INCOFT 2025) - Volume 1, pages 771-777
ISBN: 978-989-758-763-4

These algorithms are all based on preprocessing the image to remove noise and then using relevant operators to extract or enhance defect region information. They work well when there is an obvious contrast between defects and background, but when the defect is small or the background information causes strong interference, their performance can be poor. At the same time, these algorithms only achieve defect region extraction, whereas in actual production, improving efficiency by grading products according to defect size and quantity is of great significance. Another study (Dong, Pan, et al., 2024) proposed an unsupervised learning-based surface defect detection method, which used an autoencoder and a clustering algorithm to extract and classify features from images, and then applied morphological operations and connected-domain analysis to locate and segment defect regions. This method does not require labeled data and can adaptively handle various types and textures of ceramic tile surfaces, but it may not be able to effectively detect complex and small defects. Wang et al. present the N-DSCD algorithm, which combines traditional detection methods with a DCNN. This approach lowers false detection rates and improves system practicality through a reference image library and synchronized comparisons. However, maintaining a large reference image library raises storage and computational costs. Wan et al. proposed a deep learning method for ceramic tile surface defect detection based on a modified YOLOv5 network and a data augmentation technique. Their method can effectively detect different types of defects, such as cracks, holes, stains, and scratches, on different types of tiles, such as glazed, polished, and matte tiles. However, their method may not be able to handle complex and diverse backgrounds and may require more training data and computational resources. Hocenski et al. present an approach based on moving averages with local differences. It can detect only a limited subset of defects, those with high contrast to the surrounding region of the tile. As a more general tool for defect detection in ceramic tiles, a few FE strategies have also been proposed.

3 PROPOSED METHODOLOGY

In this work, we present an intelligent defect detection system for the ceramic tile industry that uses a hybrid deep learning model to identify crack spots as defects on the production line. The model is trained using 12,483 images of ceramic tiles, with 9,988 images used for validation and 2,495 images used for training. We use three deep learning models, AlexNet, VGG16, and MobileNetV2, each contributing specific strengths to improve crack detection accuracy.

The AlexNet model consists of 5 convolutional layers, 3 max-pooling layers, 2 normalization layers, 2 fully connected layers, and 1 SoftMax layer. The convolutional layers are responsible for feature extraction, where filters scan the input images to detect patterns such as cracks. Each convolutional layer uses a ReLU (Rectified Linear Unit) activation function, which introduces non-linearity to help the model learn complex features. The fully connected layers combine the extracted features for classification, with the SoftMax layer outputting whether a tile is defective or not.

The VGG16 model, also referred to as VGGNet, is a 16-layer convolutional neural network that contains 13 convolutional layers and 3 fully connected layers. Its deep structure and consistent use of convolutional layers make it highly effective for extracting relevant features.

MobileNetV2, a lightweight convolutional neural network, is designed specifically for mobile and embedded vision applications. It uses an efficient architecture with depthwise separable convolutions, which notably reduces the number of parameters without compromising accuracy (Fig. 1). This makes MobileNetV2 ideal for real-time detection of defects, including cracks, in resource-constrained environments like manufacturing lines. Its optimized design ensures that it can handle the demands of live defect detection.

By leveraging the strengths of AlexNet, VGG16, and MobileNetV2 in a hybrid approach, the model enhances feature extraction and improves the accuracy of detecting cracks in ceramic tiles. The hybrid method combines the best aspects of each model, ensuring high performance in the detection system. Moreover, transfer learning is used with pre-trained networks to enhance performance even further. This intelligent defect detection system contributes to improved quality control in the ceramic tile production process by reliably identifying defects like cracks in real time.

Figure 1: Architecture of Proposed Methodology
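The parameter saving attributed to depthwise separable convolutions in the methodology can be checked with simple arithmetic. The sketch below compares the weight count of a standard convolution with that of a depthwise convolution followed by a 1x1 pointwise convolution. The function names and the layer sizes (3x3 kernel, 32 input channels, 64 output channels) are illustrative assumptions and are not taken from the paper; bias terms are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1x1 pointwise one."""
    depthwise = kernel * kernel * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out             # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes chosen only for illustration.
standard = standard_conv_params(3, 32, 64)
separable = depthwise_separable_params(3, 32, 64)
print(standard, separable, round(standard / separable, 1))  # 18432 2336 7.9
```

For this example layer the separable form needs roughly one eighth of the weights, which is the kind of reduction that makes the architecture attractive on constrained hardware.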

3.1 Common Surface Defects of Ceramic Tiles

The ceramic tile manufacturing process is intricate and involves multiple stages, each of which plays an important role in shaping the final product's quality. The stages typically include raw material preparation, mixing, grinding, spray drying, forming, drying, glazing, firing, classification, and packaging. As ceramic tiles pass through these stages, there is a risk of defects emerging, particularly during sensitive processes like firing and glazing. Among the numerous defects that can appear, two are notably common and

significantly affect both the tile’s structural integrity

and aesthetic appeal:

3.1.1 Crack Defect

One of the most common and obvious flaws in

ceramic tiles is cracking. These flaws show up as

cracks or fissures that are evident on the tile's

surface. Cracked tiles are generally considered

unsuitable for sale and may need to be discarded or

recycled.

3.1.2 Spot Imperfection

Spot imperfections refer to the presence of

discoloured, uneven, or raised spots on the surface of

ceramic tiles. Some spots may be small and blend in

with the tile’s pattern, while others can be large and

starkly visible, making the tile unsuitable for high-quality finishes. Figures 2 and 3 show these two common surface defects.

3.2 Image Augmentation and Preprocessing

In the context of detecting defects in ceramic tiles,

image augmentation and preprocessing play vital

roles in preparing high-quality images for training

deep learning models. Here’s how these processes

are applied specifically to ceramic tile defect

detection:

3.2.1 Data Preprocessing

Image preprocessing ensures that all images used for

training the model are clean, consistent, and ready for

feature extraction. By improving the contrast of the

pictures, histogram equalization makes flaws simpler

to see and identify. These preprocessing steps improve the quality of the input data, helping the model perform more efficiently in identifying tile defects during production.
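As a rough illustration of the histogram equalization step, the following sketch remaps the intensities of a grayscale image through its cumulative distribution so that a narrow band of values is stretched over the full 0-255 range. It is a minimal pure-Python version written for clarity; the function name and the tiny example patch are assumptions, and the paper does not state which implementation was used.

```python
def equalize_histogram(image, levels=256):
    """Histogram-equalize a grayscale image given as a list of rows of ints."""
    pixels = [p for row in image for p in row]
    total = len(pixels)

    # Histogram of intensity values.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution function (CDF) of the intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    # Smallest non-zero CDF value, used so the darkest pixel maps to 0.
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        return [row[:] for row in image]  # constant image: nothing to stretch

    scale = (levels - 1) / (total - cdf_min)
    lookup = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lookup[p] for p in row] for row in image]


# Hypothetical low-contrast 2x4 patch with values crowded between 100 and 103.
patch = [[100, 100, 101, 101],
         [102, 102, 103, 103]]
print(equalize_histogram(patch))  # [[0, 0, 85, 85], [170, 170, 255, 255]]
```

The four nearly identical grey levels end up spread evenly across the whole range, which is why faint cracks and spots become easier to separate from the tile background.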

Figure 2: Crack

Figure 3: Spot

3.2.2 Data Augmentation

Image augmentation in ceramic tile defect detection

involves transforming the original images to create a

more diverse and comprehensive dataset. Translation

shifts the image slightly, making the model robust to

minor changes in tile positioning. Colour jittering

introduces slight colour variations to mimic glazing

inconsistencies, and adding noise to images teaches

the model to focus on relevant defects rather than

small artifacts or noise.
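The three augmentations described above (translation, colour jittering, and added noise) can be sketched in a few lines. The version below works on small nested-list images so that it stays self-contained; the function names, the shift and noise magnitudes, and the example tile are all illustrative assumptions, not the settings used in the paper.

```python
import random


def translate(image, dx, dy, fill=0):
    """Shift a grayscale image (list of rows) by dx columns and dy rows."""
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                shifted[ny][nx] = image[y][x]
    return shifted


def clip(value):
    """Keep a pixel value inside the 8-bit range."""
    return max(0, min(255, int(round(value))))


def colour_jitter(rgb_image, rng, max_shift=10):
    """Add one small random offset per colour channel to an RGB image."""
    offsets = [rng.uniform(-max_shift, max_shift) for _ in range(3)]
    return [[tuple(clip(c + o) for c, o in zip(pixel, offsets)) for pixel in row]
            for row in rgb_image]


def add_gaussian_noise(image, rng, sigma=5.0):
    """Add zero-mean Gaussian noise to every pixel of a grayscale image."""
    return [[clip(p + rng.gauss(0.0, sigma)) for p in row] for row in image]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
tile = [[10, 20, 30],
        [40, 50, 60],
        [70, 80, 90]]
print(translate(tile, dx=1, dy=0))  # [[0, 10, 20], [0, 40, 50], [0, 70, 80]]
```

In practice each training image would be passed through a random combination of these transforms, so the network sees the same defect at slightly different positions, tints, and noise levels.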

3.3 Adopting Transfer Learning Through Pre-Trained Network Models

After the features are preprocessed using convolutional techniques, the accuracy of the models is thoroughly tested. To achieve improved and accurate results, the initial model undergoes fine-tuning via transfer learning. Transfer learning enables integrating pre-trained deep neural network models, such as VGG-16, AlexNet, and MobileNetV2, with newly created models for effective feature extraction. This approach significantly reduces generalization error and simplifies the preprocessing of the dataset. In this work, the output from a layer preceding the final output layer of the pre-trained network is incorporated into the newly designed deep learning model, working as a new integrated feature extractor. Before feature extraction, the input image samples must be resized to match the required format of the pre-trained network models, specifically 224x224 pixels for VGG models. Once the features are extracted, the output layer of the model detects and classifies defective tiles, from which the loss and accuracy metrics are calculated. This setup facilitates the real-time identification of defective tiles on the production line, ensuring efficient quality control and improved operational effectiveness in ceramic tile manufacturing.

MobileNetV2 is designed with a focus on efficiency and low latency, making it especially suitable for embedded applications. Its architecture uses depthwise separable convolutions, which significantly reduce the number of parameters while maintaining high accuracy in detecting defects such as cracks and spots in ceramic tiles. After resizing, the model extracts features from the images, and the output layer classifies the tiles, calculating loss and accuracy metrics to evaluate performance.

For AlexNet, the resizing requirement remains constant at 224x224 pixels for input images. AlexNet's architecture comprises multiple convolutional layers that effectively capture complex features from the tile images. The model's design incorporates ReLU activation functions, pooling layers, and a SoftMax layer for classification, enabling the detection of defects with high accuracy. After feature extraction, the output layer detects and classifies defective tiles, computing the corresponding loss and accuracy metrics to monitor the model's performance.
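The pattern described in this section, a frozen pre-trained feature extractor followed by a newly trained output layer, can be illustrated with a deliberately small toy. In the sketch below, `frozen_features` is a hypothetical stand-in for a pre-trained network (it is a fixed hand-written function, not a real CNN), and only the logistic-regression head is updated during training. The data, names, and hyperparameters are assumptions made for illustration and do not reproduce the paper's Keras setup.

```python
import math


def frozen_features(image):
    """Stand-in for a frozen pre-trained extractor: its behaviour is never updated.

    Returns one feature, the intensity range of the image scaled to [0, 1].
    """
    pixels = [p for row in image for p in row]
    return [(max(pixels) - min(pixels)) / 255.0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(features, labels, epochs=200, lr=1.0):
    """Train only the new logistic-regression head with gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            grad_w = [g + error * xi for g, xi in zip(grad_w, x)]
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(image, weights, bias):
    """Return 1 (defective) or 0 (good) for one image."""
    x = frozen_features(image)
    return int(sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) > 0.5)


# Hypothetical toy data: uniform tiles are good, tiles with a dark streak are defective.
good = [[[200] * 4 for _ in range(4)], [[180] * 4 for _ in range(4)]]
cracked = [[[200, 200, 0, 200] for _ in range(4)], [[180, 10, 180, 180] for _ in range(4)]]
images, labels = good + cracked, [0, 0, 1, 1]

features = [frozen_features(img) for img in images]   # extractor is applied, not trained
weights, bias = train_head(features, labels)
print([predict(img, weights, bias) for img in images])
```

With a real pre-trained network the extractor would output hundreds or thousands of features per image, but the division of labour is the same: the extractor's weights stay fixed and only the small classification head is fitted to the tile data.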

3.3.1 AlexNet

The layers that make up AlexNet are one SoftMax layer, 3 max-pooling layers, 2 fully connected layers, 5 convolution layers, and 2 normalization layers. Each convolution layer consists of a convolution filter followed by a non-linear activation function called "ReLU". The pooling layers perform max pooling, and because fully connected layers are present, the input size is fixed.

Figure 4: AlexNet Architecture

The architecture of AlexNet starts with an input image size of 227x227x3. The first layer is a convolutional layer with 96 filters of size 11x11 and a stride of 4. The activation function used in this layer is ReLU, producing an output feature map of 55x55x96. The next layer applies max-pooling with a filter size of 3x3 and a stride of 2, reducing the feature map to 27x27x96. The second convolution operation then applies 256 5x5 filters with a stride of 1 and ReLU activation to the pooled output. The resulting feature map is 27x27x256. Applying an additional max-pooling layer with a 3x3 filter size and a stride of 2 results in a feature map that is 13x13x256. Using 384 3x3 filters, a stride of 1, and ReLU activation, the third convolution layer generates a 13x13x384 feature map. Using ReLU activation once again, the fourth convolution operation preserves the 13x13x384 feature map size with 384 filters of size 3x3. Using 256 3x3 filters with a stride of 1 and ReLU activation, the fifth convolution layer produces a 13x13x256 feature map. A final max-pooling layer is then applied with a filter size of 3x3 and a stride of 2, reducing the feature map to 6x6x256. The output is flattened and processed by fully connected (FC) layers following the convolutional and pooling layers. The first FC layer has 9216 units with ReLU activation, followed by two more FC layers, each with 4096 units and ReLU activations. The input image is classified into one of 1,000 categories using a softmax activation function in the final output layer, which has 1,000 units in total.

• Output size = ((Input size - Filter size) / Stride) + 1
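The formula above can be applied layer by layer to reproduce the AlexNet feature map sizes quoted in this section. The sketch below adds an optional padding term, which is an assumption on top of the formula as written: the layers that keep the spatial size unchanged (for example 27x27 in and 27x27 out with a 5x5 filter) only work out if the input is zero-padded. The padding values are those of the commonly published AlexNet configuration rather than figures stated in the text.

```python
def output_size(input_size, filter_size, stride, padding=0):
    """Spatial output size of a convolution or pooling layer.

    With padding=0 this is the formula above; the padding term is an
    added assumption needed for the layers that preserve spatial size.
    """
    return (input_size + 2 * padding - filter_size) // stride + 1


# (name, filter size, stride, padding); padding values are assumptions taken
# from the commonly published AlexNet configuration, not from the text above.
layers = [
    ("conv1", 11, 4, 0),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool3", 3, 2, 0),
]

size = 227
sizes = {}
for name, filter_size, stride, padding in layers:
    size = output_size(size, filter_size, stride, padding)
    sizes[name] = size
    print(f"{name}: {size}x{size}")

print("flattened units:", size * size * 256)  # 6 * 6 * 256 = 9216
```

The final value matches the 9216 inputs of the first fully connected layer, which is a quick consistency check on the whole chain of sizes.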

3.3.2 VGG16

The convolutional neural network model known as the VGG model, or VGGNet, is also referred to as VGG16 in its 16-layer form, which comprises 13 convolutional layers and 3 fully connected layers. VGG-16 is renowned for its effectiveness and ease of use, as well as for its versatility in handling a range of computer vision applications, including object recognition and image classification. The model is designed as a series of convolutional layers interleaved with max-pooling layers, growing gradually deeper.

Figure 5: VGG16 Architecture

Images of 224x224 pixels can be input into the VGGNet. To keep the input size for the ImageNet competition consistent, the model creators cropped the center 224x224 patch from each image. The convolutional layers of VGG use 3x3 filters, the smallest possible receptive field that still captures left-to-right and up-to-down movement. In addition, 1x1 convolution filters are used to transform the input linearly. Each convolution is followed by a ReLU unit, an important improvement introduced by AlexNet that shortens training times. The piecewise linear function called the Rectified Linear Unit activation function, or ReLU, outputs the input if the input is positive and returns zero otherwise. To preserve the spatial resolution after convolution, the convolution stride, which is the number of pixels the filter shifts over the input, is set at 1. ReLU is the activation function used by all of the VGG network's hidden layers. With VGG, local response normalization (LRN) is generally avoided because it lengthens training times and uses more memory. Furthermore, it does not improve accuracy overall. The VGGNet includes three fully connected layers. While the third layer contains 1,000 channels, one channel for each class, the first two each have 4096 channels.
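The layer counts and sizes in this description can be cross-checked by walking through the network layout. The sketch below uses the widely published VGG16 channel configuration, which is an assumption here because the text gives only the totals (13 convolutional and 3 fully connected layers); it also includes the ReLU rule exactly as described. The names `VGG16_LAYOUT`, `relu`, and `trace_vgg16` are illustrative.

```python
# Channel counts for each convolution; "M" marks a 2x2 max-pooling layer with
# stride 2. This is the widely published VGG16 layout, listed here as an
# assumption because the text gives only the layer totals.
VGG16_LAYOUT = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                512, 512, 512, "M", 512, 512, 512, "M"]


def relu(x):
    """Return x when it is positive and zero otherwise."""
    return x if x > 0 else 0


def trace_vgg16(input_size=224):
    """Follow the spatial size and channel count through the layout."""
    size, channels, conv_layers = input_size, 3, 0
    for item in VGG16_LAYOUT:
        if item == "M":
            size //= 2          # pooling halves height and width
        else:
            channels = item     # 3x3 conv, stride 1, padding 1 keeps the size
            conv_layers += 1
    return size, channels, conv_layers


size, channels, conv_layers = trace_vgg16()
print(conv_layers, "conv layers ->", f"{size}x{size}x{channels}")  # 13 conv layers -> 7x7x512
print("weight layers:", conv_layers + 3)   # plus three fully connected layers -> 16
```

Because every convolution keeps the spatial size and only the five pooling layers halve it, a 224x224 input ends up as a 7x7x512 volume before the fully connected layers.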

3.3.3 MobileNetV2

A pre-trained model is a network that has already been trained on a large dataset and saved, which allows you to use it to customise your own model cheaply and efficiently. MobileNetV2, a lightweight convolutional neural network (CNN) architecture, is intended mainly for embedded and mobile vision applications. It was created by Google researchers as an improvement on the original MobileNet model. Another notable characteristic is the model's ability to balance model size and accuracy effectively, which makes it well suited to devices with limited resources.

Figure 6: MobileNetV2 Architecture

The input image size for the architecture is 224x224x3. The first layer is a convolutional layer with 32 filters of size 3x3 and a stride of 2, producing an output feature map of 112x112x32 using the ReLU activation function. This is followed by several bottleneck layers: the second layer has 16 filters with a 1x1 kernel and a stride of 1, giving a feature map of 112x112x16. The third layer applies 24 filters with a 3x3 kernel and a stride of 2, reducing the feature map to 56x56x24, while the fourth layer has 24 filters with a 3x3 kernel and a stride of 1, keeping the size at 56x56x24. Subsequent bottleneck layers continue this pattern: the fifth layer has 32 filters with a 3x3 kernel and a stride of 2 for an output of 28x28x32; the sixth and seventh layers apply 32 filters each with the same kernel size and a stride of 1, keeping the size at 28x28x32. The eighth layer introduces 64 filters with a 3x3 kernel and a stride of 2, producing a feature map of 14x14x64. Layers nine through eleven apply 64 filters each, maintaining the feature map size of 14x14x64. The 12th and 13th layers increase the filters to 96, keeping the dimensions at 14x14x96. The 14th and 15th layers apply 160 filters with a 3x3 kernel, the first of them with a stride of 2, resulting in a feature map of 7x7x160. In the 16th layer, 320 filters with a 1x1 kernel and a stride of 1 are employed. A final convolutional layer with 1280 filters and a 1x1 kernel produces an output feature map of 7x7x1280. After this, a global average pooling layer reduces the feature map to 1x1x1280 before the architecture culminates in a fully connected layer with 1,000 units and a softmax activation function, classifying the input image into one of 1,000 categories.
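The last two steps of this architecture, global average pooling followed by a softmax output, are easy to show on a small example. The sketch below averages each channel of a feature map over its spatial positions and then converts a vector of scores into probabilities. The 2x2x3 feature map is a made-up stand-in for the 7x7x1280 map in the text, and in the real network a fully connected layer sits between the pooled vector and the softmax; it is left out here to keep the example short.

```python
import math


def global_average_pool(feature_map):
    """Average an H x W x C feature map (nested lists) over H and W."""
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    sums = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                sums[c] += value
    return [s / (height * width) for s in sums]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)                      # subtract the max for stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 2x2 feature map with 3 channels (the text uses 7x7x1280).
fmap = [[[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]],
        [[5.0, 4.0, 2.0], [7.0, 0.0, 2.0]]]
pooled = global_average_pool(fmap)
print(pooled)  # [4.0, 1.0, 2.0]
print(softmax(pooled))
```

Pooling removes the spatial dimensions entirely, so the classifier that follows needs far fewer weights than one attached to a flattened feature map.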

• Algorithm for Testing Phase

4 RESULTS AND DISCUSSION

This section compares the suggested algorithm with

current techniques and examines its performance over

a range of training and testing data sizes. Accuracy

measurements are computed once the performance

metrics of the suggested method are assessed. The

methodology's resilience and efficiency are

showcased by the experimental findings, which show

that it can attain an accuracy of up to 98.2% under

ideal learning settings. The proposed methodology was implemented using Jupyter Notebook and the Spyder IDE, with key support libraries such as Matplotlib, Scikit-learn, NumPy, Pandas, and Keras. The deep learning models' key metrics, such as accuracy and loss, were measured using the same toolchain. With an accuracy of 98.2%, MobileNetV2 proved to be the most effective of the models tested for detecting flaws in ceramic tiles.
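For readers unfamiliar with the two metrics named above, the sketch below shows one common way to compute accuracy and a binary cross-entropy loss from predicted defect probabilities. The labels and scores are made up purely for illustration and are not the paper's results; the choice of binary cross-entropy and the 0.5 decision threshold are also assumptions, since the text does not specify the loss function used.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def binary_cross_entropy(y_true, probabilities, eps=1e-12):
    """Mean log loss for labels in {0, 1} and predicted defect probabilities."""
    total = 0.0
    for t, p in zip(y_true, probabilities):
        p = min(max(p, eps), 1.0 - eps)     # avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


# Made-up labels and scores for illustration; these are not the paper's results.
labels = [1, 0, 1, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.1]
predictions = [1 if s > 0.5 else 0 for s in scores]

print("accuracy:", accuracy(labels, predictions))  # 0.8
print("loss:", round(binary_cross_entropy(labels, scores), 4))
```

Accuracy only counts whether each thresholded prediction is right, while the loss also penalises predictions that are correct but under-confident, which is why both are usually tracked during training.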

5 CONCLUSION AND FUTURE WORK

This project demonstrates the potential of deep

learning, specifically Convolutional Neural Networks

(CNNs), in automating the defect detection and

classification process in the ceramic tile industry. By

employing advanced CNN architectures like

AlexNet, MobileNetV2, and VGG16, the proposed

system achieves high accuracy in identifying surface

defects, outperforming traditional manual inspection

methods. This automatic detection system can

significantly reduce errors caused by human fatigue

and subjective judgment, leading to better quality

control, reduced waste, and more efficient production

processes. Moreover, the system ensures that only

high-quality ceramic tiles are used in critical

applications, such as heritage and legacy buildings,

where the aesthetic and structural integrity of the tiles

is crucial. Moving forward, several improvements

can be made to enhance the model's performance and

scalability. First, integrating real-time defect

detection in production environments can be

explored, enabling manufacturers to take immediate

corrective actions. Further optimization of the CNN

model through hybrid techniques, such as combining

genetic algorithms with CNNs, can lead to more

precise results and faster computation. Additionally,

expanding the dataset to include more diverse tile

patterns and defect types will improve the model's

robustness and generalization. Lastly, incorporating

the system into a fully automated manufacturing line

with real-time monitoring and feedback will help

realize the full potential of Industry 4.0 in the ceramic

tile sector.

REFERENCES

Elbehiery H, Hefnawy A, Elewa M. Surface defects detection for ceramic tiles using image processing and morphological techniques. International Journal of Computer and Information Engineering. 2007;1(5):1488-1492.

Wan G, Fang H, Wang D, Yan J, Xie B. Ceramic tile

surface defect detection based on deep learning.

Ceramics International. 2022 Apr 15;48(8):11085-93.

Lu Q, Lin J, Luo L, Zhang Y, Zhu W. A supervised

approach for automated surface defect detection in

ceramic tile quality control. Advanced Engineering

Informatics. 2022 Aug 1;53:101692.

Hocenski Z, Vasilic S, Hocenski V. Improved Canny edge detector in ceramic tiles defect detection. In IECON 2006 - 32nd Annual Conference on IEEE Industrial Electronics 2006 Nov 6 (pp. 3328-3331). IEEE.

Hanzaei SH, Afshar A, Barazandeh F. Automatic detection

and classification of the ceramic tiles’ surface defects.

Pattern recognition. 2017 Jun 1;66:174-89.

Karimi N, Mishra M, Lourenço PB. Deep learning-based

automated tile defect detection system for Portuguese

cultural heritage buildings. Journal of Cultural

Heritage. 2024 Jul 1;68:86-98.

Dong G, Pan X, Liu S, Wu N, Kong X, Huang P, Wang Z.

A review of machine vision technology for defect

detection in curved ceramic materials. Nondestructive

Testing and Evaluation. 2024 Sep 22:1-27.

Ateeq M, PP Abdul Majeed A, Hafizh H, Mohd Razman

MA, Mohd Khairuddin I, Noordin NH. A Feature-

Based Transfer Learning Method for Surface Defect

Detection in Smart Manufacturing. In Innovative

Manufacturing, Mechatronics & Materials Forum 2023

Aug 7 (pp. 455-461)

Ameri R, Hsu CC, Band SS. A systematic review of deep

learning approaches for surface defect detection in

industrial applications. Engineering Applications of

Artificial Intelligence. 2024 Apr 1;130:107717.

Martin D, Heinzel S, von Bischhoffshausen JK, Kühl N.

Deep learning strategies for industrial surface defect

detection systems. arXiv preprint arXiv:2109.11304.

2021 Sep 23.

Lian J, Jia W, Zareapoor M, Zheng Y, Luo R, Jain DK,

Kumar N. Deep-learning-based small surface defect

detection via an exaggerated local variation-based

generative adversarial network. IEEE Transactions on

Industrial Informatics. 2019 Oct 4;16(2):1343-51.

Choi SM, Cha HS. Methodology for Enhancing the

Accuracy of Defect Detection Models Through Virtual

Image Augmentation. Available at SSRN 4711763.

Prunella M, Scardigno RM, Buongiorno D, Brunetti A,

Longo N, Carli R, Dotoli M, Bevilacqua V. Deep

learning for automatic vision-based recognition of

industrial surface defects: a survey. IEEE Access. 2023

May 1;11:43370-423.

Zorić B, Matić T, Hocenski Ž. Classification of biscuit tiles

for defect detection using Fourier transform features.

ISA transactions. 2022 Jun 1;125:400-14.
