Deep Learning Techniques for Archaeological Image Restoration
Ajinkya Kulkarni, Prajwal Naduvinamath, Ganesh Naik
Sneha Totad, Uday Kulkarni and Shashank Hegde
School of Computer Science and Engineering, KLE Technological University, Hubli, Karnataka, India
Keywords:
Cultural and Archaeological Sites Restoration, Image Reconstruction, UNetGAN, SSIM, IHDS.
Abstract:
Archaeological sites, rich in historical and cultural significance, face deterioration from invasions and from environmental, biological, and other natural factors, necessitating innovative restoration methods. This paper introduces a U-Net-based Generative Adversarial Network (GAN) framework to reconstruct damaged temple images, ensuring the preservation of intricate architectural details. A custom dataset of masked images was created, and the model was trained to reconstruct missing sections while balancing adversarial and reconstruction losses for realistic outputs. The proposed approach addresses shortcomings of traditional techniques by restoring complex textures, enhancing fine details, and producing visually coherent results, achieving a Structural Similarity Index Measure (SSIM) of 0.7128. Furthermore, the framework demonstrates robustness in handling varying levels of damage and noise, paving the way for scalable applications in heritage conservation. The proposed work contributes to cultural heritage preservation by combining advanced deep learning methodologies with precise evaluation metrics to achieve impactful results.
1 INTRODUCTION
India’s rich history, encapsulated in its archaeological
sites, particularly temples, reflects immense histori-
cal, cultural, and spiritual significance. These tem-
ples, however, face substantial degradation due to invasions, natural disasters such as earthquakes, environmental factors such as climatic fluctuations, and biological factors such as algae and fungal growth, leading to the loss of invaluable architectural and artistic details (Smith and Patel, 2021; Sharma and Hassan, 2022). Restoration and preservation of these
monuments have become a global priority, emphasiz-
ing their historical and cultural importance (Park and
Wang, 2021), and the need for scalable and precise
techniques to combat widespread deterioration.
Traditional restoration approaches, though effec-
tive to an extent, struggle with complex patterns and
textures. Advances in Artificial Intelligence (AI),
specifically Deep Neural Networks (DNNs), offer
transformative solutions (Mishra, 2021; El Masri and
Rakha, 2021). DNNs excel in learning patterns from
datasets to reconstruct damaged areas with remark-
able precision (Kumar and Wang, 2020), preserving
intricate details such as temple carvings (Jones and
Alvi, 2022; Chen and Smith, 2022). Techniques
like Convolutional Neural Networks (CNNs) (Lee
and Gupta, 2023) and Generative Adversarial Net-
works (GANs) (Gupta and Lee, 2023) provide excep-
tional capabilities in feature extraction, structural in-
painting, texture generation, and reconstructing the
missing areas (Hassan and Sharma, 2022), ensuring
authenticity and structural fidelity in archaeological
applications (Chen and Gupta, 2023).
U-Net architectures have demonstrated excep-
tional promise in restoring intricate temple carvings
and architectural details, thanks to their encoder-
decoder structure with skip connections, enabling
fine-grained reconstructions. The proposed approach
employs an optimized U-Net architecture within a
GAN framework, incorporating a U-Net-based gener-
ator and a PatchGAN discriminator. The U-Net gen-
erator ensures detailed texture and structural align-
ment, while the PatchGAN discriminator enhances
output realism. The proposed combination addresses
unique challenges, such as varying textures and com-
plex architectural features, resulting in visually con-
vincing and structurally accurate reconstructions.
Section 2 reviews recent advancements in deep
learning for image restoration. Section 3 details
the proposed methodology, which integrates U-Net
for structural restoration and GAN for realistic tex-
ture synthesis, enabling the reconstruction of dam-
aged and missing features in ancient temple imagery.
Section 4 describes the dataset and analyzes the re-
sults, demonstrating the approach’s effectiveness in
preserving the architectural essence of heritage sites.
Section 5 concludes by summarizing the contributions
of the proposed approach.
2 BACKGROUND STUDY
A comprehensive review of literature highlights ad-
vancements in using deep learning techniques such as
U-Net architectures and GANs for tasks like recon-
struction, restoration and in-painting (Zuo and Tidde-
man, 2024; Kulkarni et al., 2023). U-Net’s encoder-
decoder structure, with skip connections, has been ex-
tensively applied to image-to-image tasks, including
restoring damaged archaeological sites and facial im-
age voids (Zhao et al., 2024; Schonfeld et al., 2020).
Multi-task approaches, such as multi-scale fusion, en-
able models to address diverse objectives while pre-
serving high-resolution details and semantic coher-
ence (Kwabena Patrick et al., 2022). Quality evalua-
tion metrics like Structural Similarity Index Measure
(SSIM) and Peak Signal-to-Noise Ratio (PSNR) are
critical for assessing model performance (Feng Cai,
2024).
Building upon these advancements, recent studies
have demonstrated the potential of hybrid architec-
tures that combine U-Net and GAN capabilities to en-
hance restoration accuracy. By leveraging GANs’ ad-
versarial training paradigm, these hybrid models gen-
erate outputs that are not only structurally consistent
but also visually realistic, addressing common chal-
lenges such as texture smoothness and color discrep-
ancies. Additionally, the inclusion of attention mech-
anisms and transformer-based layers has further im-
proved the ability of these networks to focus on crit-
ical features while ignoring irrelevant artifacts. Such
innovations have shown significant promise in han-
dling complex restorations, such as recreating intri-
cate carvings or patterns on archaeological artifacts,
ensuring a seamless integration of modern technology
with cultural preservation efforts.
The proposed methodology addresses limitations
in traditional architectures like autoencoders and
CNNs, which struggle with complex patterns and
fluctuating datasets (Zhou et al., 2021; Nguyen and
Tran, 2022). By leveraging U-Net and incorporating
GAN frameworks with adversarial and reconstruc-
tion losses, more realistic and high-fidelity outputs are
achieved (Wang and Tang, 2021; Shen and Li, 2021).
Metrics like SSIM and PSNR provide structural simi-
larity and noise-level evaluation, essential for validat-
ing results in real-world scenarios, including denois-
ing, in-painting, and restoration of corrupted images
(Lee and Kim, 2022; Huang and Zhang, 2021). Op-
timization strategies like curriculum learning and at-
tention mechanisms further enhance outcomes in low-
context scenarios (Patel and Gupta, 2022).
Furthermore, the integration of domain-specific
pretraining and transfer learning techniques has been
instrumental in improving the model’s adaptability to
niche datasets, such as those featuring archaeological
artifacts. These techniques enable the model to gen-
eralize effectively from limited training data by lever-
aging knowledge from larger, more diverse datasets.
In addition, advanced loss functions, such as percep-
tual loss and contextual loss, have been adopted to
prioritize the preservation of fine-grained details and
contextual relevance during reconstruction. This en-
sures that the restored images maintain their histor-
ical and cultural authenticity while achieving supe-
rior quantitative performance across evaluation met-
rics. The proposed methodology also demonstrates
potential scalability, making it feasible for large-scale
restoration projects involving extensive datasets.
Training stability in GAN-based models is main-
tained by balancing adversarial and reconstruction
losses, mitigating challenges like mode collapse and
vanishing gradients (Chen and Zhao, 2021). The
Adam optimizer, with parameters tuned to a learning
rate of 1e-4 and betas of 0.5 and 0.999, ensures ef-
ficient convergence and avoids instability associated
with Stochastic Gradient Descent (SGD) (Singh and
Verma, 2021; Liu and Sun, 2022). By integrating
U-Net and GAN architectures with advanced opti-
mization techniques, the approach facilitates robust
restorations tailored to demanding scenarios involv-
ing missing or noisy data (Zhang and Luo, 2022).
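For illustration, the following minimal PyTorch sketch instantiates Adam with the hyperparameters stated above (learning rate 1e-4, betas of 0.5 and 0.999); the placeholder networks are illustrative stand-ins, not the actual models described in Section 3.

```python
import torch

# Illustrative stand-ins for the generator and discriminator networks.
generator = torch.nn.Sequential(torch.nn.Conv2d(3, 3, 3, padding=1))
discriminator = torch.nn.Sequential(torch.nn.Conv2d(3, 1, 3, padding=1))

# Adam with the reported hyperparameters: lr = 1e-4, betas = (0.5, 0.999).
opt_G = torch.optim.Adam(generator.parameters(), lr=1e-4, betas=(0.5, 0.999))
opt_D = torch.optim.Adam(discriminator.parameters(), lr=1e-4, betas=(0.5, 0.999))
```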
In addition to optimization strategies, regulariza-
tion techniques such as spectral normalization and
gradient penalty have been employed to further en-
hance the stability of GAN training. These meth-
ods effectively constrain the discriminator’s learning
process, preventing it from becoming overly domi-
nant, which can disrupt the generator’s performance.
Moreover, progressive training methodologies, where
models are trained in incremental stages with increas-
ing complexity, have shown significant improvements
in handling high-resolution image restoration tasks.
By combining these approaches with data augmen-
tation strategies, such as random masking and noise
injection, the framework ensures robust performance
across diverse datasets while preserving computa-
tional efficiency and generalization capabilities.
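A minimal sketch of the random-masking and noise-injection augmentations described above is given below, assuming image batches shaped (N, C, H, W) with values in [-1, 1]; the patch size and noise level are illustrative choices, not values reported in this work.

```python
import torch

def augment(batch: torch.Tensor, mask_frac: float = 0.25,
            noise_std: float = 0.05) -> torch.Tensor:
    """Apply a random square mask and Gaussian noise to each image."""
    n, _, h, w = batch.shape
    out = batch.clone()
    mh, mw = int(h * mask_frac), int(w * mask_frac)
    for i in range(n):
        # Random masking: zero out a randomly placed rectangular patch.
        top = torch.randint(0, h - mh + 1, (1,)).item()
        left = torch.randint(0, w - mw + 1, (1,)).item()
        out[i, :, top:top + mh, left:left + mw] = 0.0
    # Noise injection: add small Gaussian perturbations to the whole batch.
    return out + noise_std * torch.randn_like(out)
```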
3 PROPOSED WORK
The combination of the U-Net architecture with
the GAN framework provides a robust approach
to precise and realistic cultural heritage restoration.
The U-Net-based generator excels at capturing fine-
grained features while preserving spatial information,
whereas the PatchGAN (Isola et al., 2018) discrim-
inator ensures that the outputs are both realistic and
structurally coherent. The proposed combination en-
ables the model to effectively learn and reconstruct in-
tricate architectural patterns, establishing a new stan-
dard for precision in restoration tasks.
3.1 U-Net Generator Architecture
Figure 1: Architecture overview of U-Net GAN.
The U-Net architecture is integrated with the
GAN framework to reconstruct damaged image re-
gions, as shown in Figure 1. During preprocessing, image pixel values are normalised and the images are converted to tensors, making them suitable for deep learning models.
The U-Net generator operates as an encoder-decoder network enhanced with skip connections. Using convolutional layers, batch normalization, and LeakyReLU activation, the encoder gradually reduces the spatial dimensions of the input image while extracting complex features. The decoder upsamples the encoded features back to the original resolution using transposed convolutional layers with ReLU activation. Skip connections transfer feature information from corresponding encoder layers to decoder layers, ensuring that spatial features are retained. Finally, the generator applies a Tanh activation function to produce the reconstructed images, scaling values between -1 and 1.
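To make this layout concrete, the sketch below implements a small three-level version of the described generator in PyTorch; the depth and channel widths are assumptions for illustration, since the exact layer sizes are not listed here.

```python
import torch
import torch.nn as nn

def down(cin, cout):
    # Encoder stage: strided convolution + batch norm + LeakyReLU.
    return nn.Sequential(nn.Conv2d(cin, cout, 4, 2, 1),
                         nn.BatchNorm2d(cout), nn.LeakyReLU(0.2))

def up(cin, cout):
    # Decoder stage: transposed convolution + batch norm + ReLU.
    return nn.Sequential(nn.ConvTranspose2d(cin, cout, 4, 2, 1),
                         nn.BatchNorm2d(cout), nn.ReLU())

class UNetGenerator(nn.Module):
    """Three-level U-Net sketch; widths and depth are illustrative."""
    def __init__(self):
        super().__init__()
        self.d1, self.d2, self.d3 = down(3, 64), down(64, 128), down(128, 256)
        self.u1 = up(256, 128)
        self.u2 = up(256, 64)   # input channels doubled by skip concatenation
        self.u3 = nn.Sequential(nn.ConvTranspose2d(128, 3, 4, 2, 1), nn.Tanh())

    def forward(self, x):
        e1 = self.d1(x)                         # H/2
        e2 = self.d2(e1)                        # H/4
        e3 = self.d3(e2)                        # H/8
        y = self.u1(e3)                         # H/4
        y = self.u2(torch.cat([y, e2], 1))      # skip connection from encoder
        return self.u3(torch.cat([y, e1], 1))   # Tanh output in [-1, 1]
```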
3.2 Discriminator Network
A discriminator network is trained alongside the generator to ensure that the generated images appear realistic. The discriminator consists of multiple convolutional layers that gradually reduce the input image to a probability score indicating whether an image is real (ground truth) or fake (reconstructed by the generator). This adversarial design encourages the generator to improve its reconstructions, making it increasingly difficult for the discriminator to distinguish real images from fake ones.
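A minimal sketch of a PatchGAN-style discriminator consistent with this description (Isola et al., 2018) follows; the layer widths are illustrative. Each element of the output map scores one image patch, and the map can be averaged into a single probability score.

```python
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """PatchGAN-style discriminator sketch; widths are illustrative.
    Each output element scores one receptive-field patch as real or
    fake; BCE can be applied to the map directly or to its mean."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.LeakyReLU(0.2),
            nn.Conv2d(256, 1, 4, 1, 1), nn.Sigmoid())  # per-patch probability

    def forward(self, x):
        return self.net(x)
```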
During the forward pass for real images, the discriminator computes the real loss, as denoted in Equation 1:

L_{\mathrm{real}} = -\frac{1}{N} \sum_{i=1}^{N} \log D(y_i)    (1)
where D(y_i) represents the discriminator's output for the real image y_i, and N is the total number of real images in the batch. The objective is to make D(y_i) close to 1.
It is implemented using the Binary Cross-Entropy (BCE) loss, as denoted in Equation 2:

L_{\mathrm{real}} = \mathrm{BCE}(D(y_i),\, 1),    (2)
Then, in the forward pass for fake images, the generator first produces fake images G(x_i) from masked images x_i. The discriminator then scores these generated images to compute the fake loss L_{\mathrm{fake}}, as shown in Equation 3:

L_{\mathrm{fake}} = -\frac{1}{N} \sum_{i=1}^{N} \log\bigl(1 - D(G(x_i))\bigr),    (3)
The aim is to encourage D(G(x_i)) to be close to 0, meaning that the discriminator should identify the generated images as fake. This loss is likewise calculated using the Binary Cross-Entropy loss, as indicated in Equation 4:

L_{\mathrm{fake}} = \mathrm{BCE}(D(G(x_i)),\, 0),    (4)
Finally, the total discriminator loss L_D combines the real and fake losses to maximize the discriminator's ability to distinguish between real and generated images. It is calculated as shown in Equation 5:

L_D = \frac{1}{2}\left(L_{\mathrm{real}} + L_{\mathrm{fake}}\right),    (5)
The combined loss is minimized through backpropagation, which enables the discriminator to distinguish between real and fake images.
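The discriminator update of Equations 1-5 can be sketched as follows, assuming the discriminator outputs probabilities (as with the sigmoid head above); BCE against all-ones and all-zeros targets realizes L_real and L_fake respectively.

```python
import torch
import torch.nn.functional as F

def discriminator_loss(D, G, real, masked):
    """Total discriminator loss L_D = 0.5 * (L_real + L_fake),
    with both terms computed as binary cross-entropy (Eqs. 1-5)."""
    d_real = D(real)                          # D(y_i): should approach 1
    l_real = F.binary_cross_entropy(d_real, torch.ones_like(d_real))
    fake = G(masked).detach()                 # G(x_i); detach so only D updates
    d_fake = D(fake)                          # D(G(x_i)): should approach 0
    l_fake = F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    return 0.5 * (l_real + l_fake)            # Eq. 5
```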
3.3 Adversarial Training
The generator is trained to fool the discriminator
while simultaneously producing realistic images that
resemble the target images. In the forward pass, ad-
versarial loss L
adv
is calculated using the output gen-
erated by the discriminator on the fake images.The
adversarial loss formula ,as shown in Equation 6:
L
adv
=
1
N
N
i=1
logD(G(x
i
)), (6)
This loss helps the generator learn to produce images that confuse the discriminator. It is calculated using the Binary Cross-Entropy (BCE) loss, as shown in Equation 7:

L_{\mathrm{adv}} = \mathrm{BCE}(D(G(x_i)),\, 1),    (7)

where the target label is set to 1, marking that the fake images should be classified as real.
3.4 Reconstruction Loss
The reconstruction loss enforces pixel-wise accuracy, ensuring that the reconstructed images match the ground truth. The generator minimizes the difference between the generated and target images using the reconstruction loss L_{\mathrm{rec}}, as shown in Equation 8:

L_{\mathrm{rec}} = \frac{1}{N} \sum_{i=1}^{N} \left\| G(x_i) - y_i \right\|_1,    (8)
where \|G(x_i) - y_i\|_1 denotes the L1 norm (Mean Absolute Error), which measures the pixel-wise difference between the generated image and the target image.
Finally, a total generator loss is calculated that combines both adversarial and reconstruction losses, as shown in Equation 9:

L_G = \lambda_{\mathrm{rec}} \cdot L_{\mathrm{rec}} + \lambda_{\mathrm{adv}} \cdot L_{\mathrm{adv}},    (9)

where \lambda_{\mathrm{rec}} and \lambda_{\mathrm{adv}} are weights that control the influence of the reconstruction and adversarial losses, respectively. Using this loss, the generator adjusts its parameters to maximize both realism and fidelity to the target images.
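Equations 6-9 translate into the sketch below; the weights lambda_rec and lambda_adv are illustrative assumptions (pix2pix-style settings such as 100 and 1 are common), since the values used here are not reported.

```python
import torch
import torch.nn.functional as F

def generator_loss(D, G, masked, target, lambda_rec=100.0, lambda_adv=1.0):
    """Total generator loss L_G = lambda_rec * L_rec + lambda_adv * L_adv
    (Eqs. 6-9); the weights are illustrative, not the paper's values."""
    fake = G(masked)
    d_fake = D(fake)
    # Adversarial term: label the fakes as real (Eq. 7).
    l_adv = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
    # Reconstruction term: L1 distance to the ground truth (Eq. 8).
    l_rec = F.l1_loss(fake, target)
    return lambda_rec * l_rec + lambda_adv * l_adv
```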
3.5 Evaluation Metrics
The PSNR and SSIM metrics are used to evaluate the
performance of the model. PSNR estimates pixel-level accuracy by comparing the reconstructed and target images: it computes the ratio between the maximum possible pixel value of the image and the Mean Squared Error (MSE) between the generated and target images. SSIM, whose value lies between 0 and 1, measures how well the reconstruction preserves the structural and visual quality of the original, quantifying the visual similarity between the generated image and the target image.
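Both metrics can be computed with scikit-image as sketched below, assuming the generator's Tanh outputs in [-1, 1] are mapped back to [0, 1] before comparison.

```python
import torch
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(generated: torch.Tensor, target: torch.Tensor):
    """Compute PSNR and SSIM for one image pair shaped (C, H, W)
    with values in [-1, 1], as produced by the Tanh generator."""
    # Map from [-1, 1] back to [0, 1] and move channels last for skimage.
    g = ((generated.permute(1, 2, 0) + 1) / 2).clamp(0, 1).cpu().numpy()
    t = ((target.permute(1, 2, 0) + 1) / 2).clamp(0, 1).cpu().numpy()
    psnr = peak_signal_noise_ratio(t, g, data_range=1.0)
    ssim = structural_similarity(t, g, channel_axis=-1, data_range=1.0)
    return psnr, ssim
```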
The overall methodology portrays an effective
blend of architectural design and adversarial train-
ing aimed at achieving high-quality image reconstruc-
tions.
4 RESULTS AND ANALYSIS
This section presents the experimental results of the proposed framework, including details on the dataset, preprocessing steps, and the training procedure. The model's performance is evaluated using
SSIM as the primary metric to assess the quality of
restoration and its effectiveness in addressing com-
plex challenges.
4.1 Dataset Description
The IHDS dataset, containing 3,000 high-resolution
images of temples as shown in Figure 2, was utilized
to train the model. Since the dataset only provided
intact temple images (target images), corresponding
masked images were generated to simulate structural
damage. To create realistic masked images, methods
such as adding inconsistent or random patches were
avoided, as they do not accurately represent true dis-
tortions. Instead, background patches were overlaid
on temple regions, effectively mimicking structural
damage or fragmentation. The proposed approach en-
sured that the masked images closely resembled real-
world scenarios of temple degradation.
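A simplified sketch of this masking procedure is shown below; the patch coordinates are supplied by the caller, since the exact region-selection rule used for the dataset is not specified.

```python
import numpy as np

def mask_with_background(image: np.ndarray, bg_box, dst_box) -> np.ndarray:
    """Simulate structural damage by copying a background patch
    (e.g., sky or foliage) over a temple region. bg_box is
    (top, left, height, width); dst_box is (top, left). The
    dataset's actual region selection is not specified here."""
    out = image.copy()
    bt, bl, h, w = bg_box
    dt, dl = dst_box
    out[dt:dt + h, dl:dl + w] = image[bt:bt + h, bl:bl + w]
    return out

# Example: overlay a 64x64 background patch from the top-left corner
# onto the centre of a 256x256 image.
img = np.zeros((256, 256, 3), dtype=np.uint8)
masked = mask_with_background(img, bg_box=(0, 0, 64, 64), dst_box=(96, 96))
```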
Figure 2: Dataset of damaged archaeological sites.
4.2 Evaluation
After preparing the dataset of masked images, the
model was trained using a UNet-based GAN archi-
tecture with a PatchGAN discriminator. During train-
ing, the U-Net generator focused on reconstructing
the missing parts of the masked images, while the
PatchGAN discriminator ensured the realism of these
reconstructions. Once training was complete, the gen-
erator was used to produce the final reconstructed
images, effectively restoring the damaged regions as
shown in Figure 3.
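One plausible training iteration combining the preceding sketches is given below; the alternation of one discriminator step and one generator step per batch is an assumption, as the exact schedule is not detailed here.

```python
import torch
import torch.nn.functional as F

def train_step(G, D, opt_G, opt_D, masked, target,
               lambda_rec=100.0, lambda_adv=1.0):
    """One GAN iteration: a discriminator update followed by a
    generator update, using the losses of Equations 1-9."""
    # --- Discriminator update (Eqs. 1-5) ---
    opt_D.zero_grad()
    d_real = D(target)
    fake = G(masked)
    d_fake = D(fake.detach())  # detach: only D's weights change here
    loss_D = 0.5 * (F.binary_cross_entropy(d_real, torch.ones_like(d_real))
                    + F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))
    loss_D.backward()
    opt_D.step()
    # --- Generator update (Eqs. 6-9) ---
    opt_G.zero_grad()
    d_fake = D(fake)
    loss_G = (lambda_adv * F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
              + lambda_rec * F.l1_loss(fake, target))
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()
```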
Figure 3: Examples of masked images (left column) and their corresponding reconstructed images (right column). The SSIM scores are as follows: (b): 0.7538, (d): 0.8235, (f): 0.8012.
The performance of the proposed UNetGAN
model is evaluated using the Structural Similarity In-
dex Measure (SSIM) to assess the quality of the re-
constructed images. SSIM evaluates the similarity be-
tween the reconstructed image and the ground truth,
reflecting the model’s effectiveness in restoring visual
details. For the provided archaeological images, the
model achieved an overall SSIM score of 0.7128. The
score demonstrates the model’s capability to restore
omitted architectural details while successfully cap-
turing the structural and textural integrity of the orig-
inal images, aligning closely with the ground truth.
Upon visual inspection, the reconstructed images
exhibit a high level of detail preservation, effectively
capturing the intricate features of the original archi-
tectural forms. However, the analysis highlights the
influence of the quality and diversity of the train-
ing data on the model’s performance. Expanding
the dataset with a greater variety of high-resolution
ground truth images could further enhance the accu-
racy and reliability of the reconstructions.
Having established the model’s overall perfor-
mance through the SSIM, we now delve into spe-
cific examples of reconstructed images and their cor-
responding SSIM values in Figure 3. The first ex-
ample is the Kappachenikeshwara Temple in Hassan,
which achieved an SSIM of 0.7581 as shown in Figure
3(b), indicating significant reconstruction accuracy
while preserving intricate architectural details. Next,
the Veerabhadreshwara temple, Hangal recorded an
SSIM of 0.8589 as shown in Figure 3(d) reflecting
the model’s ability to capture the structural and textu-
ral intricacies of this iconic monument. Further, Haz-
ararama temple, Hampi attained an SSIM of 0.8661 as
shown in Figure 3(f), showcasing exceptional fidelity
in reconstructing the missing portions while closely
aligning with the ground truth. These examples high-
light the model’s capability to effectively restore ar-
chaeological images, emphasizing the critical role of
diverse and high-quality datasets in improving recon-
struction accuracy and reliability.
5 CONCLUSION AND FUTURE
WORK
Deep learning algorithms hold enormous potential for preserving cultural property by reconstructing damaged archaeological images, especially those of temples. The study demonstrated the effectiveness of integrating a U-Net architecture with a GAN framework, allowing for high-quality restoration while preserving the originals' fine architectural elements and aesthetic integrity. The proposed model outperforms conventional techniques in achieving robust and realistic image in-painting by utilizing adversarial training and a reconstruction loss. Additionally, using a sophisticated evaluation measure like SSIM ensures thorough quality assessment, which strengthens the model's dependability. The proposed model achieved an SSIM score of 0.7128. To further improve reconstruction capabilities for varied cultural artifacts, future research can investigate improving training efficiency and combining multi-modal data.
Future work can include the advancement of
model architectures and techniques to capture finer
details and complex patterns, alongside improved
data preprocessing and augmentation strategies for
robust results. Expanding to diverse datasets and in-
tegrating domain-specific knowledge could enhance
contextual accuracy and generalization. Additionally,
smoothing the blurred areas after reconstruction can
improve results.
REFERENCES
Chen, L. and Zhao, Y. (2021). Stabilizing GAN training for image reconstruction. Neural Networks, 142:15–25.
Chen, W. and Gupta, S. (2023). Ensuring texture con-
tinuity in archaeological image restoration. Jour-
nal of Visual Computing for Cultural Heritage,
18:214–230.
Chen, W. and Smith, G. (2022). Generative ad-
versarial networks for cultural heritage restora-
tion. IEEE Transactions on Neural Networks
and Learning Systems, 32(12):1245–1260.
El Masri, R. and Rakha, T. (2021). Historic built en-
vironment assessment and management by deep
learning techniques: A scoping review. Journal
of Cultural Heritage, 49:236–247.
Feng Cai, Jingxu Peng, P. Z. (2024). Proceedings of the 2023 International Conference on Data Science, Advanced Algorithm and Intelligent Computing (DAI 2023). 180.
Gupta, S. and Lee, M.-h. (2023). Optimizing U-Net for image restoration in archaeological research. Pattern Recognition Letters, 145:112–123.
Hassan, A. and Sharma, P. (2022). Digital preser-
vation of cultural heritage through image recon-
struction. Digital Heritage Review, 28(4):289–
305.
Huang, X. and Zhang, M. (2021). Quantitative eval-
uation of image restoration models. Machine
Learning Applications, 13:123–140.
Isola, P., Zhu, J.-Y., Zhou, T., and Efros, A. A. (2018). Image-to-image translation with conditional adversarial networks. arXiv preprint arXiv:1611.07004.
Jones, R. and Alvi, N. (2022). Age-related degrada-
tion of archaeological structures and its restora-
tion. Heritage Science, 10:35–49.
Kulkarni, U., Chikkamath, S., Mirajkar, J. S., Hittalmakki, Y., Thota, V., and Khan, F. (2023). Image inpainting on archeological dataset using UNet architecture on embedded platform. In International Conference on Recent Trends in Machine Learning, IOT, Smart Cities & Applications, pages 353–365. Springer.
Kumar, R. and Wang, L. (2020). Deep learning for ar-
chaeological image reconstruction: Applications
and challenges. Artificial Intelligence in Archae-
ology, 34(3):245–267.
Kwabena Patrick, M., Felix Adekoya, A.,
Abra Mighty, A., and Edward, B. Y. (2022).
Capsule networks–a survey.
Lee, J. and Kim, S. (2022). Evaluation metrics for ai-
driven image restoration techniques. Computer
Graphics Forum, 41:412–425.
Lee, M.-h. and Gupta, S. (2023). Convolutional
neural networks for structural fidelity in image
restoration. International Journal of Machine Vi-
sion, 12(8):145–157.
Liu, J. and Sun, H. (2022). Challenges and advances in GAN-based image reconstruction. Computer Vision and Pattern Recognition Letters, 18:330–345.
Mishra, A. (2021). Heritage preservation using deep
learning models for archaeological site restora-
tion. Advances in Science, Technology and En-
gineering Systems Journal, 6(4):1121–1126.
Nguyen, T. H. and Tran, P. D. (2022). Advanced CNN techniques for cultural heritage preservation. Journal of Computer Vision and Applications, 12:156–172.
Park, J.-h. and Wang, L. (2021). Artificial intelligence
for preserving heritage sites: A comprehensive
review. Cultural Heritage Journal, 22:101–120.
Patel, A. and Gupta, N. (2022). Restoration of corrupted images using AI techniques. Journal of Artificial Intelligence Research, 60:89–102.
Schonfeld, E., Schiele, B., and Khoreva, A. (2020). A U-Net based discriminator for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8207–8216.
Sharma, P. and Hassan, A. (2022). Impact of biolog-
ical factors on cultural heritage sites. Journal of
Environmental Biology, 38(5):275–283.
Shen, L. and Li, Q. (2021). Adversarial loss optimization in GANs for image reconstruction. Pattern Recognition Letters, 145:67–75.
Singh, R. and Verma, P. (2021). Optimization tech-
niques in deep learning models for restoration.
Journal of Machine Learning Research, 22:201–
218.
Smith, J. and Patel, A. (2021). Cultural heritage
restoration: Challenges and priorities. Journal
of Heritage Conservation, 45(2):123–134.
Wang, H. and Tang, X. (2021). Skip connections in
u-net for enhanced detail preservation. Applied
Intelligence, 51:2924–2938.
Zhang, W. and Luo, T. (2022). Combining U-Net and GAN for cultural heritage restoration. Heritage Science, 10:45–56.
Zhao, F., Ren, H., Sun, K., and Zhu, X. (2024). GAN-based heterogeneous network for ancient mural restoration. Heritage Science, 12(1):418.
Zhou, Y., Zhang, W., and Wang, L. (2021). Improved
deep learning architectures for image restora-
tion. IEEE Transactions on Image Processing,
30:2345–2356.
Zuo, H. and Tiddeman, B. (2024). A U-Net architecture for inpainting lightstage normal maps. Computers, 13(2).