Deep Learning Techniques for Archaeological Image Restoration
Ajinkya Kulkarni, Prajwal Naduvinamath, Ganesh Naik
Sneha Totad, Uday Kulkarni and Shashank Hegde
School of Computer Science and Engineering, KLE Technological University, Hubli, Karnataka, India
Keywords:
Cultural and Archaeological Sites Restoration, Image Reconstruction, UNetGAN, SSIM, IHDS.
Abstract:
Archaeological sites, rich in historical and cultural significance, face deterioration from invasions and from environmental, biological, and other natural factors, necessitating innovative restoration methods. This paper introduces a U-Net-based Generative Adversarial Network (GAN) framework to reconstruct damaged temple images, ensuring the preservation of intricate architectural details. A custom dataset of masked images was created, and the model was trained to reconstruct missing sections while balancing adversarial and reconstruction losses for realistic outputs. The proposed approach addresses shortcomings of traditional techniques by restoring complex textures, enhancing fine details, and producing visually coherent results, achieving a Structural Similarity Index Measure (SSIM) of 0.7128. Furthermore, the framework demonstrates robustness in handling varying levels of damage and noise, paving the way for scalable applications in heritage conservation. The proposed work contributes to cultural heritage preservation by combining advanced deep learning methodologies with precise evaluation metrics to achieve impactful results.
1 INTRODUCTION
India’s rich history, encapsulated in its archaeological
sites, particularly temples, reflects immense histori-
cal, cultural, and spiritual significance. These tem-
ples, however, face substantial degradation due to invasions, natural disasters such as earthquakes, environmental factors such as climatic fluctuations, and biological factors such as algae and fungal growth, leading to the loss of invaluable architectural and artistic details (Smith and Patel, 2021; Sharma and Hassan, 2022). Restoration and preservation of these
monuments have become a global priority, emphasiz-
ing their historical and cultural importance (Park and
Wang, 2021), and the need for scalable and precise
techniques to combat widespread deterioration.
Traditional restoration approaches, though effec-
tive to an extent, struggle with complex patterns and
textures. Advances in Artificial Intelligence (AI),
specifically Deep Neural Networks (DNNs), offer
transformative solutions (Mishra, 2021; El Masri and
Rakha, 2021). DNNs excel in learning patterns from
datasets to reconstruct damaged areas with remark-
able precision (Kumar and Wang, 2020), preserving
intricate details such as temple carvings (Jones and
Alvi, 2022; Chen and Smith, 2022). Techniques
like Convolutional Neural Networks (CNNs) (Lee
and Gupta, 2023) and Generative Adversarial Net-
works (GANs) (Gupta and Lee, 2023) provide excep-
tional capabilities in feature extraction, structural in-
painting, texture generation, and reconstructing the
missing areas (Hassan and Sharma, 2022), ensuring
authenticity and structural fidelity in archaeological
applications (Chen and Gupta, 2023).
U-Net architectures have demonstrated excep-
tional promise in restoring intricate temple carvings
and architectural details, thanks to their encoder-
decoder structure with skip connections, enabling
fine-grained reconstructions. The proposed approach
employs an optimized U-Net architecture within a
GAN framework, incorporating a U-Net-based gener-
ator and a PatchGAN discriminator. The U-Net gen-
erator ensures detailed texture and structural align-
ment, while the PatchGAN discriminator enhances
output realism. The proposed combination addresses
unique challenges, such as varying textures and com-
plex architectural features, resulting in visually con-
vincing and structurally accurate reconstructions.
Section 2 reviews recent advancements in deep
learning for image restoration. Section 3 details
the proposed methodology, which integrates U-Net
for structural restoration and GAN for realistic tex-
ture synthesis, enabling the reconstruction of dam-
aged and missing features in ancient temple imagery.
Section 4 describes the dataset and analyzes the re-
sults, demonstrating the approach’s effectiveness in
preserving the architectural essence of heritage sites.
Section 5 concludes by summarizing the contributions
of the proposed approach.
2 BACKGROUND STUDY
A comprehensive review of literature highlights ad-
vancements in using deep learning techniques such as
U-Net architectures and GANs for tasks like recon-
struction, restoration and in-painting (Zuo and Tidde-
man, 2024; Kulkarni et al., 2023). U-Net’s encoder-
decoder structure, with skip connections, has been ex-
tensively applied to image-to-image tasks, including
restoring damaged archaeological sites and facial im-
age voids (Zhao et al., 2024; Schonfeld et al., 2020).
Multi-task approaches, such as multi-scale fusion, en-
able models to address diverse objectives while pre-
serving high-resolution details and semantic coher-
ence (Kwabena Patrick et al., 2022). Quality evalua-
tion metrics like Structural Similarity Index Measure
(SSIM) and Peak Signal-to-Noise Ratio (PSNR) are
critical for assessing model performance (Feng Cai,
2024).
Building upon these advancements, recent studies
have demonstrated the potential of hybrid architec-
tures that combine U-Net and GAN capabilities to en-
hance restoration accuracy. By leveraging GANs’ ad-
versarial training paradigm, these hybrid models gen-
erate outputs that are not only structurally consistent
but also visually realistic, addressing common chal-
lenges such as texture smoothness and color discrep-
ancies. Additionally, the inclusion of attention mech-
anisms and transformer-based layers has further im-
proved the ability of these networks to focus on crit-
ical features while ignoring irrelevant artifacts. Such
innovations have shown significant promise in han-
dling complex restorations, such as recreating intri-
cate carvings or patterns on archaeological artifacts,
ensuring a seamless integration of modern technology
with cultural preservation efforts.
The proposed methodology addresses limitations
in traditional architectures like autoencoders and
CNNs, which struggle with complex patterns and
fluctuating datasets (Zhou et al., 2021; Nguyen and
Tran, 2022). By leveraging U-Net and incorporating
GAN frameworks with adversarial and reconstruc-
tion losses, more realistic and high-fidelity outputs are
achieved (Wang and Tang, 2021; Shen and Li, 2021).
Metrics like SSIM and PSNR provide structural simi-
larity and noise-level evaluation, essential for validat-
ing results in real-world scenarios, including denois-
ing, in-painting, and restoration of corrupted images
(Lee and Kim, 2022; Huang and Zhang, 2021). Op-
timization strategies like curriculum learning and at-
tention mechanisms further enhance outcomes in low-
context scenarios (Patel and Gupta, 2022).
Furthermore, the integration of domain-specific
pretraining and transfer learning techniques has been
instrumental in improving the model’s adaptability to
niche datasets, such as those featuring archaeological
artifacts. These techniques enable the model to gen-
eralize effectively from limited training data by lever-
aging knowledge from larger, more diverse datasets.
In addition, advanced loss functions, such as percep-
tual loss and contextual loss, have been adopted to
prioritize the preservation of fine-grained details and
contextual relevance during reconstruction. This en-
sures that the restored images maintain their histor-
ical and cultural authenticity while achieving supe-
rior quantitative performance across evaluation met-
rics. The proposed methodology also demonstrates
potential scalability, making it feasible for large-scale
restoration projects involving extensive datasets.
Training stability in GAN-based models is main-
tained by balancing adversarial and reconstruction
losses, mitigating challenges like mode collapse and
vanishing gradients (Chen and Zhao, 2021). The
Adam optimizer, with parameters tuned to a learning
rate of 1e-4 and betas of 0.5 and 0.999, ensures ef-
ficient convergence and avoids instability associated
with Stochastic Gradient Descent (SGD) (Singh and
Verma, 2021; Liu and Sun, 2022). By integrating
U-Net and GAN architectures with advanced opti-
mization techniques, the approach facilitates robust
restorations tailored to demanding scenarios involv-
ing missing or noisy data (Zhang and Luo, 2022).
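For illustration, the following minimal PyTorch sketch instantiates Adam with the hyperparameters stated above (learning rate 1e-4, betas of 0.5 and 0.999); the placeholder networks are illustrative stand-ins, not the actual models described in Section 3.

```python
import torch

# Illustrative stand-ins for the generator and discriminator networks.
generator = torch.nn.Sequential(torch.nn.Conv2d(3, 3, 3, padding=1))
discriminator = torch.nn.Sequential(torch.nn.Conv2d(3, 1, 3, padding=1))

# Adam with the reported hyperparameters: lr = 1e-4, betas = (0.5, 0.999).
opt_G = torch.optim.Adam(generator.parameters(), lr=1e-4, betas=(0.5, 0.999))
opt_D = torch.optim.Adam(discriminator.parameters(), lr=1e-4, betas=(0.5, 0.999))
```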
In addition to optimization strategies, regulariza-
tion techniques such as spectral normalization and
gradient penalty have been employed to further en-
hance the stability of GAN training. These meth-
ods effectively constrain the discriminator’s learning
process, preventing it from becoming overly domi-
nant, which can disrupt the generator’s performance.
Moreover, progressive training methodologies, where
models are trained in incremental stages with increas-
ing complexity, have shown significant improvements
in handling high-resolution image restoration tasks.
By combining these approaches with data augmen-
tation strategies, such as random masking and noise
injection, the framework ensures robust performance
across diverse datasets while preserving computa-
tional efficiency and generalization capabilities.
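A minimal sketch of the random-masking and noise-injection augmentations described above is given below, assuming image batches shaped (N, C, H, W) with values in [-1, 1]; the patch size and noise level are illustrative choices, not values reported in this work.

```python
import torch

def augment(batch: torch.Tensor, mask_frac: float = 0.25,
            noise_std: float = 0.05) -> torch.Tensor:
    """Apply a random square mask and Gaussian noise to each image."""
    n, _, h, w = batch.shape
    out = batch.clone()
    mh, mw = int(h * mask_frac), int(w * mask_frac)
    for i in range(n):
        # Random masking: zero out a randomly placed rectangular patch.
        top = torch.randint(0, h - mh + 1, (1,)).item()
        left = torch.randint(0, w - mw + 1, (1,)).item()
        out[i, :, top:top + mh, left:left + mw] = 0.0
    # Noise injection: add small Gaussian perturbations to the whole batch.
    return out + noise_std * torch.randn_like(out)
```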
3 PROPOSED WORK
The combination of the U-Net architecture with
the GAN framework provides a robust approach
to precise and realistic cultural heritage restoration.
The U-Net-based generator excels at capturing fine-
grained features while preserving spatial information,
whereas the PatchGAN (Isola et al., 2018) discrim-
inator ensures that the outputs are both realistic and
structurally coherent. The proposed combination en-
ables the model to effectively learn and reconstruct in-
tricate architectural patterns, establishing a new stan-
dard for precision in restoration tasks.
3.1 U-Net Generator Architecture
Figure 1: Architecture overview of U-Net GAN.
The U-Net architecture is integrated with the
GAN framework to reconstruct damaged image re-
gions, as shown in Figure 1. During preprocessing, image pixel values are normalised and the images are converted to tensors, making them suitable for deep learning models.
The U-Net generator operates as an encoder-decoder network enhanced with skip connections. Using convolutional layers, batch normalization, and LeakyReLU activation, the encoder gradually reduces the spatial dimensions of the input image while extracting complex features. The decoder upsamples the encoded features back to the original resolution using transposed convolutional layers with ReLU activation. Skip connections transfer feature information from corresponding encoder layers to decoder layers, ensuring that spatial features are retained. Finally, the generator applies a Tanh activation function to produce the reconstructed images, scaling values between -1 and 1.
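To make this layout concrete, the sketch below implements a small three-level version of the described generator in PyTorch; the depth and channel widths are assumptions for illustration, since the exact layer sizes are not listed here.

```python
import torch
import torch.nn as nn

def down(cin, cout):
    # Encoder stage: strided convolution + batch norm + LeakyReLU.
    return nn.Sequential(nn.Conv2d(cin, cout, 4, 2, 1),
                         nn.BatchNorm2d(cout), nn.LeakyReLU(0.2))

def up(cin, cout):
    # Decoder stage: transposed convolution + batch norm + ReLU.
    return nn.Sequential(nn.ConvTranspose2d(cin, cout, 4, 2, 1),
                         nn.BatchNorm2d(cout), nn.ReLU())

class UNetGenerator(nn.Module):
    """Three-level U-Net sketch; widths and depth are illustrative."""
    def __init__(self):
        super().__init__()
        self.d1, self.d2, self.d3 = down(3, 64), down(64, 128), down(128, 256)
        self.u1 = up(256, 128)
        self.u2 = up(256, 64)   # input channels doubled by skip concatenation
        self.u3 = nn.Sequential(nn.ConvTranspose2d(128, 3, 4, 2, 1), nn.Tanh())

    def forward(self, x):
        e1 = self.d1(x)                         # H/2
        e2 = self.d2(e1)                        # H/4
        e3 = self.d3(e2)                        # H/8
        y = self.u1(e3)                         # H/4
        y = self.u2(torch.cat([y, e2], 1))      # skip connection from encoder
        return self.u3(torch.cat([y, e1], 1))   # Tanh output in [-1, 1]
```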
3.2 Discriminator Network
A discriminator network is trained alongside the generator to ensure that the generated images appear realistic. The discriminator consists of multiple convolutional layers that gradually reduce the input image to a probability score indicating whether an image is real (ground truth) or fake (reconstructed by the generator). This adversarial design encourages the generator to improve its reconstructions, making it increasingly difficult for the discriminator to distinguish real images from fake ones.
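A minimal sketch of a PatchGAN-style discriminator consistent with this description (Isola et al., 2018) follows; the layer widths are illustrative. Each element of the output map scores one image patch, and the map can be averaged into a single probability score.

```python
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """PatchGAN-style discriminator sketch; widths are illustrative.
    Each output element scores one receptive-field patch as real or
    fake; BCE can be applied to the map directly or to its mean."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.LeakyReLU(0.2),
            nn.Conv2d(256, 1, 4, 1, 1), nn.Sigmoid())  # per-patch probability

    def forward(self, x):
        return self.net(x)
```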
During the forward pass for real images, the discriminator computes the real loss, as denoted in Equation 1:

L_{\mathrm{real}} = -\frac{1}{N} \sum_{i=1}^{N} \log D(y_i)    (1)
where D(y_i) represents the discriminator's output for the real image y_i, and N is the total number of real images in the batch. The objective is to make D(y_i) close to 1.
It is implemented using the Binary Cross-Entropy (BCE) loss, as denoted in Equation 2:

L_{\mathrm{real}} = \mathrm{BCE}(D(y_i),\, 1),    (2)
Then, in the forward pass for fake images, the generator first produces fake images G(x_i) from masked images x_i. The discriminator then scores these generated images to compute the fake loss L_{\mathrm{fake}}, as shown in Equation 3:

L_{\mathrm{fake}} = -\frac{1}{N} \sum_{i=1}^{N} \log\bigl(1 - D(G(x_i))\bigr),    (3)
The aim is to encourage D(G(x_i)) to be close to 0, meaning that the discriminator should identify the generated images as fake. This loss is likewise calculated using the Binary Cross-Entropy loss, as indicated in Equation 4:

L_{\mathrm{fake}} = \mathrm{BCE}(D(G(x_i)),\, 0),    (4)
Finally, the total discriminator loss L_D combines the real and fake losses to maximize the discriminator's ability to distinguish between real and generated images. It is calculated as shown in Equation 5:

L_D = \frac{1}{2}\left(L_{\mathrm{real}} + L_{\mathrm{fake}}\right),    (5)
The combined loss is minimized through backpropagation, which enables the discriminator to distinguish between real and fake images.
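The discriminator update of Equations 1-5 can be sketched as follows, assuming the discriminator outputs probabilities (as with the sigmoid head above); BCE against all-ones and all-zeros targets realizes L_real and L_fake respectively.

```python
import torch
import torch.nn.functional as F

def discriminator_loss(D, G, real, masked):
    """Total discriminator loss L_D = 0.5 * (L_real + L_fake),
    with both terms computed as binary cross-entropy (Eqs. 1-5)."""
    d_real = D(real)                          # D(y_i): should approach 1
    l_real = F.binary_cross_entropy(d_real, torch.ones_like(d_real))
    fake = G(masked).detach()                 # G(x_i); detach so only D updates
    d_fake = D(fake)                          # D(G(x_i)): should approach 0
    l_fake = F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    return 0.5 * (l_real + l_fake)            # Eq. 5
```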
3.3 Adversarial Training
The generator is trained to fool the discriminator
while simultaneously producing realistic images that
resemble the target images. In the forward pass, ad-
versarial loss L
adv
is calculated using the output gen-
erated by the discriminator on the fake images.The
adversarial loss formula ,as shown in Equation 6:
L
adv
=
1
N
N
i=1
logD(G(x
i
)), (6)
This loss helps the generator learn to produce images that confuse the discriminator. It is calculated using the Binary Cross-Entropy (BCE) loss, as shown in Equation 7:

L_{\mathrm{adv}} = \mathrm{BCE}(D(G(x_i)),\, 1),    (7)

where the target label is set to 1, marking that the fake images should be classified as real.
3.4 Reconstruction Loss
The reconstruction loss enforces pixel-wise accuracy, ensuring that the reconstructed images match the ground truth. The generator minimizes the difference between the generated and target images using the reconstruction loss L_{\mathrm{rec}}, as shown in Equation 8:

L_{\mathrm{rec}} = \frac{1}{N} \sum_{i=1}^{N} \left\| G(x_i) - y_i \right\|_1,    (8)
where \|G(x_i) - y_i\|_1 denotes the L1 norm (Mean Absolute Error), which measures the pixel-wise difference between the generated image and the target image.
Finally, a total generator loss is calculated that combines both adversarial and reconstruction losses, as shown in Equation 9:

L_G = \lambda_{\mathrm{rec}} \cdot L_{\mathrm{rec}} + \lambda_{\mathrm{adv}} \cdot L_{\mathrm{adv}},    (9)

where \lambda_{\mathrm{rec}} and \lambda_{\mathrm{adv}} are weights that control the influence of the reconstruction and adversarial losses, respectively. Using this loss, the generator adjusts its parameters to maximize both realism and fidelity to the target images.
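Equations 6-9 translate into the sketch below; the weights lambda_rec and lambda_adv are illustrative assumptions (pix2pix-style settings such as 100 and 1 are common), since the values used here are not reported.

```python
import torch
import torch.nn.functional as F

def generator_loss(D, G, masked, target, lambda_rec=100.0, lambda_adv=1.0):
    """Total generator loss L_G = lambda_rec * L_rec + lambda_adv * L_adv
    (Eqs. 6-9); the weights are illustrative, not the paper's values."""
    fake = G(masked)
    d_fake = D(fake)
    # Adversarial term: label the fakes as real (Eq. 7).
    l_adv = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
    # Reconstruction term: L1 distance to the ground truth (Eq. 8).
    l_rec = F.l1_loss(fake, target)
    return lambda_rec * l_rec + lambda_adv * l_adv
```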
3.5 Evaluation Metrics
The PSNR and SSIM metrics are used to evaluate the
performance of the model. PSNR estimates pixel-level accuracy by comparing the reconstructed and target images: it computes the ratio between the maximum possible pixel value of the image and the Mean Squared Error (MSE) between the generated and target images. SSIM, whose value lies between 0 and 1, measures how well the reconstruction preserves the structural and visual quality of the original, quantifying the visual similarity between the generated image and the target image.
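Both metrics can be computed with scikit-image as sketched below, assuming the generator's Tanh outputs in [-1, 1] are mapped back to [0, 1] before comparison.

```python
import torch
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(generated: torch.Tensor, target: torch.Tensor):
    """Compute PSNR and SSIM for one image pair shaped (C, H, W)
    with values in [-1, 1], as produced by the Tanh generator."""
    # Map from [-1, 1] back to [0, 1] and move channels last for skimage.
    g = ((generated.permute(1, 2, 0) + 1) / 2).clamp(0, 1).cpu().numpy()
    t = ((target.permute(1, 2, 0) + 1) / 2).clamp(0, 1).cpu().numpy()
    psnr = peak_signal_noise_ratio(t, g, data_range=1.0)
    ssim = structural_similarity(t, g, channel_axis=-1, data_range=1.0)
    return psnr, ssim
```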
The overall methodology portrays an effective
blend of architectural design and adversarial train-
ing aimed at achieving high-quality image reconstruc-
tions.
4 RESULTS AND ANALYSIS
This section presents the experimental results of the proposed framework, including details on the dataset, preprocessing steps, and the training procedure. The model's performance is evaluated using
SSIM as the primary metric to assess the quality of
restoration and its effectiveness in addressing com-
plex challenges.
4.1 Dataset Description
The IHDS dataset, containing 3,000 high-resolution
images of temples as shown in Figure 2, was utilized
to train the model. Since the dataset only provided
intact temple images (target images), corresponding
masked images were generated to simulate structural
damage. To create realistic masked images, methods
such as adding inconsistent or random patches were
avoided, as they do not accurately represent true dis-
tortions. Instead, background patches were overlaid
on temple regions, effectively mimicking structural
damage or fragmentation. The proposed approach en-
sured that the masked images closely resembled real-
world scenarios of temple degradation.
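A simplified sketch of this masking procedure is shown below; the patch coordinates are supplied by the caller, since the exact region-selection rule used for the dataset is not specified.

```python
import numpy as np

def mask_with_background(image: np.ndarray, bg_box, dst_box) -> np.ndarray:
    """Simulate structural damage by copying a background patch
    (e.g., sky or foliage) over a temple region. bg_box is
    (top, left, height, width); dst_box is (top, left). The
    dataset's actual region selection is not specified here."""
    out = image.copy()
    bt, bl, h, w = bg_box
    dt, dl = dst_box
    out[dt:dt + h, dl:dl + w] = image[bt:bt + h, bl:bl + w]
    return out

# Example: overlay a 64x64 background patch from the top-left corner
# onto the centre of a 256x256 image.
img = np.zeros((256, 256, 3), dtype=np.uint8)
masked = mask_with_background(img, bg_box=(0, 0, 64, 64), dst_box=(96, 96))
```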
Figure 2: Dataset of damaged archaeological sites.
4.2 Evaluation
After preparing the dataset of masked images, the
model was trained using a UNet-based GAN archi-
tecture with a PatchGAN discriminator. During train-
ing, the U-Net generator focused on reconstructing
the missing parts of the masked images, while the
PatchGAN discriminator ensured the realism of these
reconstructions. Once training was complete, the gen-
erator was used to produce the final reconstructed
images, effectively restoring the damaged regions as
shown in Figure 3.
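One plausible training iteration combining the preceding sketches is given below; the alternation of one discriminator step and one generator step per batch is an assumption, as the exact schedule is not detailed here.

```python
import torch
import torch.nn.functional as F

def train_step(G, D, opt_G, opt_D, masked, target,
               lambda_rec=100.0, lambda_adv=1.0):
    """One GAN iteration: a discriminator update followed by a
    generator update, using the losses of Equations 1-9."""
    # --- Discriminator update (Eqs. 1-5) ---
    opt_D.zero_grad()
    d_real = D(target)
    fake = G(masked)
    d_fake = D(fake.detach())  # detach: only D's weights change here
    loss_D = 0.5 * (F.binary_cross_entropy(d_real, torch.ones_like(d_real))
                    + F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))
    loss_D.backward()
    opt_D.step()
    # --- Generator update (Eqs. 6-9) ---
    opt_G.zero_grad()
    d_fake = D(fake)
    loss_G = (lambda_adv * F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
              + lambda_rec * F.l1_loss(fake, target))
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()
```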
Figure 3: Examples of masked images (left column) and their corresponding reconstructed images (right column). The SSIM scores are as follows: (b): 0.7538, (d): 0.8235, (f): 0.8012.
The performance of the proposed UNetGAN
model is evaluated using the Structural Similarity In-
dex Measure (SSIM) to assess the quality of the re-
constructed images. SSIM evaluates the similarity be-
tween the reconstructed image and the ground truth,
reflecting the model’s effectiveness in restoring visual
details. For the provided archaeological images, the
model achieved an overall SSIM score of 0.7128. The
score demonstrates the model’s capability to restore
omitted architectural details while successfully cap-
turing the structural and textural integrity of the orig-
inal images, aligning closely with the ground truth.
Upon visual inspection, the reconstructed images
exhibit a high level of detail preservation, effectively
capturing the intricate features of the original archi-
tectural forms. However, the analysis highlights the
influence of the quality and diversity of the train-
ing data on the model’s performance. Expanding
the dataset with a greater variety of high-resolution
ground truth images could further enhance the accu-
racy and reliability of the reconstructions.
Having established the model’s overall perfor-
mance through the SSIM, we now delve into spe-
cific examples of reconstructed images and their cor-
responding SSIM values in Figure 3. The first ex-
ample is the Kappachenikeshwara Temple in Hassan,
which achieved an SSIM of 0.7581 as shown in Figure
3(b), indicating significant reconstruction accuracy
while preserving intricate architectural details. Next,
the Veerabhadreshwara temple, Hangal recorded an
SSIM of 0.8589 as shown in Figure 3(d) reflecting
the model’s ability to capture the structural and textu-
ral intricacies of this iconic monument. Further, Haz-
ararama temple, Hampi attained an SSIM of 0.8661 as
shown in Figure 3(f), showcasing exceptional fidelity
in reconstructing the missing portions while closely
aligning with the ground truth. These examples high-
light the model’s capability to effectively restore ar-
chaeological images, emphasizing the critical role of
diverse and high-quality datasets in improving recon-
struction accuracy and reliability.
5 CONCLUSION AND FUTURE
WORK
Deep learning algorithms hold enormous potential for preserving cultural property by reconstructing damaged archaeological images, especially those of temples. The study demonstrated the effectiveness of integrating a U-Net architecture with a GAN framework, allowing for high-quality restoration while preserving the originals' fine architectural elements and aesthetic integrity. The proposed model outperforms conventional techniques in achieving robust and realistic image in-painting by utilizing adversarial training and a reconstruction loss. Additionally, using a sophisticated evaluation measure like SSIM ensures thorough quality assessment, which strengthens the model's dependability. The proposed model achieved an SSIM score of 0.7128. To further improve reconstruction capabilities for varied cultural artifacts, future research can investigate improving training efficiency and combining multi-modal data.
Future work can include the advancement of
model architectures and techniques to capture finer
details and complex patterns, alongside improved
data preprocessing and augmentation strategies for
robust results. Expanding to diverse datasets and in-
tegrating domain-specific knowledge could enhance
contextual accuracy and generalization. Additionally,
smoothing the blurred areas after reconstruction can
improve results.
REFERENCES
Chen, L. and Zhao, Y. (2021). Stabilizing GAN training for image reconstruction. Neural Networks, 142:15–25.
Chen, W. and Gupta, S. (2023). Ensuring texture con-
tinuity in archaeological image restoration. Jour-
nal of Visual Computing for Cultural Heritage,
18:214–230.
Chen, W. and Smith, G. (2022). Generative ad-
versarial networks for cultural heritage restora-
tion. IEEE Transactions on Neural Networks
and Learning Systems, 32(12):1245–1260.
El Masri, R. and Rakha, T. (2021). Historic built en-
vironment assessment and management by deep
learning techniques: A scoping review. Journal
of Cultural Heritage, 49:236–247.
Feng Cai, Jingxu Peng, P. Z. (2024). Proceedings of the 2023 International Conference on Data Science, Advanced Algorithm and Intelligent Computing (DAI 2023). 180.
Gupta, S. and Lee, M.-h. (2023). Optimizing U-Net for image restoration in archaeological research. Pattern Recognition Letters, 145:112–123.
Hassan, A. and Sharma, P. (2022). Digital preser-
vation of cultural heritage through image recon-
struction. Digital Heritage Review, 28(4):289–
305.
Huang, X. and Zhang, M. (2021). Quantitative eval-
uation of image restoration models. Machine
Learning Applications, 13:123–140.
Isola, P., Zhu, J.-Y., Zhou, T., and Efros, A. A. (2018). Image-to-image translation with conditional adversarial networks. arXiv preprint arXiv:1611.07004.
Jones, R. and Alvi, N. (2022). Age-related degrada-
tion of archaeological structures and its restora-
tion. Heritage Science, 10:35–49.
Kulkarni, U., Chikkamath, S., Mirajkar, J. S., Hittalmakki, Y., Thota, V., and Khan, F. (2023). Image inpainting on archeological dataset using UNet architecture on embedded platform. In International Conference on Recent Trends in Machine Learning, IOT, Smart Cities & Applications, pages 353–365. Springer.
Kumar, R. and Wang, L. (2020). Deep learning for ar-
chaeological image reconstruction: Applications
and challenges. Artificial Intelligence in Archae-
ology, 34(3):245–267.
Kwabena Patrick, M., Felix Adekoya, A.,
Abra Mighty, A., and Edward, B. Y. (2022).
Capsule networks–a survey.
Lee, J. and Kim, S. (2022). Evaluation metrics for ai-
driven image restoration techniques. Computer
Graphics Forum, 41:412–425.
Lee, M.-h. and Gupta, S. (2023). Convolutional
neural networks for structural fidelity in image
restoration. International Journal of Machine Vi-
sion, 12(8):145–157.
Liu, J. and Sun, H. (2022). Challenges and advances in GAN-based image reconstruction. Computer Vision and Pattern Recognition Letters, 18:330–345.
Mishra, A. (2021). Heritage preservation using deep
learning models for archaeological site restora-
tion. Advances in Science, Technology and En-
gineering Systems Journal, 6(4):1121–1126.
Nguyen, T. H. and Tran, P. D. (2022). Advanced CNN techniques for cultural heritage preservation. Journal of Computer Vision and Applications, 12:156–172.
Park, J.-h. and Wang, L. (2021). Artificial intelligence
for preserving heritage sites: A comprehensive
review. Cultural Heritage Journal, 22:101–120.
Patel, A. and Gupta, N. (2022). Restoration of corrupted images using AI techniques. Journal of Artificial Intelligence Research, 60:89–102.
Schonfeld, E., Schiele, B., and Khoreva, A. (2020). A U-Net based discriminator for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8207–8216.
Sharma, P. and Hassan, A. (2022). Impact of biolog-
ical factors on cultural heritage sites. Journal of
Environmental Biology, 38(5):275–283.
Shen, L. and Li, Q. (2021). Adversarial loss optimization in GANs for image reconstruction. Pattern Recognition Letters, 145:67–75.
Singh, R. and Verma, P. (2021). Optimization tech-
niques in deep learning models for restoration.
Journal of Machine Learning Research, 22:201–
218.
Smith, J. and Patel, A. (2021). Cultural heritage
restoration: Challenges and priorities. Journal
of Heritage Conservation, 45(2):123–134.
Wang, H. and Tang, X. (2021). Skip connections in
u-net for enhanced detail preservation. Applied
Intelligence, 51:2924–2938.
Zhang, W. and Luo, T. (2022). Combining U-Net and GAN for cultural heritage restoration. Heritage Science, 10:45–56.
Zhao, F., Ren, H., Sun, K., and Zhu, X. (2024). GAN-based heterogeneous network for ancient mural restoration. Heritage Science, 12(1):418.
Zhou, Y., Zhang, W., and Wang, L. (2021). Improved
deep learning architectures for image restora-
tion. IEEE Transactions on Image Processing,
30:2345–2356.
Zuo, H. and Tiddeman, B. (2024). A U-Net architecture for inpainting lightstage normal maps. Computers, 13(2).