Image Inpainting on the Sketch-Pencil Domain with Vision Transformers

Jose Campana; Luís Decker; Marcos Souza; Helena Maia; Helio Pedrini

doi:10.5220/0012363500003660

2024

Abstract

Image inpainting aims to realistically fill missing regions in images, which requires both structural and textural understanding. Traditionally, methods in the literature have employed Convolutional Neural Networks (CNNs), especially Generative Adversarial Networks (GANs), to restore missing regions in a coherent and reliable manner. However, the limited receptive fields of CNNs can sometimes produce unreliable outcomes, because the network cannot capture the broader context of the image. Transformer-based models, on the other hand, can learn long-range dependencies through self-attention mechanisms. To generate more consistent results, some approaches have further incorporated auxiliary information that guides the model’s understanding of structure. In this work, we propose a new method for image inpainting that uses sketch-pencil information to guide the restoration of structural as well as textural elements. Unlike previous works that employ edges, lines, or segmentation maps, we leverage the sketch-pencil domain together with the ability of Transformers to learn long-range dependencies, so that structural and textural information is properly matched and the restored regions are more consistent. Experimental results show the effectiveness of our approach, demonstrating either superior or competitive performance when compared to existing methods, especially in scenarios involving complex images and large missing areas.

Paper Citation

in Harvard Style

Campana J., Decker L., Souza M., Maia H. and Pedrini H. (2024). Image Inpainting on the Sketch-Pencil Domain with Vision Transformers. In Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: VISAPP; ISBN 978-989-758-679-8, SciTePress, pages 122-132. DOI: 10.5220/0012363500003660

in Bibtex Style

@conference{visapp24,
author={Jose Campana and Luís Decker and Marcos Souza and Helena Maia and Helio Pedrini},
title={Image Inpainting on the Sketch-Pencil Domain with Vision Transformers},
booktitle={Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: VISAPP},
year={2024},
pages={122-132},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012363500003660},
isbn={978-989-758-679-8},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: VISAPP
TI - Image Inpainting on the Sketch-Pencil Domain with Vision Transformers
SN - 978-989-758-679-8
AU - Campana J.
AU - Decker L.
AU - Souza M.
AU - Maia H.
AU - Pedrini H.
PY - 2024
SP - 122
EP - 132
DO - 10.5220/0012363500003660
PB - SciTePress