SWViT-RRDB: Shifted Window Vision Transformer Integrating Residual in Residual Dense Block for Remote Sensing Super-Resolution

Mohamed Ibrahim; Robert Benavente; Daniel Ponsa; Felipe Lumbreras

doi:10.5220/0012399300003660

2024

Abstract

Remote sensing applications, impacted by acquisition season and sensor variety, require high-resolution images. Transformer-based models improve satellite image super-resolution but are less effective than convolutional neural networks (CNNs) at extracting local details, crucial for image clarity. This paper introduces SWViT-RRDB, a new deep learning model for satellite imagery super-resolution. The SWViT-RRDB, combining transformer with convolution and attention blocks, overcomes the limitations of existing models by better representing small objects in satellite images. In this model, a pipeline of residual fusion group (RFG) blocks is used to combine the multi-headed self-attention (MSA) with residual in residual dense block (RRDB). This combines global and local image data for better super-resolution. Additionally, an overlapping cross-attention block (OCAB) is used to enhance fusion and allow interaction between neighboring pixels to maintain long-range pixel dependencies across the image. The SWViT-RRDB model and its larger variants outperform state-of-the-art (SoTA) models on two different satellite datasets in terms of PSNR and SSIM.
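The abstract reports results in terms of PSNR and SSIM. For readers unfamiliar with the first metric, the following is a minimal sketch of the standard PSNR definition, 10 * log10(MAX^2 / MSE), in plain Python. It is not the authors' evaluation code: the function name `psnr`, the nested-list image representation, and the 8-bit default `max_value=255.0` are assumptions made for illustration only.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are nested lists of pixel intensities (an assumption for this
    sketch); max_value is the largest possible intensity (255 for 8-bit).
    """
    # Flatten both images into one-dimensional pixel lists.
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    if len(ref) != len(est) or not ref:
        raise ValueError("images must be non-empty and the same size")

    # Mean squared error between corresponding pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded

    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the super-resolved image is closer to the high-resolution reference; SSIM, the second metric, compares local structure rather than per-pixel error and is not covered by this sketch.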

Paper Citation

in Harvard Style

Ibrahim M., Benavente R., Ponsa D. and Lumbreras F. (2024). SWViT-RRDB: Shifted Window Vision Transformer Integrating Residual in Residual Dense Block for Remote Sensing Super-Resolution. In Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: VISAPP; ISBN 978-989-758-679-8, SciTePress, pages 575-582. DOI: 10.5220/0012399300003660

in Bibtex Style

@conference{visapp24,
author={Mohamed Ibrahim and Robert Benavente and Daniel Ponsa and Felipe Lumbreras},
title={SWViT-RRDB: Shifted Window Vision Transformer Integrating Residual in Residual Dense Block for Remote Sensing Super-Resolution},
booktitle={Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: VISAPP},
year={2024},
pages={575--582},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012399300003660},
isbn={978-989-758-679-8},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: VISAPP
TI - SWViT-RRDB: Shifted Window Vision Transformer Integrating Residual in Residual Dense Block for Remote Sensing Super-Resolution
SN - 978-989-758-679-8
AU - Ibrahim M.
AU - Benavente R.
AU - Ponsa D.
AU - Lumbreras F.
PY - 2024
SP - 575
EP - 582
DO - 10.5220/0012399300003660
PB - SciTePress