
Shuffle Mixing: An Efficient Alternative to Self Attention

Authors: Ryouichi Furukawa and Kazuhiro Hotta

Affiliation: Meijo University, 1-501 Shiogamaguchi, Tempaku-ku, Nagoya 468-8502, Japan

Keyword(s): Transformer, Self Attention, Depth Wise Convolution, Shift Operation.

Abstract: In this paper, we propose ShuffleFormer, which replaces the Transformer's Self Attention with the proposed shuffle mixing. ShuffleFormer can be flexibly incorporated as a backbone for conventional visual recognition, dense prediction, and other tasks. Self Attention learns globally and dynamically, whereas shuffle mixing employs Depth Wise Convolution to learn locally and statically. Because Depth Wise Convolution applies a convolution to each channel individually, it does not consider the relationships between channels. Shuffle mixing therefore obtains information from different channels, without changing the computational cost, by inserting a shift operation that moves channel-direction components in the spatial direction. However, with the shift operation, the amount of spatial information obtained is less than that of plain Depth Wise Convolution. ShuffleFormer compensates for this by reducing resolution with overlapped patch embedding, whose kernel is larger than its stride, thereby extracting more features in the spatial direction. We evaluated ShuffleFormer on ImageNet-1K image classification and ADE20K semantic segmentation, where it outperforms Swin Transformer. In particular, ShuffleFormer-Base/Light surpasses Swin-Base in accuracy at about two-thirds of the computational cost.
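The core idea in the abstract (a spatial shift applied to channel-direction components, followed by Depth Wise Convolution) can be sketched in a few lines of NumPy. This is an illustrative reading, not the paper's exact design: the four-direction channel grouping, the 3x3 kernel size, and zero padding are all assumptions borrowed from common shift-operation implementations, and both function names are hypothetical.

```python
import numpy as np

def spatial_shift(x):
    """Shift channel groups in four spatial directions with zero fill.
    x: (C, H, W). The four-way group assignment is an assumption
    for illustration, not necessarily ShuffleFormer's exact scheme."""
    out = np.zeros_like(x)
    c = x.shape[0] // 4
    out[:c, :, 1:] = x[:c, :, :-1]            # group 0: shift right
    out[c:2*c, :, :-1] = x[c:2*c, :, 1:]      # group 1: shift left
    out[2*c:3*c, 1:, :] = x[2*c:3*c, :-1, :]  # group 2: shift down
    out[3*c:, :-1, :] = x[3*c:, 1:, :]        # group 3: shift up
    return out

def depthwise_conv(x, kernels):
    """Per-channel 3x3 convolution (stride 1, zero padding).
    x: (C, H, W); kernels: (C, 3, 3). Each channel is convolved
    with its own kernel only -- no channel mixing by itself."""
    C, H, W = x.shape
    pad = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for ch in range(C):
        for i in range(H):
            for j in range(W):
                out[ch, i, j] = np.sum(pad[ch, i:i + 3, j:j + 3] * kernels[ch])
    return out

def shuffle_mixing(x, kernels):
    """Shift, then Depth Wise Convolution. The shift costs no
    multiply-adds, so the total cost matches plain depthwise conv."""
    return depthwise_conv(spatial_shift(x), kernels)
```

The shift itself is free of arithmetic (it is pure indexing), which is why the abstract can claim cross-channel information "without changing the computational cost"; the trade-off is that each shifted channel loses one row or column of spatial content at the border, which the overlapped patch embedding is said to compensate for.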

CC BY-NC-ND 4.0


Paper citation in several formats:
Furukawa, R. and Hotta, K. (2023). Shuffle Mixing: An Efficient Alternative to Self Attention. In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 5: VISAPP; ISBN 978-989-758-634-7; ISSN 2184-4321, SciTePress, pages 700-707. DOI: 10.5220/0011720200003417

@conference{visapp23,
author={Ryouichi Furukawa and Kazuhiro Hotta},
title={Shuffle Mixing: An Efficient Alternative to Self Attention},
booktitle={Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 5: VISAPP},
year={2023},
pages={700-707},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011720200003417},
isbn={978-989-758-634-7},
issn={2184-4321},
}

TY - CONF

JO - Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 5: VISAPP
TI - Shuffle Mixing: An Efficient Alternative to Self Attention
SN - 978-989-758-634-7
IS - 2184-4321
AU - Furukawa, R.
AU - Hotta, K.
PY - 2023
SP - 700
EP - 707
DO - 10.5220/0011720200003417
PB - SciTePress
ER -