erformance.
4 CONCLUSIONS
As an important part of the computer vision field, image style transfer technology has made significant progress and demonstrated powerful creativity across several application scenarios. By systematically reviewing the current mainstream style transfer models, including AdaAttN, CAST, StyTr2, StyleID, and StyleShot, this paper comprehensively analyzes the design ideas, methodological innovations, and development stages of each model. The comparison shows that these models have evolved from supporting only a single artistic style to generating images in arbitrary styles. To compare the models, this paper introduces the CLIP score and the ArtFID score as the key evaluation metrics, enabling quantitative analysis of model performance on complex style transfer tasks. The advantages and limitations of the different models in handling diverse styles, as well as in preserving content and style fidelity, are thereby revealed. Among them, the StyleShot and StyleID models are strong contenders in the field: both can generate high-quality images in arbitrary styles. By exploring these models in depth, this paper not only clarifies the challenges facing the current technology but also points out application scenarios that merit future research. Finally, it is hoped that this work will contribute to the advancement of computer vision and prompt further focused discussion.
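For the reader's convenience, the two evaluation metrics used above can be stated explicitly, following the definitions in the cited Hessel et al. (2021) and Wright and Ommer (2022) papers (the notation here is ours):

```latex
% CLIPScore (Hessel et al., 2021): rescaled cosine similarity between
% the CLIP embeddings c and v of the two items being compared,
% with rescaling constant w = 2.5; higher is better.
\mathrm{CLIPScore}(\mathbf{c}, \mathbf{v})
  = w \cdot \max\bigl(\cos(\mathbf{c}, \mathbf{v}),\, 0\bigr)

% ArtFID (Wright & Ommer, 2022): combines content fidelity (LPIPS,
% between content input and stylized output) with style fidelity
% (FID, between style references and stylized outputs); lower is better.
\mathrm{ArtFID} = \bigl(1 + \mathrm{LPIPS}\bigr)\cdot\bigl(1 + \mathrm{FID}\bigr)
```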
REFERENCES
Chung, J., Hyun, S. & Heo, J., 2023. Style Injection in
Diffusion: A Training-free Approach for Adapting
Large-scale Diffusion Models for Style Transfer.
ArXiv, abs/2312.09008.
Deng, Y., Tang, F., Dong, W., Ma, C., Pan, X., Wang, L. &
Xu, C., 2021. StyTr2: Image Style Transfer with
Transformers. Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern
Recognition (CVPR), pp. 11316-11326.
Gao, J., Liu, Y., Sun, Y., Tang, Y., Zeng, Y., Chen, K. &
Zhao, C., 2024. StyleShot: A Snapshot on Any Style.
ArXiv, abs/2407.01414.
Gatys, L.A., Bethge, M., Hertzmann, A. & Shechtman, E.,
2016. Preserving Color in Neural Artistic Style
Transfer. ArXiv, abs/1606.05897.
Hessel, J., Holtzman, A., Forbes, M., Le Bras, R. & Choi,
Y., 2021. CLIPScore: A Reference-free Evaluation
Metric for Image Captioning. ArXiv, abs/2104.08718.
Huang, X. & Belongie, S.J., 2017. Arbitrary Style Transfer
in Real-Time with Adaptive Instance Normalization.
Proceedings of the IEEE International Conference on
Computer Vision (ICCV), pp. 1510-1519.
Liu, G., Xia, M., Zhang, Y., Chen, H., Xing, J., Wang, X.,
Yang, Y. & Shan, Y., 2023. StyleCrafter: Enhancing
Stylized Text-to-Video Generation with Style Adapter.
ArXiv, abs/2312.00330.
Liu, S., Lin, T., He, D., Li, F., Wang, M., Li, X., Sun, Z.,
Li, Q. & Ding, E., 2021. AdaAttN: Revisit Attention
Mechanism in Arbitrary Neural Style Transfer.
Proceedings of the IEEE/CVF International Conference
on Computer Vision (ICCV), pp. 6629-6638.
Ngweta, L., Maity, S., Gittens, A., Sun, Y. & Yurochkin,
M., 2023. Simple Disentanglement of Style and
Content in Visual Representations. ArXiv,
abs/2302.09795.
Rombach, R., Blattmann, A., Lorenz, D., Esser, P. &
Ommer, B., 2021. High-Resolution Image Synthesis
with Latent Diffusion Models. Proceedings of the
IEEE/CVF Conference on Computer Vision and
Pattern Recognition (CVPR), pp. 10674-10685.
Strudel, R., Garcia Pinel, R., Laptev, I. & Schmid, C., 2021.
Segmenter: Transformer for Semantic Segmentation.
Proceedings of the IEEE/CVF International Conference
on Computer Vision (ICCV), pp. 7242-7252.
Wright, M. & Ommer, B., 2022. ArtFID: Quantitative
Evaluation of Neural Style Transfer. Proceedings of the
German Conference on Pattern Recognition (GCPR).
Xin, H.T. & Li, L., 2023. Arbitrary Style Transfer with
Fused Convolutional Block Attention Modules. IEEE
Access, 11, pp. 44977-44988.
Zhang, Y., Tang, F., Dong, W., Huang, H., Ma, C., Lee, T.
& Xu, C., 2022. Domain Enhanced Arbitrary Image
Style Transfer via Contrastive Learning. ACM
SIGGRAPH 2022 Conference Proceedings.