4 CONCLUSIONS
Driven by rapid progress in natural language processing and computer vision, this article has reviewed text-to-image (T2I) methods based on generative adversarial networks. According to the different requirements of T2I generation, GAN-based methods are divided into three categories by function: improving content authenticity, enhancing semantic correlation, and promoting content diversity. As the data in the charts show, the performance of image generation techniques continues to improve effectively.
While the quality, consistency, and semantics of generated images have all improved significantly with present techniques, many difficulties remain, and applications still need to be expanded. In terms of content authenticity, many application scenarios, such as interactive game image construction and medical image analysis, require the generation of fine and realistic images. In terms of semantic correlation, T2I generation technology can improve the efficiency of scene retrieval and strengthen the ability of artificial intelligence to understand text through interaction, and it therefore has strong theoretical research value. For example, using text to generate video is an important future research direction, but more text-video evaluation methods still need to be explored.
In terms of content diversity, diversified outputs in the fields of art and design help inspire creators and promote the formation of creativity. In the field of human-computer interaction, T2I generation can also be incorporated: for example, entering simple text to generate a semantically rich image improves the machine's capacity for understanding, endowing artificial intelligence with semantic "imagination" and "creativity" and providing an effective means of studying deep learning. It is hoped that this article will help researchers understand the cutting-edge technologies in this field and provide a reference for further research.