were introduced: the RNN-based encoder-decoder, its LSTM and GRU variants, the attention mechanism, and the Transformer. Comparisons among these models on public test sets under different conditions were then made to discuss the problems they solved and the challenges they faced. The review found that the traditional RNN architecture struggled with translating long sentences, while integrating the attention mechanism into the RNN encoder-decoder architecture handled long-term dependencies better and showed improved performance on long sentences.
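To make this mechanism concrete, the following minimal NumPy sketch illustrates additive (Bahdanau-style) attention over RNN encoder states; the weight names (W_dec, W_enc, v_a) and dimensions are illustrative assumptions rather than the exact parameterization of any particular system.

    # Minimal sketch of additive (Bahdanau-style) attention over RNN encoder states.
    # Weight names and shapes are illustrative assumptions only.
    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def additive_attention(decoder_state, encoder_states, W_dec, W_enc, v_a):
        # decoder_state:  (d_dec,)        current decoder hidden state s_{t-1}
        # encoder_states: (T_src, d_enc)  encoder annotations h_1..h_T
        # Score each source position: e_j = v_a^T tanh(W_dec s_{t-1} + W_enc h_j)
        scores = np.array([
            v_a @ np.tanh(W_dec @ decoder_state + W_enc @ h) for h in encoder_states
        ])
        alpha = softmax(scores)               # alignment weights over source positions
        context = alpha @ encoder_states      # context vector: weighted sum of annotations
        return context, alpha

    # Toy example with random parameters
    rng = np.random.default_rng(0)
    d_enc, d_dec, d_att, T_src = 8, 8, 16, 5
    context, alpha = additive_attention(
        rng.normal(size=d_dec),
        rng.normal(size=(T_src, d_enc)),
        rng.normal(size=(d_att, d_dec)),
        rng.normal(size=(d_att, d_enc)),
        rng.normal(size=d_att),
    )
    print(alpha.round(3), context.shape)

At each decoding step the alignment weights alpha indicate which source positions the decoder attends to, which is what relieves the fixed-length representation bottleneck of the plain encoder-decoder and improves long-sentence translation.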
The development of the Transformer model then brought NMT to a new milestone. For the future development of NMT, research on training models for low-resource languages is crucial, given the imbalance in available pretraining data. In addition, developing multilingual NMT models is important for handling low-resource languages. Furthermore, the interpretability and controllability of these models should be improved to provide a better user experience.