To address the insufficient robustness and generalization of unsupervised neural machine translation when the source and target languages differ significantly in grammatical structure and vocabulary, an unsupervised neural machine translation method based on an adaptive noise insertion strategy is proposed. By analyzing the grammatical structure, lexical complexity, and syntactic differences between the source and target languages, the method dynamically adjusts the position and intensity of noise insertion, adapting to the complexity of different language pairs. For simpler sentences, less noise is inserted to preserve core semantics; for more complex sentences, more intricate noise is introduced to strengthen the model's ability to learn complex language structures. This allows the model to retain important semantic information while improving its generalization ability and robustness. Experimental results show that, compared with the baseline model, the proposed method significantly improves the bilingual evaluation understudy (BLEU) score on eight benchmark translation tasks.
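How such an adaptive scheme might operate can be pictured with a minimal sketch, assuming a length-based complexity proxy and the standard word-drop and local-shuffle noise operations; the function names, thresholds, and schedule below are illustrative assumptions, not the paper's exact implementation:

```python
import random

def complexity_score(tokens):
    """Proxy for sentence complexity: normalized length (assumption).
    A fuller system could also weigh lexical rarity and parse depth."""
    return min(len(tokens) / 50.0, 1.0)

def adaptive_noise(tokens, base_drop=0.05, base_shift=1):
    """Insert noise whose intensity scales with sentence complexity:
    simple sentences get light noise (preserving core semantics),
    complex sentences get heavier word dropping and local shuffling."""
    c = complexity_score(tokens)
    drop_prob = base_drop + 0.10 * c       # word-drop probability grows with complexity
    max_shift = base_shift + int(2 * c)    # allowed local reordering distance
    # word dropping: keep each token with probability 1 - drop_prob
    kept = [t for t in tokens if random.random() > drop_prob] or tokens[:1]
    # local shuffling: each token may move at most max_shift positions
    keys = [i + random.uniform(0, max_shift) for i in range(len(kept))]
    return [t for _, t in sorted(zip(keys, kept))]

print(adaptive_noise("the quick brown fox jumps over the lazy dog".split()))
```

The word-drop and bounded-shuffle operations follow the common denoising recipe used in unsupervised NMT; the adaptive element is that both the drop probability and the shuffle distance are functions of the complexity score rather than fixed hyperparameters.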
The overall training objective combines the two tasks:

$$\mathcal{L} = \lambda_1 \mathcal{L}_{\mathrm{MLM}} + \lambda_2 \mathcal{L}_{\mathrm{UNMT}}$$

where $\mathcal{L}_{\mathrm{MLM}}$ denotes the loss of the masked language model (MLM) after adaptive noise insertion, measuring the model's ability to recover the original sentence under noise; $\mathcal{L}_{\mathrm{UNMT}}$ denotes the loss of the unsupervised neural machine translation task, measuring translation quality from the source language to the target language; and $\lambda_1$ and $\lambda_2$ are dynamically adjusted weight coefficients that balance the relative importance of the noise task and the translation task during training, ensuring that the model can better optimize the different tasks at different stages.
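A minimal sketch of how this dynamically weighted objective could be computed is given below; the linear weight schedule and function name are illustrative assumptions, since the text only states that the coefficients are adjusted across training stages:

```python
def combined_loss(loss_mlm, loss_unmt, step, total_steps):
    """Total objective L = lambda1 * L_MLM + lambda2 * L_UNMT with
    dynamically adjusted weights. The linear schedule below is an
    assumption: early training emphasizes the denoising (MLM) task,
    and weight gradually shifts toward the translation task."""
    progress = step / total_steps
    lam1 = 1.0 - 0.5 * progress   # weight on the noise/MLM task decays
    lam2 = 0.5 + 0.5 * progress   # weight on the translation task grows
    return lam1 * loss_mlm + lam2 * loss_unmt

# usage (loss values may be floats or framework tensors):
# total = combined_loss(mlm_loss, unmt_loss, step=1000, total_steps=100000)
```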