Abstract
Unsupervised machine translation (UMT) has recently attracted growing attention from researchers because it enables translation between languages that lack parallel corpora. However, existing work mainly considers close language pairs (e.g., English-German and English-French), and the effectiveness of visual content for distant language pairs has yet to be investigated. This article proposes an unsupervised multimodal machine translation model for low-resource distant language pairs. Specifically, we first apply transliteration and re-ordering to bring distant language pairs closer together. We then extend masked language modeling with visual content, yielding a visual masked language modeling objective for UMT. Finally, we conduct experiments on our distant language pair dataset and the public Multi30k dataset. The results show that our model outperforms existing models, with BLEU improvements of 2.5 and 2.6 on the distant language pairs English-Uyghur and Chinese-Uyghur, respectively. Our model also performs well on close language pairs, improving by 2.3 BLEU over existing models on English-German.
Index Terms
- Unsupervised Multimodal Machine Translation for Low-resource Distant Language Pairs