3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset

Status	Published
Title	3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset
Author	Ma, Xinyu 1; Liu, Xuebo 2; Wong, Derek F. 1; Rao, Jun 2; Li, Bei 3; Ding, Liang 4; Chao, Lidia S. 1; Tao, Dacheng 4; Zhang, Min 2
Year	2024
Conference Name	LREC-COLING 2024 - Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation
Source Publication	2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings
Pages	1-13
Conference Date	20-25 May, 2024
Conference Place	Hybrid, Torino
Country	Italy
Publisher	European Language Resources Association (ELRA)
Abstract	Multimodal machine translation (MMT) is a challenging task that seeks to improve translation quality by incorporating visual information. However, recent studies have indicated that the visual information provided by existing MMT datasets is insufficient, causing models to disregard it and overestimate their capabilities. This issue presents a significant obstacle to the development of MMT research. This paper presents a novel solution to this issue by introducing 3AM, an ambiguity-aware MMT dataset comprising 26,000 parallel sentence pairs in English and Chinese, each with corresponding images. Our dataset is specifically designed to include more ambiguity and a greater variety of both captions and images than other MMT datasets. We utilize a word sense disambiguation model to select ambiguous data from vision-and-language datasets, resulting in a more challenging dataset. We further benchmark several state-of-the-art MMT models on our proposed dataset. Experimental results show that MMT models trained on our dataset exhibit a greater ability to exploit visual information than those trained on other MMT datasets. Our work provides a valuable resource for researchers in the field of multimodal learning and encourages further exploration in this area. The data, code and scripts are freely available at https://github.com/MaxyLee/3AM.
Keyword	Multimodal Datasets; Multimodal Machine Translation
Language	English
Scopus ID	2-s2.0-85195953672
Document Type	Conference paper
Collection	Faculty of Science and Technology; Department of Computer and Information Science
Corresponding Author	Liu, Xuebo; Wong, Derek F.
Affiliation	1. NLP2CT Lab, Department of Computer and Information Science, University of Macau, Macao; 2. Institute of Computing and Intelligence, Harbin Institute of Technology, Shenzhen, China; 3. Northeastern University, Shenyang, China; 4. The University of Sydney, Sydney, Australia
First Author Affiliation	University of Macau
Corresponding Author Affiliation	University of Macau
Recommended Citation GB/T 7714	Ma, Xinyu, Liu, Xuebo, Wong, Derek F., et al. 3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset[C]: European Language Resources Association (ELRA), 2024, 1-13.
APA	Ma, Xinyu, Liu, Xuebo, Wong, Derek F., Rao, Jun, Li, Bei, Ding, Liang, Chao, Lidia S., Tao, Dacheng, & Zhang, Min (2024). 3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset. 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings, 1-13.