Status | Published |
Title | Data selection via semi-supervised recursive autoencoders for SMT domain adaptation |
Author | Lu Y.; Wong D.F.; Chao L.S.; Wang L. |
Year | 2014 |
Conference Name | 10th China Workshop on Machine Translation (CWMT) |
Source Publication | MACHINE TRANSLATION, CWMT 2014 |
Volume | 493 |
Pages | 13-23 |
Conference Date | NOV 04-06, 2014 |
Conference Place | Macau, People's Republic of China |
Abstract | In this paper, we present a novel data selection approach based on semi-supervised recursive autoencoders. The model is trained to capture domain-specific features and is used to detect sentences relevant to a specific domain in a large general-domain corpus. The selected data are used to adapt the language model and translation model to the target domain. Experiments were conducted on an in-domain corpus (IWSLT2014 Chinese-English TED Talk) and a general-domain corpus (UM-Corpus). We evaluated the proposed data selection model both intrinsically and extrinsically, measuring the selection success rate (F-score) on pseudo data as well as the translation quality (BLEU score) of the adapted SMT systems. Empirical results show that the proposed approach outperforms the state-of-the-art selection approach. |
Keyword | Data Selection; Domain Adaptation; Recursive Autoencoders; Semi-supervise; Statistical Machine Translation |
DOI | 10.1007/978-3-662-45701-6_2 |
Indexed By | SCIE |
Language | English |
WOS Research Area | Computer Science |
WOS Subject | Computer Science, Artificial Intelligence ; Computer Science, Theory & Methods |
WOS ID | WOS:000357580100002 |
Scopus ID | 2-s2.0-84914809910 |
Document Type | Conference paper |
Collection | DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE |
Affiliation | Universidade de Macau |
First Author Affiliation | University of Macau |
Recommended Citation GB/T 7714 | Lu Y.,Wong D.F.,Chao L.S.,et al. Data selection via semi-supervised recursive autoencoders for SMT domain adaptation[C], 2014, 13-23. |
APA | Lu Y., Wong D.F., Chao L.S., & Wang L. (2014). Data selection via semi-supervised recursive autoencoders for SMT domain adaptation. MACHINE TRANSLATION, CWMT 2014, 493, 13-23. |
Files in This Item: | There are no files associated with this item. |