Residential College: false
Status: Published
Data selection via semi-supervised recursive autoencoders for SMT domain adaptation
Lu Y.; Wong D.F.; Chao L.S.; Wang L.
2014
Conference Name: 10th China Workshop on Machine Translation (CWMT)
Source Publication: MACHINE TRANSLATION, CWMT 2014
Volume: 493
Pages: 13-23
Conference Date: November 4-6, 2014
Conference Place: Macau, People's Republic of China
Abstract

In this paper, we present a novel data selection approach based on semi-supervised recursive autoencoders. The model is trained to capture domain-specific features and is used to detect sentences relevant to a specific domain in a large general-domain corpus. The selected data are used to adapt the language model and translation model to the target domain. Experiments were conducted on an in-domain corpus (IWSLT2014 Chinese-English TED Talk) and a general-domain corpus (UM-Corpus). We evaluated the proposed data selection model both intrinsically and extrinsically, measuring the selection success rate (F-score) on pseudo data as well as the translation quality (BLEU score) of the adapted SMT systems. Empirical results show that the proposed approach outperforms the state-of-the-art selection approach.

Keywords: Data Selection; Domain Adaptation; Recursive Autoencoders; Semi-supervised; Statistical Machine Translation
DOI: 10.1007/978-3-662-45701-6_2
Indexed By: SCIE
Language: English
WOS Research Area: Computer Science
WOS Subject: Computer Science, Artificial Intelligence; Computer Science, Theory & Methods
WOS ID: WOS:000357580100002
Scopus ID: 2-s2.0-84914809910
Document Type: Conference paper
Collection: DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE
Affiliation: Universidade de Macau
First Author Affiliation: University of Macau
Recommended Citation
GB/T 7714: Lu Y., Wong D.F., Chao L.S., et al. Data selection via semi-supervised recursive autoencoders for SMT domain adaptation[C], 2014, 13-23.
APA: Lu Y., Wong D.F., Chao L.S., & Wang L. (2014). Data selection via semi-supervised recursive autoencoders for SMT domain adaptation. MACHINE TRANSLATION, CWMT 2014, 493, 13-23.