Residential College | false |
Status | Published |
Co-regularizing character-based and word-based models for semi-supervised Chinese word segmentation | |
Zeng X.2; Wong D.F.2; Chao L.S.2; Trancoso I.1 | |
2013 | |
Conference Name | the 51st Annual Meeting of the Association for Computational Linguistics |
Source Publication | Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) |
Volume | 2 |
Pages | 171-176 |
Conference Date | August 2013 |
Conference Place | Sofia, Bulgaria |
Abstract | This paper presents a semi-supervised Chinese word segmentation (CWS) approach that co-regularizes character-based and word-based models. Similarly to multi-view learning, the "segmentation agreements" between the two different types of view are used to overcome the scarcity of the label information on unlabeled data. The proposed approach trains a character-based and word-based model on labeled data, respectively, as the initial models. Then, the two models are constantly updated using unlabeled examples, where the learning objective is maximizing their segmentation agreements. The agreements are regarded as a set of valuable constraints for regularizing the learning of both models on unlabeled data. The segmentation for an input sentence is decoded by using a joint scoring function combining the two induced models. The evaluation on the Chinese tree bank reveals that our model results in better gains over the state-of-the-art semi-supervised models reported in the literature. © 2013 Association for Computational Linguistics. |
Indexed By | Other |
Language | English |
Fulltext Access | |
Document Type | Conference paper |
Collection | DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE |
Affiliation | 1.Instituto Superior Técnico 2.Universidade de Macau |
First Author Affiliation | University of Macau |
Recommended Citation GB/T 7714 | Zeng X., Wong D.F., Chao L.S., et al. Co-regularizing character-based and word-based models for semi-supervised Chinese word segmentation[C], 2013, 171-176. |
APA | Zeng X., Wong D.F., Chao L.S., & Trancoso I. (2013). Co-regularizing character-based and word-based models for semi-supervised Chinese word segmentation. Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2, 171-176. |
Files in This Item: | There are no files associated with this item. |