Status: Published
A method for web information extraction
Man I. Lam (2); Zhiguo Gong (2); Maybin Muyeba (1)
2008-05-22
Conference Name: 10th Asia-Pacific Web Conference and Workshops
Source Publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4976 LNCS
Pages: 383-394
Conference Date: April 26-28, 2008
Conference Place: Shenyang, People's Republic of China
Abstract

The World Wide Web has become one of the most important information repositories. However, information in web pages follows no presentation standard and is rarely organized in a consistent format, so extracting appropriate and useful information from web pages is challenging. Many web extraction systems, called web wrappers, have been developed, some semi-automatic and some fully automatic. This paper investigates some existing techniques and then presents our current work on web information extraction. In our design, we classify the patterns of information into static and non-static structures and use different techniques to extract the relevant information. In our implementation, patterns are represented as XSL files, and all extracted information is packaged into machine-readable XML.
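
The abstract does not describe the pattern format in detail, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes that a "static" pattern addresses a field at one fixed location and that a "non-static" pattern addresses a record repeated a variable number of times; both readings are assumptions. Because Python's standard library has no XSLT processor, ElementTree path expressions stand in for the XSL pattern files, and the page fragment, pattern tables, and `extract` function are hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, already well-formed page fragment.
# Real-world HTML would usually need cleaning before it parses as XML.
PAGE = """
<html><body>
  <h1 id="title">Laptop catalogue</h1>
  <table id="items">
    <tr><td>Model A</td><td>499</td></tr>
    <tr><td>Model B</td><td>799</td></tr>
  </table>
</body></html>
""".strip()

# Assumed reading of "static": one field at one fixed location.
STATIC_PATTERNS = {"title": ".//h1[@id='title']"}

# Assumed reading of "non-static": a record repeated a variable number of times.
REPEATING_PATTERN = {
    "rows": ".//table[@id='items']/tr",
    "fields": {"name": "td[1]", "price": "td[2]"},
}


def extract(page: str) -> ET.Element:
    """Apply both kinds of pattern and package the matches as one XML element."""
    doc = ET.fromstring(page)
    result = ET.Element("extracted")

    # Fixed-location fields: at most one match per pattern.
    for name, path in STATIC_PATTERNS.items():
        node = doc.find(path)
        if node is not None:
            ET.SubElement(result, name).text = (node.text or "").strip()

    # Repeating records: one <item> per matched row.
    for row in doc.findall(REPEATING_PATTERN["rows"]):
        item = ET.SubElement(result, "item")
        for name, path in REPEATING_PATTERN["fields"].items():
            cell = row.find(path)
            if cell is not None:
                ET.SubElement(item, name).text = (cell.text or "").strip()
    return result


if __name__ == "__main__":
    # Serialize the extracted data as XML text.
    print(ET.tostring(extract(PAGE), encoding="unicode"))
```

Keeping the patterns in data tables, separate from the extraction loop, loosely mirrors the idea of storing patterns in external files: supporting a new page layout would mean changing the patterns rather than the code.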

DOI: 10.1007/978-3-540-78849-2_39
Indexed By: CPCI-S
Language: English
WOS Research Area: Computer Science
WOS Subject: Computer Science, Information Systems; Computer Science, Theory & Methods
WOS ID: WOS:000255194500039
Scopus ID: 2-s2.0-43749092353
Document Type: Conference paper
Collection: DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE
Affiliation:
1. School of Computing, Liverpool Hope University, Liverpool, L16 9JD, UK
2. Faculty of Science and Technology, University of Macau, Macao, PRC
First Author Affiliation: Faculty of Science and Technology
Recommended Citation
GB/T 7714: Man I. Lam, Zhiguo Gong, Maybin Muyeba. A method for web information extraction[C], 2008, 383-394.
APA: Man I. Lam, Zhiguo Gong, & Maybin Muyeba (2008). A method for web information extraction. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 4976 LNCS, 383-394.