A method for web information extraction

doi:10.1007/978-3-540-78849-2_39


Residential College	false
Status	Published
Title	A method for web information extraction
Author	Man I. Lam (2); Zhiguo Gong (2); Maybin Muyeba (1)
Date	2008-05-22
Conference Name	10th Asia-Pacific Web Conference and Workshops
Source Publication	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	4976 LNCS
Pages	383-394
Conference Date	April 26-28, 2008
Conference Place	Shenyang, People's Republic of China
Abstract	The World Wide Web has become one of the most important information repositories. However, information in web pages follows no presentation standard and is not organized in a consistent format, so extracting appropriate and useful information from web pages is challenging. Many web extraction systems, called web wrappers, have been developed, both semi-automatic and fully automatic. In this paper, some existing techniques are investigated, and then our current work on web information extraction is presented. In our design, we classify information patterns into static and non-static structures and use a different technique to extract the relevant information from each. In our implementation, patterns are represented as XSL files, and all extracted information is packaged into machine-readable XML.
DOI	10.1007/978-3-540-78849-2_39
Indexed By	CPCI-S
Language	English
WOS Research Area	Computer Science
WOS Subject	Computer Science, Information Systems ; Computer Science, Theory & Methods
WOS ID	WOS:000255194500039
Scopus ID	2-s2.0-43749092353
Document Type	Conference paper
Collection	DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE
Affiliation	1. School of Computing, Liverpool Hope University, Liverpool, L16 9JD, UK; 2. Faculty of Science and Technology, University of Macau, Macao, PRC
First Author Affiliation	Faculty of Science and Technology
Recommended Citation GB/T 7714	Man I. Lam, Zhiguo Gong, Maybin Muyeba. A method for web information extraction[C], 2008, 383-394.
APA	Lam, M. I., Gong, Z., & Muyeba, M. (2008). A method for web information extraction. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 4976 LNCS, 383-394.
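
The abstract describes patterns that are applied to web pages, with the extracted information packaged as XML. The sketch below illustrates only that general idea and is not the authors' implementation: the paper represents patterns as XSL files, whereas this example substitutes the limited XPath subset in Python's standard `xml.etree.ElementTree`. The page fragment, the `PATTERN` layout, the `extract` function, and the output element names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment (real pages usually need tidying first).
PAGE = """
<html><body>
  <table id="products">
    <tr><td class="name">Widget</td><td class="price">9.50</td></tr>
    <tr><td class="name">Gadget</td><td class="price">12.00</td></tr>
  </table>
</body></html>
"""

# Hypothetical pattern: one path that selects records, plus one relative
# path per field. This stands in for an XSL pattern file.
PATTERN = {
    "record": ".//table[@id='products']/tr",
    "fields": {"name": "td[@class='name']", "price": "td[@class='price']"},
}


def extract(page: str, pattern: dict) -> ET.Element:
    """Apply the pattern to the page and package the matches as XML."""
    root = ET.fromstring(page)
    result = ET.Element("records")
    for row in root.findall(pattern["record"]):
        record = ET.SubElement(result, "record")
        for field, path in pattern["fields"].items():
            cell = row.find(path)
            # Skip fields that are missing or empty in this row.
            if cell is not None and cell.text:
                ET.SubElement(record, field).text = cell.text.strip()
    return result


if __name__ == "__main__":
    print(ET.tostring(extract(PAGE, PATTERN), encoding="unicode"))
```

A fixed pattern like this one suits only a page whose layout does not change; the abstract's non-static structures would presumably need a more flexible technique, which this sketch does not attempt.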
