Residential College | false |
Status | Published |
Detecting the content related parts of web pages | |
Yong Li1; Zhiguo Gong1; Ke Qi2 | |
2005-08-29 | |
Conference Name | International Conference on Service Systems and Service Management |
Source Publication | 2005 INTERNATIONAL CONFERENCE ON SERVICES SYSTEMS AND SERVICES MANAGEMENT, VOLS 1 AND 2, PROCEEDINGS |
Pages | 1071-1074 |
Conference Date | 13-15 June 2005 |
Conference Place | Chongqing, China |
Abstract | Many web pages are semantically diverse: the content of a single page does not consistently address one topic. Current search engines, however, are page-oriented rather than topic-oriented, while most web users retrieve their target information by topic. How to partition web pages by semantics is therefore an interesting research topic. In this paper, we first build a tree (called a Semantic Tree, ST) that partitions the web page into content parts (called Semantic Parts, SPs) based on the page's tags. We then analyze the characteristics of the words (or terms) appearing on the page in order to build a term-weighting formula. Based on these term weights, we use a similarity formula to calculate the degree of semantic similarity between each pair of SPs. Finally, we take the balance point of precision and recall as the reference value for the similarity threshold. With these steps we can find the content-related parts (or segments) of a web page, and we achieved satisfactory results. |
Keyword | Web Mining ; Term Weighting ; Similarity |
DOI | 10.1109/ICSSSM.2005.1500159 |
Indexed By | SCIE ; CPCI-S |
Language | English |
WOS Research Area | Business & Economics ; Computer Science ; Operations Research & Management Science |
WOS Subject | Business ; Computer Science, Artificial Intelligence ; Computer Science, Information Systems ; Computer Science, Interdisciplinary |
WOS ID | WOS:000231534000219 |
Scopus ID | 2-s2.0-33745241081 |
Document Type | Conference paper |
Collection | Faculty of Science and Technology DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE |
Affiliation | 1. Faculty of Science and Technology, University of Macau, China 2. System Engineering Department, Beijing Jiaotong University, Beijing, 1'75#, 100044 China |
First Author Affiliation | University of Macau |
Recommended Citation GB/T 7714 | Yong Li,Zhiguo Gong,Ke Qi. Detecting the content related parts of web pages[C], 2005, 1071-1074. |
APA | Yong Li., Zhiguo Gong., & Ke Qi (2005). Detecting the content related parts of web pages. 2005 INTERNATIONAL CONFERENCE ON SERVICES SYSTEMS AND SERVICES MANAGEMENT, VOLS 1 AND 2, PROCEEDINGS, 1071-1074. |
Files in This Item: | There are no files associated with this item. |
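
The pipeline outlined in the abstract (weight the terms of each Semantic Part, compare each pair of parts, keep the pairs above a similarity threshold) might be sketched as follows. This is only an illustration under stated assumptions: the record does not give the paper's term-weighting or similarity formulas, so a TF-IDF-style weight and cosine similarity are used as stand-ins, and the names `term_weights`, `cosine`, and `related_pairs` are hypothetical. The construction of the Semantic Tree from page tags is not shown; each part is assumed to arrive already tokenized.

```python
import math
from collections import Counter
from itertools import combinations


def term_weights(parts):
    """Weight the terms of each part.

    Assumption: term frequency times a smoothed inverse part frequency,
    standing in for the paper's own (unspecified here) weighting formula.
    """
    n = len(parts)
    # Number of parts in which each term occurs at least once.
    df = Counter(term for part in parts for term in set(part))
    weighted = []
    for part in parts:
        tf = Counter(part)
        weighted.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return weighted


def cosine(u, v):
    """Cosine similarity of two sparse term-weight vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def related_pairs(parts, threshold):
    """Return index pairs (i, j), i < j, whose similarity meets the threshold."""
    vecs = term_weights(parts)
    return [
        (i, j)
        for i, j in combinations(range(len(vecs)), 2)
        if cosine(vecs[i], vecs[j]) >= threshold
    ]
```

The threshold is left as a plain parameter here. The abstract describes choosing it at the balance point of precision and recall, which would require labeled pages to measure and is outside this sketch.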