The Study And Implementation Of Web Information Extraction Mechanism Based On Classification Semantics

Posted on: 2006-10-15
Degree: Master
Type: Thesis
Country: China
Candidate: A X Ma
Full Text: PDF
GTID: 2168360155458107
Subject: Computer application technology
Abstract/Summary:
The Internet has changed the course of human society. The World Wide Web has become a vast information service, hosting all kinds of information resources on websites around the world. It is the richest source of information. Search engines were introduced, and have developed rapidly, so that Internet users can easily find what they want among all this information.

However, as the quantity and variety of information grow rapidly, quickly obtaining what users need from the WWW is becoming more and more difficult. Many keyword-based search engines return every web page that contains the given keywords. The results span a variety of fields, many of which are of no interest to the user. How to find the needed information quickly and accurately among so many information resources has become a difficult problem for Internet users. To improve search engine performance, this thesis applies a new web-page classification technology to the existing search engine.

Building on a study of the key technologies of search engines, this thesis designs a Search Engine System based on Classification Semantics. It discusses how the new automatic web-page classification is combined with information collection, and presents a classification-based method for ranking search results, so that users can more easily find what they need on the WWW.

The thesis describes the design of the SECS system in detail, reports further research on the Web information extraction mechanism based on Classification Semantics, and describes in detail the research on classification semantics extraction and the implementation of CSpider.
Keywords/Search Tags: search engine, classification semantics, spider, automatic classification, information extraction