Research On Web Page Classification Algorithms Of Professional Theme

Posted on: 2006-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: Q Wang
Full Text: PDF
GTID: 2168360155472069
Subject: Software engineering
Abstract/Summary:
The development of the Internet has made it convenient for people to obtain resources from all over the world. At the same time, the explosion of available resources has made it increasingly difficult to retrieve relevant information quickly, which places higher demands on information gathering technology. Because Web pages are complex, general-purpose search engines cannot meet users' precise needs, so specialized techniques and methods for information acquisition have become an active research direction. This thesis studies several Internet information gathering techniques, using Chinese Web pages on the Olympic theme as the subject.

Studies in this thesis include:

(1) For Olympic Web page filtering, a variety of feature selection methods, classifiers, and metrics for evaluating classification results are covered. Every combination of feature selection method and classifier was tested on the data sets. The experiments show that some of the feature selection methods are effective. When feature weighting is taken into account, different classifiers may produce different results: each classifier works on a different principle, so features with different occurrence frequencies may affect each classifier differently.

(2) Considering the dynamic, sequential, and time-sensitive nature of resources on the Internet, an Adaptive Classification method based on the Rocchio retrieval expansion model is proposed. Without human supervision, computers cannot always judge complicated Web pages correctly, and if incorrect judgments are used to adjust the class models, performance may deteriorate. To maintain good performance while still adapting the class models to changing data, positive and negative confidence ranges are set and dynamic coefficients are adopted.
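The adaptive idea in (2) can be illustrated with a minimal Python sketch. It is not the thesis's actual method: the abstract gives no formulas, so the cosine similarity measure, the threshold values, the coefficient scaling, and all names here (`cosine`, `rocchio_update`, `adapt`, `pos_threshold`, `neg_threshold`, `beta`, `gamma`) are assumptions chosen only for illustration. The sketch updates a class profile with an unlabeled page only when the classifier's own judgment falls inside a trusted positive or negative range, and skips pages in between so that uncertain judgments do not degrade the model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rocchio_update(profile, doc, coeff):
    """Rocchio-style step: move the profile toward doc (coeff > 0)
    or away from it (coeff < 0). Returns a new dict."""
    updated = dict(profile)
    for term, w in doc.items():
        updated[term] = updated.get(term, 0.0) + coeff * w
    # Drop non-positive weights, a common convention in Rocchio variants.
    return {t: w for t, w in updated.items() if w > 0.0}


def adapt(profile, doc, pos_threshold=0.8, neg_threshold=0.2,
          beta=0.5, gamma=0.25):
    """Update the class profile only when the judgment is trusted.

    The thresholds and the scaling of beta/gamma are hypothetical
    stand-ins for the "confidence ranges" and "dynamic coefficients"
    mentioned in the abstract.
    """
    sim = cosine(profile, doc)
    if sim >= pos_threshold:
        # Trusted positive: stronger pull for more similar pages.
        return rocchio_update(profile, doc, beta * sim), "positive"
    if sim <= neg_threshold:
        # Trusted negative: stronger push for less similar pages.
        return rocchio_update(profile, doc, -gamma * (1.0 - sim)), "negative"
    # Uncertain judgment: leave the model untouched.
    return profile, "skipped"


olympic_profile = {"olympic": 1.0, "games": 1.0}
page = {"olympic": 1.0, "games": 1.0, "medal": 0.2}
new_profile, decision = adapt(olympic_profile, page)
print(decision, sorted(new_profile))
```

In this sketch a trusted positive page introduces its new terms into the profile, a trusted negative page can only reduce the weight of terms it shares with the profile, and anything in the uncertain middle band is ignored.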
Keywords/Search Tags: Web Page Filtering, Feature Selection, Adaptive Classification