Naïve Bayesian-based Automatic Webpage Classification Technology Research

Posted on:2009-06-12

Degree:Master

Type:Thesis

Country:China

Candidate:J S Li

GTID:2178360245974743

Subject:Control theory and control engineering

Abstract/Summary:

Text and webpage classification is an important technology built on text mining and Web mining, and one of the focal points of data mining research. With the rapid development of data analysis tools, new database technology, and Internet technology, many complex types of data continue to emerge, such as semi-structured and structured data, hypertext, and multimedia data. Mining complex data types has therefore become a very important problem in data mining; these types include complex objects, spatial data, multimedia data, time-series data, text data, and Web data. This research seeks a way to build a text and webpage classification model on top of a chosen classification algorithm, and to combine text content, URL links, and user usage information so that together they reflect the categories of Web pages. Finally, we also attempt to build a filtering system for Web pages. This paper describes a method for Chinese webpage classification that uses user usage information and the hierarchy of the website, rather than relying on the content-based and link-based analysis approaches; the aim is to use this additional information to improve the performance and features of the classifier. The method is tested and the results are analyzed. In addition, as an extension of the research, the paper studies a filtering technique based on Web classification and explores how user information can be used to improve the accuracy of the filter.
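
The idea of combining text content, URL, and usage information in one classifier can be illustrated with a minimal sketch. This is not the thesis's implementation, which the abstract does not describe in detail; it is a generic multinomial Naïve Bayes classifier with add-one smoothing. The helper `page_features`, the feature prefixes, the example URLs, and the usage tokens such as `sports_section` are all hypothetical, and the content tokens are assumed to be already segmented into words (a required preprocessing step for Chinese text).

```python
import math
from collections import Counter, defaultdict


def page_features(content_tokens, url, usage_tokens):
    """Merge three information sources into one bag of prefixed features.

    Hypothetical helper: the prefixes keep content, URL, and usage
    evidence apart even when the same word occurs in more than one source.
    """
    url_tokens = [t for t in url.lower().replace(".", "/").split("/") if t]
    return (
        ["content:" + t for t in content_tokens]
        + ["url:" + t for t in url_tokens]
        + ["usage:" + t for t in usage_tokens]
    )


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_counts = Counter()               # pages seen per class
        self.feature_counts = defaultdict(Counter)  # feature counts per class
        self.vocabulary = set()                     # all features seen in training

    def fit(self, samples):
        """Count classes and features from (features, label) pairs."""
        for features, label in samples:
            self.class_counts[label] += 1
            self.feature_counts[label].update(features)
            self.vocabulary.update(features)

    def predict(self, features):
        """Return the class with the highest log posterior score."""
        total = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.feature_counts[label].values()) + vocab_size
            for f in features:
                # Smoothed log likelihood of each feature given the class.
                score += math.log((self.feature_counts[label][f] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (invented for illustration only).
training = [
    (page_features(["match", "goal", "team"], "example.com/sports/a", ["sports_section"]), "sports"),
    (page_features(["team", "league", "score"], "example.com/sports/b", ["sports_section"]), "sports"),
    (page_features(["stock", "market", "bank"], "example.com/finance/a", ["finance_section"]), "finance"),
    (page_features(["bank", "rate", "fund"], "example.com/finance/b", ["finance_section"]), "finance"),
]

model = NaiveBayes()
model.fit(training)

new_page = page_features(["team", "goal"], "example.com/sports/c", ["sports_section"])
print(model.predict(new_page))
```

Prefixing each feature with its source is one simple way to let the classifier weigh content, URL, and usage evidence separately; other combination schemes, such as one classifier per source with a voting step, are equally plausible readings of the abstract.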

Keywords/Search Tags:

Data Mining, Web Classification, Naïve Bayesian, Filtration

Related items

1	Research On The Approach Of Classification In Data Mining Based On Naive Bayesian
2	A Novel Multi-grouped Graph Bayesian Classification Model
3	Improvement Research And Application Of Bayesian Classification Based On Different Scenes
4	Research On Bayesian Classification Algorithm Based On Emerging Pattern For Variable Data Stream
5	Research On Multi-Relational Data Mining With Bayesian Method
6	Research Of Application Foundation On Bayesian Networks
7	Research On Data Mining Of Cotton-spinning Quality
8	Application Of Data Mining In Bank PCRM
9	Study Of Data Mining Using Bayesian Method And Its Application
10	Research On Algorithms For Relational Data Classification