Research On Web Relevant Mining Technology Based On Cluster

Posted on: 2006-05-20
Degree: Master
Type: Thesis
Country: China
Candidate: W Zhao
Full Text: PDF
GTID: 2168360152486713
Subject: Computer software and theory
Abstract/Summary:
With the spread of the Internet and advances in computer technology, obtaining satisfactory information from the Internet has become more difficult. To find, select, and retrieve information effectively, to improve the response speed of information retrieval, and to derive useful knowledge from the relationships among web information, traditional data mining technology has been applied to the web domain, giving rise to web mining technology.

One of the key problems of web mining is web relevant mining. In the web relevant mining process, the principal task is to extract important web features that represent the essential characteristics of web text. Then, according to these features, web text objects are mapped to points in a high-dimensional space. Finally, the relatedness of web texts is calculated from the distances between these points. Through web relevant mining, we can link a source web page to similar pages and ultimately find useful knowledge.

In this dissertation, we start from the basic concepts of data mining and web mining technology and clarify their main contents. Then we discuss data representation, distance measurement, and some common algorithms in cluster analysis, which is one of the important means of data mining. Afterward, we address three key problems of web text feature extraction: how to select a suitable model for web text features, how to extract suitable features from the web, and how to calculate the weight values of web features. Finally, we design and implement a prototype system for web relevant mining.
Keywords/Search Tags: Cluster, Web mining, Relevant mining, Feature extraction, Vector space model
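
The pipeline described in the abstract (extract features, map each web text to a point in a high-dimensional space, then compare points) can be illustrated with a minimal vector space model sketch. The abstract does not state which weighting scheme or distance measure the thesis uses, so the smoothed TF-IDF weights and cosine similarity below are assumptions chosen for illustration, and the function names (`tokenize`, `tfidf_vectors`, `cosine_similarity`, `rank_related`) are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphabetic tokens (a simplistic stand-in
    for real web feature extraction)."""
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_vectors(docs):
    """Map each document to a sparse vector {term: weight}.

    Assumed weighting: term frequency times a smoothed inverse
    document frequency, log((1 + N) / (1 + df)) + 1.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_related(source_index, docs):
    """Return (index, similarity) pairs for all other documents,
    most similar to the source document first."""
    vectors = tfidf_vectors(docs)
    source = vectors[source_index]
    scores = [
        (i, cosine_similarity(source, vec))
        for i, vec in enumerate(vectors)
        if i != source_index
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


docs = [
    "web mining cluster analysis",
    "cluster analysis of web pages",
    "cooking pasta recipes",
]
ranking = rank_related(0, docs)
```

Here `ranking` lists the other documents in decreasing order of similarity to the first one. A distance, as mentioned in the abstract, could be taken as one minus the cosine similarity; a Euclidean distance on the same vectors would be an equally plausible reading.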