Research And Realization Of Page Clustering System Based On Web

Posted on: 2006-04-17
Degree: Master
Type: Thesis
Country: China
Candidate: H F Wang
Full Text: PDF
GTID: 2178360182476556
Subject: Computer technology
Abstract/Summary:
With the rapid development of networks and the proliferation of information, Internet users find it difficult to acquire useful information quickly and efficiently in such a sea of information. With existing search engines, users can roughly find what they want on the Internet. However, the resources obtained this way do not fit users' needs exactly, and functions such as structured information, text classification, and filtering are not offered. Because documents are the main form of information resources, a tool is required that lets people extract knowledge quickly and efficiently from web documents.

Building on in-depth research into clustering analysis in the field of data mining, this thesis presents a web page clustering system based on agent technology, the focus of which is the clustering algorithm. The system clusters similar web pages automatically and submits the results to the user interface. The algorithm first applies the vector space model to represent web documents. A fuzzy clustering algorithm then mines documents of high similarity and divides them into rough clusters. The evaluation of these rough results is fed back to the fuzzy algorithm, which repeatedly partitions the roughly similar documents into several clusters, increasing the similarity of documents within a cluster and reducing it between different clusters. In the end, documents of the same kind are grouped together.

With hierarchical agglomerative clustering as the mining tool, search results can be clustered in an online, interactive, textual, and hierarchical manner, so that the difficulties arising from searching can be tackled.
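
As an illustration of two of the building blocks named in the abstract above (the vector space model and hierarchical agglomerative clustering), the following is a minimal Python sketch and not the thesis's implementation. The whitespace tokenizer, the smoothed TF-IDF weighting, the average-link merge rule, the similarity threshold, and all function names are assumptions chosen for the example; the fuzzy clustering stage and the agent framework are not covered.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Represent each document as a sparse TF-IDF vector (term -> weight).

    Assumption: whitespace tokenization and a smoothed IDF; the thesis
    does not state which weighting scheme it uses.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def agglomerative(vectors, threshold):
    """Average-link agglomerative clustering over document indices.

    Starts with one cluster per document and repeatedly merges the most
    similar pair until no pair reaches `threshold` (an assumed stop rule).
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > 1:
        best_sim, best_pair = -1.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sims = [cosine(vectors[i], vectors[j])
                        for i in clusters[a] for j in clusters[b]]
                avg = sum(sims) / len(sims)
                if avg > best_sim:
                    best_sim, best_pair = avg, (a, b)
        if best_sim < threshold:
            break
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical sample "pages", invented for this example.
docs = [
    "python clustering algorithm text",
    "text clustering algorithm python code",
    "football match score goal",
    "goal score football league",
]
clusters = agglomerative(tfidf_vectors(docs), threshold=0.2)
print(clusters)  # [[0, 1], [2, 3]]
```

The pairwise search makes each merge step quadratic in the number of clusters, which is acceptable for a page of search results but would need a similarity cache or a library implementation for larger collections.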
Keywords/Search Tags: information retrieval, text mining, web mining, clustering