The Study On Clustering Algorithm Of The Topic Search Engine

Posted on: 2012-07-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q Li
GTID: 1118330335466376
Subject: Forestry equipment works
Abstract/Summary:
As the Internet entered our lives and gradually changed the world, we became familiar with the search engine, the most effective tool for information retrieval. The Internet has brought a revolution in information sharing, and search engines have injected fresh blood into that revolution. Across the broad array of network resources, the search engine works like a ship's compass, guiding people as they surf the network. According to recent statistics, 82 out of 100 users use search engines, and the number of search engine users has reached 375 million. The search engine has become the largest network application service and the primary way users access information.

This paper introduces the development of domestic and international search engine technologies and the status of current research, discusses the working principles and problems of the traditional full-text search engine, introduces in detail the concept of text clustering and the principles of clustering algorithms, and sets out directions for improving those algorithms. Through experiments, it puts forward the theory of word frequency difference and applies this theory to keyword extraction. Based on a study of clustering algorithms, it puts forward the optimum density selection clustering algorithm, which is coupled with a hierarchical clustering algorithm for text clustering. This combination optimizes the text clustering and improves the performance of search engine queries. Based on these results, we propose a topic-oriented text clustering algorithm for the search engine. We applied this algorithm in a search engine, which proved to be more accurate and to offer more specialized features than other search engines of the same type.
Keywords/Search Tags: Subject-oriented, Search Engine, Text Clustering, DBSCAN, Similarity