Research On And Improvement Of The Clustering Method For Search Engine Retrieved Results

Posted on:2008-07-18

Degree:Master

Type:Thesis

Country:China

Candidate:P D Li

GTID:2178360212995256

Subject:Computer application technology

Abstract/Summary:

Although many search engine systems have tried to improve retrieval precision, the retrieved results still mix many irrelevant documents with the relevant ones, which places a heavy burden on web users. When the retrieved results of a search engine are clustered, the resulting groups should show a high degree of association among members of the same group and a low degree among members of different groups. Users can then browse only the groups that interest them, which saves considerable time.

First, this thesis presents a key-phrase-based document feature extraction method that combines dictionary identification with statistics. It finds not only ordinary key-phrases but also technical terms, abbreviations, temporary words, new words, and other expressions that are not in the dictionary. The index structure is also improved: using a sequence index together with an inverted file index of the snippets makes it better suited to clustering search engine snippets.

Second, a fast clustering algorithm, HPMC, is presented. The similarities between snippets are calculated, and the initial cluster centers are created with a hierarchical clustering method. An algorithm based on k-means and single-pass clustering then produces functional clusters. Finally, the clustering results are obtained by merging the functional clusters appropriately.

Last, the performance of HPMC is evaluated in terms of time complexity, space complexity, cluster quality, number of clusters, and sensitivity to isolated points, and it is compared with earlier algorithms.
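
The two index structures and the single-pass step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's HPMC algorithm: the abstract gives no implementation details, so whitespace tokenization stands in for key-phrase extraction, Jaccard similarity stands in for the similarity measure, and the hierarchical initialization, k-means refinement, and final merging of functional clusters are omitted. The function names (`build_indexes`, `jaccard`, `single_pass`) and the `threshold` value are hypothetical.

```python
from collections import defaultdict


def build_indexes(snippets):
    """Return (sequence_index, inverted_index) for a list of snippet strings.

    Assumption: terms are lowercase whitespace-separated tokens, not the
    dictionary-plus-statistics key-phrases described in the abstract.
    """
    sequence_index = []                # snippet id -> ordered list of terms
    inverted_index = defaultdict(set)  # term -> ids of snippets containing it
    for doc_id, text in enumerate(snippets):
        terms = text.lower().split()
        sequence_index.append(terms)
        for term in terms:
            inverted_index[term].add(doc_id)
    return sequence_index, dict(inverted_index)


def jaccard(a, b):
    """Jaccard similarity of two term sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def single_pass(sequence_index, threshold=0.3):
    """Single-pass clustering: each snippet joins the most similar existing
    cluster if the similarity reaches `threshold` (a hypothetical value),
    otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"terms": set of terms, "members": [ids]}
    for doc_id, terms in enumerate(sequence_index):
        term_set = set(terms)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = jaccard(term_set, cluster["terms"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(doc_id)
            best["terms"] |= term_set  # cluster vocabulary grows with members
        else:
            clusters.append({"terms": term_set, "members": [doc_id]})
    return [c["members"] for c in clusters]
```

Calling `build_indexes` on a list of snippet strings and passing the returned sequence index to `single_pass` yields lists of snippet ids, one list per cluster. The result depends on the order of the snippets, a known property of single-pass clustering.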

Keywords/Search Tags:

Search Engine, Retrieved Result, Clustering, Key-phrase, Cluster

Related items

1	Study On Search Results Clustering Based On Formal Concept Analysis
2	Key Phrase Extraction and Co-clustering for Web Search Result Visualization
3	Research On Search Results Clustering Technology For Cloud Search Engine
4	The Research Of A Multi-language Supporting Description-oriented Clustering Algorithm On Meta-Search Engine Result
5	Design And Implementation Of Web Search Results Clustering For Distributed Search Engine
6	Fuzzy Clustering Algorithm And Recommended Techniques Based On Search Engine Result Ranking
7	Based On The Clustering Personalized Search Engine Research And Design
8	Research And Application Of An Elimination Algorithm For Redundant Information On Search Engine's Result
9	Clustering Web documents: A phrase-based method for grouping search engine results
10	Research On The Key Technology Of Meta Search And Their Implementation