Research On Search Results Clustering Technology For Cloud Search Engine

Posted on: 2014-02-03
Degree: Master
Type: Thesis
Country: China
Candidate: F Wang
Full Text: PDF
GTID: 2248330398972102
Subject: Communication and Information System
Abstract/Summary:
With the rapid development of Internet technology, the volume of information on the network has exploded, and search engines have gradually become the main tool users rely on to find information on the web. However, well-known search engines such as Google and Baidu return results to the user as a linear list. A query often returns tens of thousands of results, so users have to spend a lot of time finding what they really need among them.

Search results clustering groups the search results by topic and then presents them to users as categories. Compared with a traditional search engine, search results clustering can help users find information more quickly.

The work in this thesis mainly covers the following aspects:
(1) How to represent search results and how to calculate term weights and the similarity between search results, with the aim of finding the methods best suited to search results clustering.
(2) The traditional vector space model and its similarity measures ignore both the relative positions of terms (which terms precede and follow one another) and the semantic relationships between terms in the search results. To address this, a new similarity measure for search results clustering is proposed.
(3) The existing fuzzy C-means clustering algorithm can only cluster web document samples when the number of clusters c is known in advance, which is not the case in practical situations. A new method based on the fuzzy C-means algorithm is therefore proposed for search results clustering. To validate the effect of the new clustering method, the Carrot2 clustering platform is used to obtain a set of search results.
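
The limitation raised in item (2) can be illustrated with a minimal sketch of the conventional baseline: TF-IDF weighting in a vector space model with cosine similarity. This is not the similarity measure proposed in the thesis, which the abstract does not specify. The function names, the whitespace tokenizer, and the smoothed IDF formula are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenizer, for illustration only.
    return text.lower().split()


def tfidf_vectors(snippets):
    """Return one sparse TF-IDF vector (dict: term -> weight) per snippet."""
    docs = [Counter(tokenize(s)) for s in snippets]
    n = len(docs)
    # Document frequency: number of snippets that contain each term.
    df = Counter()
    for doc in docs:
        df.update(doc.keys())
    vectors = []
    for doc in docs:
        total = sum(doc.values())
        if total == 0:
            vectors.append({})
            continue
        # Assumption: smoothed IDF, log((1 + n) / (1 + df)) + 1, so that
        # every weight stays positive even for terms found in all snippets.
        vectors.append({
            term: (count / total) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in doc.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical snippets for an ambiguous query.
snippets = [
    "jaguar car dealer price",
    "jaguar car review",
    "jaguar cat habitat jungle",
]
vecs = tfidf_vectors(snippets)
same_topic = cosine(vecs[0], vecs[1])   # both about the car
cross_topic = cosine(vecs[0], vecs[2])  # car vs. animal

# Word order is invisible to this model: reordered snippets get equal vectors.
reordered = tfidf_vectors(["new york pizza", "pizza york new"])
order_blind = cosine(reordered[0], reordered[1])
```

In this sketch the two car snippets score higher than the car and animal pair, while the two reordered snippets are indistinguishable, which is the kind of positional information the abstract says the traditional model discards.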
Keywords/Search Tags: search engine, search result clustering, fuzzy C-means, affinity propagation algorithm