The Study On Web Search Results' Clustering

Posted on:2009-03-19

Degree:Master

Type:Thesis

Country:China

Candidate:H B Liu

GTID:2178360278471325

Subject:Computer application technology

Abstract/Summary:

With the continued development of computer and network technology, the Internet has become the largest database in the world. Faced with this volume of information, users find it increasingly difficult to locate what they need by browsing the Web. Search engines are now the main means of obtaining information on the Web. However, search engines such as Google, Baidu, and Yahoo often return a long list of results in which relevant and irrelevant information are mixed. Users have to check the results one by one, which makes it difficult to reach the information they really need. How to let users find the necessary information through a search engine more accurately and conveniently has therefore become an important subject worthy of study.

The emergence of data mining technology provides a new way to address this problem. Data mining aims at extracting hidden, previously unknown, useful, and unusual patterns or knowledge from data. As one of the basic techniques of data mining, clustering reveals the inner characteristics and distribution of the data by comparing the similarities and dissimilarities among data items, which gives a better understanding of the data. Using clustering to process search results and display them in a more reasonable way makes it possible for users to reach the necessary information more conveniently.

In this thesis, we study Web search engine technology and data mining technology and, in response to this problem, propose a search results clustering system model that can cluster search results obtained from Web search engines in a Chinese language environment. The main idea of the model is to take the search results from Web search engines as the input data, first generate descriptive, readable cluster labels, and then assign the relevant search results to the corresponding cluster labels. The search results are thus returned to the user as clusters, allowing the user to find information more conveniently.

To design the model, we study two classic search results clustering algorithms, SHOC and LINGO. Considering the differences in characteristics between Chinese and English, we modify and adjust the original algorithms, which were designed for English, so that our model is more effective in a Chinese language environment.
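The label-first idea described above (generate cluster labels, then assign search results to them) can be illustrated with a minimal Python sketch. This is not the thesis's model, nor SHOC or LINGO: those rely on suffix arrays and latent semantic indexing, while this sketch simply counts frequent phrases. The function names, parameters, and whitespace tokenization are all assumptions made for illustration; Chinese text would need word segmentation instead of `split()`.

```python
from collections import defaultdict


def ngrams(tokens, max_words):
    """Yield every contiguous phrase of 1..max_words tokens."""
    for n in range(1, max_words + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def cluster_snippets(snippets, min_support=2, max_words=2, max_labels=5):
    """Hypothetical helper: pick frequent phrases as labels, then assign snippets."""
    # Assumption: whitespace tokenization (suits English, not Chinese).
    tokenized = [s.lower().split() for s in snippets]

    # Record which snippets contain each candidate phrase.
    support = defaultdict(set)
    for doc_id, tokens in enumerate(tokenized):
        for phrase in ngrams(tokens, max_words):
            support[phrase].add(doc_id)

    # Step 1: choose labels first. Keep phrases found in at least
    # min_support snippets; prefer wider coverage, then longer phrases.
    candidates = [(p, d) for p, d in support.items() if len(d) >= min_support]
    candidates.sort(key=lambda item: (-len(item[1]), -len(item[0]), item[0]))

    labels = []
    for phrase, docs in candidates:
        # Skip a phrase that covers exactly the same snippets as a chosen label.
        if any(docs == chosen_docs for _, chosen_docs in labels):
            continue
        labels.append((phrase, docs))
        if len(labels) == max_labels:
            break

    # Step 2: assign each snippet to every label whose phrase it contains.
    clusters = {" ".join(p): sorted(d) for p, d in labels}
    assigned = set().union(*(d for _, d in labels))
    leftover = [i for i in range(len(snippets)) if i not in assigned]
    if leftover:
        clusters["other"] = leftover
    return clusters
```

Calling `cluster_snippets` on a handful of result snippets returns a dictionary that maps each label to the indices of the snippets it covers. A snippet may appear under more than one label, and snippets that match no label are collected under "other".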

Keywords/Search Tags:

Clustering Algorithm, Web Search Engine, Search Results, Suffix Array, Latent Semantic Indexing

Related items

1	Research On Search Results Clustering And Label Extraction
2	Research On Clustering Search Engine
3	Research On Semantics-Based Search Results Clustering Methods
4	The Application Of Suffix Array In Uyghur, Kazak, Kyrgyz Search Engine
5	The Study On Web Search Results' Clustering
6	Design And Implementation Of Meta Search Engine Based On Suffix Array Clustering
7	Chinese Search Results Clustering Research Based On Improved STC
8	Research On Search Results Clustering Technology For Cloud Search Engine
9	The Study Of Latent Semantic-Based Personalized Search Key Technology
10	Research And Realization Of Web Crawler And Results Clustering In Search Engine