Research And Application Of An Elimination Algorithm For Redundant Information On Search Engine's Result

Posted on: 2012-11-14  Degree: Master  Type: Thesis
Country: China  Candidate: Q Guo  Full Text: PDF
GTID: 2178330332986026  Subject: Computer system architecture
Abstract/Summary:
With the explosive growth of online information and its increasing diversity, obtaining information quickly and effectively has become more and more difficult. General-purpose search engines no longer meet users' demands for accuracy, and eliminating redundant information from search results has become an active research topic. Clustering is the key technology for eliminating redundant information from search engine results, and it also plays an important role in the relevance of search results and the effectiveness of the retrieved information.

The main research work of this thesis is summarized as follows:

1) The thesis analyzes the development of search engines and the deficiencies of existing algorithms for eliminating redundant information, and studies a method for eliminating redundant information from search engine results by building a framework system that implements it.

2) The design process is described in three parts: word segmentation, feature extraction, and redundant-information elimination. Word segmentation uses an improved maximum matching (MM) algorithm to segment text and to correct the segmentation of ambiguous words. Feature extraction uses the vector space model to represent the feature words.

3) Starting from the center-based K-Means clustering algorithm, the thesis identifies its shortcomings, proposes an improved algorithm, and evaluates the improved algorithm against search engine evaluation criteria. The experiments indicate that the improved algorithm effectively improves clustering performance and thereby the efficiency of eliminating redundant information.

The algorithm, technical approach, and implementation process studied in this thesis provide a useful reference for improving the accuracy and query efficiency of the information returned to users.
Keywords/Search Tags: Search engine, Redundancy eliminating, Word segmentation, Feature extraction, Clustering
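
The three stages named in the abstract (word segmentation, feature extraction, clustering-based elimination) can be illustrated with a minimal sketch. The abstract does not describe the thesis's improved MM or improved K-Means algorithms, so this sketch substitutes the textbook versions: plain forward maximum matching, raw term-frequency vectors as a simple vector space model, and standard K-Means with cosine similarity. All function names, the toy dictionary `vocab`, the sample results, the first-k centroid seeding, and the 0.9 similarity threshold are assumptions made for illustration only.

```python
import math
from collections import Counter


def forward_max_match(text, vocab, max_len=4):
    """Plain forward maximum matching: at each position take the longest
    dictionary word, falling back to a single character if none matches."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0


def kmeans(vectors, k, iterations=10):
    """Standard K-Means on cosine similarity. The first k vectors seed the
    centroids (an assumption chosen to keep the sketch deterministic)."""
    centroids = [dict(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its most similar centroid.
        labels = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                total = Counter()
                for m in members:
                    total.update(m)
                centroids[c] = {t: n / len(members) for t, n in total.items()}
    return labels


def drop_redundant(vectors, labels, threshold=0.9):
    """Keep a result unless an already kept result in the same cluster
    is at least `threshold` similar to it."""
    kept = []
    for i, v in enumerate(vectors):
        if all(labels[i] != labels[j] or cosine(v, vectors[j]) < threshold
               for j in kept):
            kept.append(i)
    return kept


# Hypothetical dictionary and search-result snippets.
vocab = {"搜索", "引擎", "搜索引擎", "结果", "聚类", "算法"}
results = ["搜索引擎结果", "聚类算法", "搜索引擎结果", "聚类算法结果"]

vectors = [Counter(forward_max_match(r, vocab)) for r in results]
labels = kmeans(vectors, k=2)
kept = drop_redundant(vectors, labels)
```

On this toy input the third result duplicates the first, falls into the same cluster, and is dropped, so `kept` holds the indices 0, 1 and 3. Comparing results only within a cluster is one plausible reason to cluster before eliminating duplicates, since it avoids comparing every pair of results; whether the thesis uses clustering in this way is not stated in the abstract.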