
Study On Search Results Clustering Algorithm Based On Multi-Core Technology

Posted on: 2013-02-25; Degree: Master; Type: Thesis
Country: China; Candidate: S Lin; Full Text: PDF
GTID: 2248330374498114; Subject: Computer application technology
Abstract/Summary:
Web clustering engines integrate clustering technology with search engines: they cluster search results and present thematic clusters to users. Users pick the topics that interest them and then browse only the related documents to judge whether they are valuable, which reduces the search burden. This has made web clustering engines a hotspot in search engine research. Two factors contribute to the user experience of a web clustering engine: one is the presentation of clusters; the other is the efficiency of responding to user requests. These are the two topics discussed in this thesis.

First, web clustering engines display groups as a folder tree or in other visual views. The arrangement of the clusters meets users' expectations only if it rests on an objective evaluation of each cluster's importance. Building on the Lingo algorithm, we propose an improved method of calculating the cluster score. It considers not only the score of the cluster label and the number of documents within each cluster, but also each document's ranking before clustering and its document score after it has been assigned to a cluster. The experimental results show that the improved algorithm objectively reflects the relevance and authority of clusters.

Second, because clustering algorithms are time-consuming, the algorithm's efficiency must be improved to meet the time tolerance of online clustering. Given the rapid development and wide use of multi-core processors, we optimized the improved Lingo algorithm through program parallelization with multi-threading, a practical way to take advantage of multi-core resources. The experiments show that the parallel program achieves satisfactory performance.
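The abstract names the inputs to the improved cluster score (label score, cluster size, each document's rank before clustering, and its score within the cluster) but does not give the formula. The following is only an illustrative sketch under assumed definitions: the names `Doc`, `Cluster`, `cluster_score`, and `rank_clusters`, and the particular way the four inputs are combined, are hypothetical and are not taken from the thesis or from Lingo. It also shows one possible multi-threaded decomposition, scoring clusters independently in a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Doc:
    rank: int          # 1-based position in the original search results
    membership: float  # assumed score of the document within its cluster


@dataclass
class Cluster:
    label: str
    label_score: float  # assumed score of the cluster label
    docs: list


def cluster_score(cluster):
    # Hypothetical combination, not the thesis formula: each document
    # contributes its in-cluster score discounted by its original rank.
    # Cluster size enters through the number of terms in the sum, and
    # the total is scaled by the label score.
    doc_part = sum(d.membership / d.rank for d in cluster.docs)
    return cluster.label_score * doc_part


def rank_clusters(clusters, workers=4):
    # Clusters are scored independently, so the work can be split
    # across a pool of threads; pool.map preserves input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(cluster_score, clusters))
    ordered = sorted(zip(clusters, scores), key=lambda p: p[1], reverse=True)
    return [(c.label, s) for c, s in ordered]
```

In CPython the global interpreter lock limits the speedup threads can give for CPU-bound work, so this sketch illustrates only the structure of the decomposition, not the performance gains reported in the abstract.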
Keywords/Search Tags: search results clustering, web clustering engine, Lingo algorithm, cluster score, multi-core processor