Improvement Of Kmeans Clustering Algorithm And Its Application In Information Retrieval System

Posted on: 2017-05-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y W Han
Full Text: PDF
GTID: 2348330488964417
Subject: Computer technology
Abstract/Summary:
Driven by the continuous development and widespread availability of the Internet, the volume of information has grown remarkably. How to retrieve the necessary information from massive data accurately and in a timely manner has emerged as an urgent and challenging problem. Information retrieval technology offers a way to obtain information efficiently and effectively, and search engines are the best-known example. Sorting and managing search results is the most important part of the information retrieval process, because it directly influences how useful the results are. Clustering makes effective management of massive information possible. Besides its extensive application in information retrieval, clustering is also widely used in many text processing tasks.

The original Kmeans clustering algorithm is widely applied. However, the number of clusters must be set manually, and the initial cluster centers are selected at random. This thesis adopts a new algorithm based on Kmeans to cluster vocabulary during information retrieval. To address these two problems, the thesis combines a binary tree with the original Kmeans clustering algorithm. The algorithm has two main parts: one builds a binary tree over the objects to be clustered, and the other prunes the resulting tree. The two parts are both indispensable and closely related to each other. The modified algorithm is applied to an information retrieval system according to its characteristics.

The modified algorithm runs on the Eclipse platform and achieved good clustering results in the information retrieval system.
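
The two-part idea described in the abstract (build a binary tree over the objects, then prune it) can be illustrated with a minimal Python sketch. This is not the thesis's actual algorithm, whose details the abstract does not give. It assumes that each tree node is split by a 2-means step seeded deterministically from two far-apart points, and that a split is pruned when the centroids of its two children are closer than a threshold. The names `two_means`, `build_tree`, `prune`, `cluster`, and the parameters `min_separation` and `min_size` are all hypothetical. In a retrieval setting the points would be term vectors; plain numeric tuples are used here.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


def centroid(points: Sequence[Point]) -> Point:
    n = len(points)
    return tuple(sum(axis) / n for axis in zip(*points))


def sq_dist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(points: Sequence[Point]) -> float:
    """Sum of squared distances from each point to the centroid."""
    c = centroid(points)
    return sum(sq_dist(p, c) for p in points)


def two_means(points: Sequence[Point], iters: int = 20):
    """Split points in two with Lloyd iterations (k = 2).

    Assumption: seeds are chosen deterministically (the point farthest
    from the centroid, then the point farthest from that one) instead
    of at random.
    """
    c = centroid(points)
    a = max(points, key=lambda p: sq_dist(p, c))
    b = max(points, key=lambda p: sq_dist(p, a))
    left: List[Point] = []
    right: List[Point] = []
    for _ in range(iters):
        left = [p for p in points if sq_dist(p, a) <= sq_dist(p, b)]
        right = [p for p in points if sq_dist(p, a) > sq_dist(p, b)]
        if not left or not right:
            break
        new_a, new_b = centroid(left), centroid(right)
        if (new_a, new_b) == (a, b):
            break  # assignments are stable
        a, b = new_a, new_b
    return left, right


@dataclass
class Node:
    points: List[Point]
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(points: Sequence[Point], min_size: int = 2) -> Node:
    """Part 1: recursively bisect the data into a binary tree."""
    node = Node(list(points))
    if len(points) >= min_size and sse(points) > 0:
        left, right = two_means(points)
        if left and right:
            node.left = build_tree(left, min_size)
            node.right = build_tree(right, min_size)
    return node


def prune(node: Node, min_separation: float) -> None:
    """Part 2: collapse splits whose child centroids are too close.

    Assumption: centroid distance is the pruning criterion; the
    thesis may use a different one.
    """
    if node.left is None or node.right is None:
        return
    gap = math.dist(centroid(node.left.points), centroid(node.right.points))
    if gap < min_separation:
        node.left = node.right = None
    else:
        prune(node.left, min_separation)
        prune(node.right, min_separation)


def leaves(node: Node) -> List[List[Point]]:
    if node.left is None or node.right is None:
        return [node.points]
    return leaves(node.left) + leaves(node.right)


def cluster(points: Sequence[Point], min_separation: float,
            min_size: int = 2) -> List[List[Point]]:
    """Build the tree, prune it, and return the leaves as clusters."""
    root = build_tree(points, min_size)
    prune(root, min_separation)
    return leaves(root)
```

In this sketch the number of clusters is not passed in. It falls out of the pruned tree as the number of remaining leaves, although the separation threshold still has to be chosen.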
Keywords/Search Tags: cluster, Kmeans, information retrieval