Study On Clustering And Classification With Applications

Posted on: 2011-02-16
Degree: Master
Type: Thesis
Country: China
Candidate: W H Wang
GTID: 2178360302474621
Subject: Computer application technology
Abstract/Summary:
Clustering and classification are two important problems in the field of artificial intelligence. People want to find the information they need on the Internet as quickly as possible, but the large volume of differently structured information on the web has become a barrier to accessing online resources. Clustering and classification have therefore become active research topics as a way to address this challenge.

In this thesis, we first summarize existing clustering and classification algorithms and describe several representative methods in detail. We then propose two approaches that improve on traditional clustering and classification. To improve clustering performance on large-scale data, we present a parallel clustering method that brings affinity propagation clustering into the MapReduce framework. Compared with the traditional affinity propagation method, the proposed parallel approach substantially improves clustering performance and can be applied to larger data sets. To improve classification accuracy, we propose a new semi-supervised classification approach based on a graph model. This approach uses the discriminative information in the labeled instances and also exploits the geometry of the distribution of the unlabeled data. Extensive experiments show that the proposed classification method outperforms classical classification algorithms, including SVM and LDA.
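To make the graph-based semi-supervised idea concrete, the following is a minimal sketch of generic label propagation, a standard technique of this family, and not the method proposed in the thesis. The function name `label_propagation`, the Gaussian edge weights, and the `sigma` and `iterations` parameters are all assumptions chosen for illustration. Labeled points keep their labels, while each unlabeled point repeatedly takes the weighted average of its neighbours' class scores, so labels spread along the shape of the data.

```python
import math


def label_propagation(points, labels, sigma=1.0, iterations=200):
    """Illustrative label propagation (hypothetical helper, not the thesis's algorithm).

    points: list of equal-length tuples of floats.
    labels: list with a class value for labeled points and None for unlabeled ones.
    Returns one predicted class per point.
    """
    n = len(points)
    classes = sorted({y for y in labels if y is not None})
    k = len(classes)

    def weight(a, b):
        # Gaussian similarity: nearby points get edge weights close to 1.
        d2 = sum((x - y) ** 2 for x, y in zip(a, b))
        return math.exp(-d2 / (2.0 * sigma ** 2))

    # Fully connected similarity graph without self-loops.
    W = [[weight(points[i], points[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]

    # Class scores: one-hot for labeled points, uniform for unlabeled points.
    F = [[1.0 if labels[i] == c else 0.0 for c in classes]
         if labels[i] is not None else [1.0 / k] * k
         for i in range(n)]

    for _ in range(iterations):
        new_F = []
        for i in range(n):
            total = sum(W[i])
            if labels[i] is not None or total == 0.0:
                # Clamp labeled points; leave isolated points unchanged.
                new_F.append(F[i])
                continue
            # Weighted average of the neighbours' current class scores.
            new_F.append([sum(W[i][j] * F[j][c] for j in range(n)) / total
                          for c in range(k)])
        F = new_F

    # Pick the highest-scoring class for every point.
    return [classes[max(range(k), key=lambda c: F[i][c])] for i in range(n)]


if __name__ == "__main__":
    # Two well-separated groups with a single labeled seed in each.
    pts = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.9, 5.1)]
    lbl = [0, None, None, 1, None, None]
    print(label_propagation(pts, lbl))
```

In this toy example only two of the six points carry labels, and the unlabeled points near each seed are expected to inherit that seed's class because the edges within a group are far stronger than the edges between groups. A purely supervised classifier would see only the two labeled points and none of this structure.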
Keywords/Search Tags: Clustering, Classification, MapReduce, Graph Model, Semi-supervised, Support Vector Machine, Linear Discriminant Analysis