A Study on Supervised (Transfer Learning) Clustering for Large-Scale Data

Posted on: 2018-03-17
Degree: Doctor
Type: Dissertation
Country: China
Candidate: A G Chen
Full Text: PDF
GTID: 1318330518475309
Subject: Light Industry Information Technology
Abstract/Summary:
Artificial intelligence has made great progress over more than 60 years of development. As one of its most active fields, machine learning has developed rapidly. Clustering is an effective method and tool for data analysis and is widely used in both academia and industry. However, with the wide application of computer technology and the continuous development of science and technology, new problems and challenges continue to emerge, including clustering in transfer learning scenarios and in large-scale data scenarios. This dissertation focuses on the clustering problem in these two scenarios.

In studying traditional clustering algorithms, we found that in transfer and large-scale data scenarios their clustering performance is often unsatisfactory, and sometimes the algorithms cannot be run at all. The common challenges are: 1) Because no data has accumulated when an industry is first established, because too few samples have been collected, or because the data collected by unstable equipment is contaminated, clustering performance is often unstable, and clustering may even fail, if a traditional clustering algorithm is applied directly. 2) In large-scale data scenarios, the processing machine has limited memory and cannot load all the data at once, so traditional clustering algorithms cannot be applied directly to analyse the data. To solve the problems that arise when traditional clustering algorithms are applied to these two emerging application scenarios, we reconstruct classical fuzzy clustering algorithms to adapt them to the new scenarios. The main contents are as follows:

(1) Chapters two to four focus on the reconstruction and application of fuzzy clustering algorithms in transfer scenarios. Chapters two and three discuss the reconstruction of classical fuzzy clustering algorithms, and Chapter four discusses the application of knowledge transfer to image segmentation. Specifically, Chapter two proposes a new clustering algorithm, PPKTFCM, which is built on the fuzzy C-means (FCM) clustering algorithm by modifying the objective function of classical FCM. PPKTFCM not only considers two rules simultaneously, namely minimizing the sum of distances between the samples and the historical cluster centers and minimizing the change in memberships, but also gains the ability to transfer knowledge. As a result, the clustering performance of PPKTFCM is improved. Chapter three presents a novel fuzzy clustering algorithm with knowledge transfer, MEKTFCA. MEKTFCA extends the maximum entropy clustering algorithm (MECA) with two new rules: a constraint rule on membership degrees and a minimum-change rule on cluster centers. Because it applies knowledge transfer, MEKTFCA improves clustering performance when samples are insufficient or noisy. Chapter four proposes a new fuzzy clustering algorithm for image segmentation. The algorithm derives a new objective function by modifying that of the classical FCM algorithm: an added regularization term lets the objective absorb spatial neighborhood knowledge, which improves the robustness of the algorithm for image segmentation.

(2) Chapters five and six focus on the reconstruction of fuzzy clustering algorithms for large-scale data application scenarios. Chapter five puts forward a novel incremental fuzzy clustering algorithm, MMFCA, with reference to the history-based online fuzzy C-medoids algorithm (HOFCMD) and the online fuzzy C-medoids algorithm (OFCMD). Unlike those algorithms, MMFCA uses a multiple-medoids mechanism to overcome the shortcoming of representing a cluster with only a single medoid. The multiple-medoids mechanism and the constraint relationships between the cluster medoids in MMFCA ultimately contribute to better clustering performance. Inspired by the ideas of the OFCMD and FC-QR algorithms, Chapter six presents LS-FMMdC, a novel fuzzy clustering algorithm based on multiple medoids for large-scale data, which contains three optimization mechanisms: weighted representation, two regularizations, and pairwise constraints. The multiple optimization mechanisms and multiple medoids contribute to the improved clustering performance of LS-FMMdC. It should be noted that although Chapters five and six focus on the clustering problem in large-scale data scenarios, knowledge transfer is used when processing the data chunks of the large-scale data. The two chapters are therefore a combined study of large-scale data scenarios and transfer scenarios.
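
For orientation, the classical fuzzy C-means (FCM) baseline that the abstract says PPKTFCM and the image segmentation algorithm modify can be sketched as follows. This is a minimal illustration of standard FCM only, not the dissertation's algorithms: it contains no knowledge-transfer or regularization terms, and the function name `fuzzy_c_means`, its parameters, and the toy data are assumptions made for this example.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Classical FCM on a list of equal-length numeric tuples (hypothetical helper).

    Returns (centers, u), where u[i][k] is the membership of point i in cluster k.
    """
    rng = random.Random(seed)
    dim = len(points[0])

    # Start from random memberships, normalised so that each row sums to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by u[i][k] ** m.
        centers = []
        for k in range(n_clusters):
            weights = [row[k] ** m for row in u]
            wsum = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / wsum
                for d in range(dim)
            ])

        # Membership update from squared distances to the new centers.
        max_change = 0.0
        for i, p in enumerate(points):
            d2 = [sum((p[d] - c[d]) ** 2 for d in range(dim)) for c in centers]
            zeros = [k for k, v in enumerate(d2) if v == 0.0]
            if zeros:
                # The point coincides with a center: give it full membership there.
                new_row = [1.0 / len(zeros) if k in zeros else 0.0
                           for k in range(n_clusters)]
            else:
                inv = [v ** (-1.0 / (m - 1.0)) for v in d2]
                total = sum(inv)
                new_row = [v / total for v in inv]
            max_change = max(max_change,
                             max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once the memberships have stabilised.
        if max_change < tol:
            break
    return centers, u


# Toy data (made up for illustration): two well-separated groups.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centers, u = fuzzy_c_means(points, n_clusters=2)
labels = [max(range(2), key=lambda k: row[k]) for row in u]
```

The fuzzifier `m` controls how soft the memberships are. The transfer variants described above would, according to the abstract, add further terms to this objective, which are not reproduced here.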
Keywords/Search Tags:Clustering algorithm, Fuzzy C-means, Maximum entropy, Knowledge transfer, Large-scale data, Incremental clustering, Multiple medoids