Unsupervised/Supervised Modeling Research for Different Data Environments

Posted on: 2019-06-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y P Zhang
GTID: 1368330572959817
Subject: Light Industry Information Technology
Abstract/Summary:
Machine learning has grown rapidly alongside artificial intelligence and is now widely used in medicine, transportation, finance, and other fields. However, as its application scenarios continue to expand, the size and form of data become increasingly complex. Complex data environments, including large-scale and multi-view ones, pose serious challenges to traditional machine learning methods. Among unsupervised and supervised learning methods, the challenges that traditional clustering analysis and TSK fuzzy systems face in such environments can be stated as follows: 1) for large-scale data, exemplar-based clustering and TSK fuzzy systems may fail to extract pattern information within a tolerable time; 2) for high-dimensional data, the number of fuzzy rules increases exponentially with the number of features, so the interpretability of TSK fuzzy systems degrades severely as features are added; 3) for multi-view data, directly integrating the partition results that traditional single-view clustering methods obtain on each view does not improve, and may even degrade, clustering performance, because it ignores the relationships across views. To overcome these problems, we study modeling techniques for unsupervised and supervised learning in different data environments. The main results are as follows:

(1) A fast self-adaptive exemplar-based clustering approach, ESFSAC, is proposed. First, all samples are ranked in descending order of their exemplar scores and stored in an exemplar candidate set. Second, exemplars are selected from the candidate set one by one, and their labels are propagated to their neighbors in the reduced set. Third, with the same strategy, the unlabeled samples obtain their labels from the samples in the reduced set. A sampling approach is introduced to speed up this process.

(2) An exemplar-based clustering approach, ETLMC, governed by the gravity enrichment effect is proposed, in which “exemplar masses” are used to evaluate how likely an object is to be an exemplar. To realize the transmission of exemplar masses between objects, transmission learning, and hence the transmission learning machine, are proposed within a Bayesian framework. The transmission of exemplar masses can be viewed as an enrichment between objects that closely resembles the gravity enrichment in Newton's law of gravity. Therefore, the gravity enrichment effect is used to govern the transfer of exemplar masses between objects during clustering. ETLMC is distinctive in that it finds exemplars quickly for large-scale data with arbitrary shapes: its global analytical solution can be computed quickly without performing a matrix inversion.

(3) To achieve good clustering performance on multi-view data, a collaborative learning mechanism and, based on it, a multi-view and multi-exemplar fuzzy clustering approach, MFCMddI, are proposed. For the within-view partition, a multi-exemplar representation strategy is adopted to capture more detailed information about cluster structures; for cross-view collaborative learning, we assume that an exemplar of a cluster in one view is also an exemplar of that cluster in another view. This invariance of an arbitrary exemplar across each pair of views is guaranteed by maximizing the product of the corresponding prototype weights in the two views.

(4) A common and special knowledge-driven TSK fuzzy system, CSK-TSK-FS, is proposed, in which the parameters corresponding to each feature in the then-parts are kept invariant; these invariant parameters are referred to as common knowledge. As for modeling, besides gradient descent and other existing training algorithms, a trained CSK-TSK-FS can be obtained from a trained GMM or a trained FLNN, because CSK-TSK-FS is mathematically equivalent to a special GMM and to an FLNN. CSK-TSK-FS has three characteristics: 1) with the classical centroid defuzzification method, the common knowledge can be separated from the fuzzy rules, which enhances the interpretability of CSK-TSK-FS; 2) it can be trained quickly by the proposed LLM-based training algorithm; 3) the equivalence among CSK-TSK-FS, GMM, and FLNN lets them share commonality in training, so the proposed LLM-based training algorithm also provides a novel fast tool for training GMMs and FLNNs.

(5) A highly interpretable deep TSK fuzzy classifier, HID-TSK-FC, based on the generalized stacked principle is proposed. HID-TSK-FC has two characteristics: 1) a stacked hierarchical structure of component TSK fuzzy classifiers for high accuracy, and 2) interpretable linguistic rules that use the same set of linguistic labels for all inputs. High interpretability is achieved at each layer by using the same set of linguistic values for all inputs, including the outputs of the previous layers in the stacked hierarchical structure. We show that a linguistic rule whose inputs include the outputs of the previous layers is equivalent to a fuzzy rule with a nonlinear consequent, or with a linear consequent and a certainty factor. We also show that HID-TSK-FC is mathematically equivalent to a novel TSK fuzzy classifier with shared interpretable linguistic fuzzy rules.
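The rule-explosion problem named in challenge 2) and the role of the then-parts in a TSK fuzzy system can be illustrated with a generic first-order TSK model. The following is a minimal sketch, not the dissertation's implementation: it assumes a grid partition of the input space, Gaussian membership functions, product firing strengths, and a normalized weighted average of the rule outputs. The function names `grid_rule_count` and `tsk_predict` are hypothetical.

```python
import math


def grid_rule_count(n_features, n_labels):
    """Rules needed if every combination of linguistic labels forms one rule
    (assumed grid partition; other partitioning schemes give other counts)."""
    return n_labels ** n_features


def tsk_predict(x, centers, widths, consequents):
    """Output of a generic first-order TSK fuzzy system for one sample.

    x:           list of d feature values
    centers:     K lists of d Gaussian membership centers (one list per rule)
    widths:      K lists of d Gaussian membership widths (one list per rule)
    consequents: K lists of d + 1 then-part parameters [bias, w_1, ..., w_d]
    """
    firing = []
    rule_outputs = []
    for c, s, p in zip(centers, widths, consequents):
        # If-part: firing strength is the product of per-feature memberships.
        strength = math.prod(
            math.exp(-0.5 * ((xi - ci) / si) ** 2)
            for xi, ci, si in zip(x, c, s)
        )
        firing.append(strength)
        # Then-part: a linear function of the input.
        rule_outputs.append(p[0] + sum(w * xi for w, xi in zip(p[1:], x)))
    # Defuzzification: weighted average of rule outputs by firing strength.
    total = sum(firing)
    return sum(f * y for f, y in zip(firing, rule_outputs)) / total
```

Under the grid-partition assumption, three linguistic labels per feature already give `grid_rule_count(10, 3) == 59049` rules for ten features, which is why interpretability suffers as the number of features grows.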
Keywords/Search Tags: exemplar-based clustering, multi-view clustering, TSK fuzzy systems, common knowledge, shared linguistic fuzzy rules