Fuzzy Technology And Its Application For Clustering And Regression

Posted on: 2018-01-17
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J F Liu
Full Text: PDF
GTID: 1318330542481841
Subject: Light Industry Information Technology
Abstract/Summary:
Machine learning is one of the important research fields in artificial intelligence, and clustering and regression are two of its important research topics. Both have been widely applied in natural language processing, biometric recognition, computer vision, speech recognition, image recognition, and other fields. However, as living standards and the level of science and technology rise, the demands on the accuracy of machine learning, such as classification and prediction accuracy, keep growing. At the same time, more and more new application scenarios are emerging, such as learning from historical biased data and from large-scale data, and traditional machine learning methods cannot keep up with these demands. This dissertation proposes several techniques for fuzzy clustering and regression, together with their applications, for large-scale and biased data sets in different learning scenarios. The techniques build on fuzzy sets, Bayesian inference, Markov Chain Monte Carlo (MCMC), the single pass clustering framework, and other related theories. The main research results are as follows:

1) For clustering on large-scale data sets, a fast single pass Bayesian fuzzy clustering algorithm is proposed by introducing Bayesian inference, MCMC random sampling, and the single pass clustering framework into fuzzy clustering. First, a weighted Bayesian fuzzy clustering (WBFC) model and its Bayesian inference algorithm are proposed by introducing prior distributions over the data objects and parameters together with a weighting mechanism; the model mainly performs weighted clustering on the representative points of the cluster centers. Second, a fast single pass Bayesian fuzzy clustering (SPBFC) algorithm is proposed for large-scale data sets by introducing the single pass clustering framework, which divides the entire data set into manageable chunks and carries out weighted Bayesian fuzzy clustering on them. The blocking and initialization mechanisms accelerate the convergence of SPBFC. Its convergence and time complexity are also analyzed theoretically. Extensive experimental results show the effectiveness of the proposed algorithm.

2) To address the problem that clustering is not accurate enough and easily falls into local optima, a new clustering algorithm called black hole entropic fuzzy clustering (BHEFC) is presented. First, by linking clustering to the black hole phenomenon in astrophysics, the minimum black hole entropy (BHE) criterion for fuzzy clustering is proposed. Then, it is shown that BHE-based fuzzy clustering can be realized within a maximum a posteriori (MAP) framework, which indicates that fuzziness and probability can work together in a collaborative rather than repulsive way. On this basis, an incremental version, IBHEFC, is proposed for large-scale data sets by introducing an incremental clustering framework. Extensive experimental results show the effectiveness of the proposed algorithm.

3) To address the sensitivity of clustering to noise and outliers, a novel Bayesian possibilistic clustering (BPC) method with optimality guarantees is proposed based on probabilistic inference and possibility theory. First, the unknown membership degrees and cluster centers are represented as random variables. Given the specific constraints and uncertainty associated with each random variable, an appropriate probability distribution is selected for each one, which yields the BPC model. Then the unknown parameters of the BPC model are estimated by Bayesian inference and MCMC within a MAP framework. Experimental results on synthetic and real data sets show that the proposed method extends traditional possibilistic clustering and improves the clustering results.

4) To address the lack of tunability of the antecedent parameters and the lack of interpretability of the consequent parameters of fuzzy rules, a novel zero-order TSK fuzzy modeling method called the Bayesian zero-order TSK fuzzy system (B-ZTSK-FS) is proposed from the perspective of Bayesian inference. The proposed method constructs a zero-order TSK fuzzy system by using a MAP framework to maximize the corresponding posterior probability. First, the unknown antecedent and consequent parameters are represented as random variables. Given the specific constraints and uncertainty associated with each random variable, an appropriate probability distribution is selected for each one, so as to construct a joint likelihood model of the zero-order TSK fuzzy system. Then the unknown parameters are estimated by Bayesian inference and MCMC within the MAP framework. Finally, experimental results on 28 synthetic and real-world data sets demonstrate the effectiveness of B-ZTSK-FS in terms of approximation accuracy, interpretability, and scalability.

5) Biased data caused by sensitive attributes aggravate the error of regression models. The recently proposed equal mean-least squares (EM-LS) method performs well on such data; it aims to control the attribute effect in linear regression based on the error minimization criterion. However, for nonlinear regression modeling, the traditional empirical risk minimization principle used in EM-LS limits its utility. To overcome this limitation, an equal mean-support vector regression (EM-SVR) method based on the margin maximization and structural risk minimization criteria is proposed by using the equal mean constraint. EM-SVR has good generalization ability, handles nonlinear regression, and inherits the good performance of EM-LS. Finally, experiments demonstrate the effectiveness of the proposed method.

6) Because EM-SVR cannot control the attribute effect on large-scale data, a fast nonlinear regression method is proposed based on minimum enclosing ball (MEB) theory, by introducing the MEB into EM-SVR. Starting from EM-SVR, it is derived that the optimization problem of EM-SVR is equivalent to a center-constrained minimum enclosing ball (CC-MEB) problem. A novel fast CC-MEB-based nonlinear regression learning algorithm for attribute effect control on large-scale biased data sets, referred to as FEM-CVR, is then proposed by integrating approximate MEB theory and reducing the original input data set to a core set. In addition, some fundamental theoretical properties are discussed in depth. Finally, extensive experiments on synthetic and real data sets show that FEM-CVR can effectively control the attribute effect in nonlinear regression models on large-scale biased data sets with good generalization ability.
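The single pass framework described in item 1) can be illustrated with a minimal sketch. This is not the dissertation's SPBFC: it swaps in plain weighted fuzzy c-means for the weighted Bayesian model, uses no priors and no MCMC, and assumes a naive initialization from the first points of the data. The names `memberships`, `weighted_fcm`, `single_pass_fcm`, and `chunk_size` are hypothetical. It only shows the idea of clustering chunk by chunk while carrying the previous cluster centers forward as weighted representative points.

```python
def memberships(points, centers, m=2.0, eps=1e-9):
    """Standard fuzzy c-means membership update; each row sums to 1."""
    rows = []
    for x in points:
        # Squared Euclidean distance to each center (eps avoids division by zero).
        d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) + eps for c in centers]
        inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
        total = sum(inv)
        rows.append([v / total for v in inv])
    return rows


def weighted_fcm(points, weights, centers, m=2.0, iters=30):
    """Fuzzy c-means in which point i counts with weight weights[i]."""
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for k in range(len(centers)):
            # Center k is the mean of the points weighted by w_i * u_ik^m.
            coef = [w * row[k] ** m for w, row in zip(weights, u)]
            total = sum(coef)
            new_centers.append(tuple(
                sum(cf * x[j] for cf, x in zip(coef, points)) / total
                for j in range(dim)))
        centers = new_centers
    return centers, memberships(points, centers, m)


def single_pass_fcm(data, n_clusters, chunk_size, m=2.0, iters=30):
    """Cluster the data one chunk at a time (assumed, simplified scheme)."""
    centers = [tuple(x) for x in data[:n_clusters]]  # naive initialization (assumption)
    carried_pts, carried_w = [], []
    for start in range(0, len(data), chunk_size):
        chunk = [tuple(x) for x in data[start:start + chunk_size]]
        # Previous centers re-enter as weighted points; new points have weight 1.
        pts = carried_pts + chunk
        wts = carried_w + [1.0] * len(chunk)
        centers, u = weighted_fcm(pts, wts, centers, m, iters)
        carried_pts = list(centers)
        # Each center carries the total membership mass it has absorbed so far.
        carried_w = [sum(w * row[k] for w, row in zip(wts, u))
                     for k in range(n_clusters)]
    return centers


if __name__ == "__main__":
    demo = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2), (4.9, 5.0)]
    print(single_pass_fcm(demo, n_clusters=2, chunk_size=3))
```

Only one chunk plus a handful of weighted centers is held in memory at any time, which is why a scheme of this kind scales to data sets that do not fit a single clustering run. The quality of the result depends on the initialization and on the chunk order, which this sketch does not address.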
Keywords/Search Tags: Fuzzy clustering, Bayesian inference, Maximum a posteriori, Markov Chain Monte Carlo, Black hole entropy, Fuzzy system, Support vector regression, Minimum enclosing ball