Research On Ensemble And Imbalanced Based Supervised/Unsupervised Learning Methods And Application

Posted on: 2021-01-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Zhou
Full Text: PDF
GTID: 1368330647961793
Subject: Light Industry Information Technology
Abstract/Summary:
Artificial intelligence (AI) is now pervasive, and machine learning, an important component of AI, has developed tremendously, especially in supervised and unsupervised learning. However, with the rapid development of computer technology and the continuous change in social needs, more and more complex application scenarios have emerged: a single classifier cannot be applied to all scenarios, the amount of data in different classes is seriously imbalanced, and cluster shapes are complex. These scenarios place higher demands on supervised and unsupervised algorithms.

Many traditional algorithms face unprecedented challenges on such data. 1) In supervised learning, the application scenarios of a single classifier are limited, while existing ensemble learning methods do not fully consider diversity and cannot obtain satisfactory classification results; in addition, the classes are often seriously imbalanced in practical applications, and traditional classifiers cannot adapt well enough to give ideal results. 2) In unsupervised learning, the cluster center often has no practical meaning, the number of clusters cannot be set in advance, and the cluster shape is irregular.

To address these challenges, this dissertation focuses on ensemble learning and imbalanced learning in supervised learning, and on clustering algorithms in unsupervised learning, and discusses the improvement and application of the related algorithms. The main research results are as follows.

1) An ensemble support vector machine algorithm based on negative agreement learning (ESVM-NAL) is proposed. The algorithm uses negative agreement learning as an explicit diversity measure. It trains the whole ensemble model and its sub-classifiers jointly rather than independently, which ensures both accuracy and diversity. Theoretical derivation shows that the ensemble of SVMs can be formulated as one single SVM; thus, the abundant advances in SVM training can be applied directly to the proposed ensemble, and no special optimization technique needs to be designed for it.

2) A Laplacian least learning machine with dynamic updating (L²MM-DU) is proposed. First, the relationship between samples is added to a traditional cost-sensitive algorithm by means of the Laplacian matrix, giving the Laplacian least learning machine. It can be applied to imbalanced classification while inheriting the fast learning and good generalization capability of the least learning machine. Then, an incremental learning method is used to improve the Laplacian least learning machine. The model is updated dynamically to find the optimal number of hidden nodes without recalculating the inverse matrix, which maintains performance and shortens the training time.

3) A density-based fuzzy exemplar clustering algorithm (DFEC) is proposed, which combines the advantages of exemplar clustering, density clustering and fuzzy clustering. The algorithm is self-adaptive and interpretable. It does not need the number of clusters to be specified in advance, and it automatically selects real samples as cluster centers. In the clustering process, DFEC first estimates, from the sample density, the probability of each sample becoming a candidate cluster center; it then uses a fuzzy method to determine the cluster centers and obtain a soft partition of the samples. Experiments on synthetic and UCI datasets show that the proposed algorithm achieves better clustering performance and wider applicability than the compared clustering algorithms.

4) A method for marking regions of interest in images using low-level and middle-level information is proposed, in order to study the application of clustering algorithms to images. The method combines low-level and middle-level information so that the two complement each other and yield reliable results. First, a coarse middle-level saliency map is obtained by using boosting Harris corner points to form a convex hull and by clustering superpixels with Graph-based Relaxed Clustering. Then, a low-level saliency map is obtained by weighting different Gaussian filters. The final saliency map is a combination of the middle-level and low-level saliency maps. Experiments on public datasets from Microsoft Research Asia show that the proposed method effectively suppresses background noise and accurately highlights the salient regions.
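The density-then-fuzzy idea described for DFEC in item 3) can be illustrated with a small sketch. This is not the dissertation's algorithm, whose formulation the abstract does not give. It is a minimal stand-in under stated assumptions: density is a Gaussian kernel sum, an exemplar is any sample that is the densest one within a fixed radius, and memberships use the standard fuzzy c-means formula. The function names and the parameters `bandwidth`, `radius` and `m` are all hypothetical.

```python
import math


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def local_density(points, bandwidth):
    """Gaussian-kernel density of each sample (assumed density estimate)."""
    density = []
    for i, p in enumerate(points):
        total = 0.0
        for j, q in enumerate(points):
            if i != j:
                total += math.exp(-sq_dist(p, q) / (2.0 * bandwidth ** 2))
        density.append(total)
    return density


def pick_exemplars(points, density, radius):
    """Indices of samples that are the densest within `radius` (assumed rule).

    The number of exemplars is not given in advance; it follows from the data.
    """
    exemplars = []
    for i, p in enumerate(points):
        is_peak = True
        for j, q in enumerate(points):
            if i == j or sq_dist(p, q) > radius ** 2:
                continue
            # A neighbour wins if it is denser, or equally dense with a lower index.
            if density[j] > density[i] or (density[j] == density[i] and j < i):
                is_peak = False
                break
        if is_peak:
            exemplars.append(i)
    return exemplars


def fuzzy_memberships(points, centers, m=2.0):
    """Fuzzy c-means style soft partition of the samples over the centers."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.sqrt(sq_dist(p, c)) for c in centers]
        if 0.0 in dists:
            # The sample coincides with a center: give it full membership there.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(row)
            row = [r / total for r in row]
        else:
            row = [1.0 / sum((dk / dj) ** power for dj in dists) for dk in dists]
        memberships.append(row)
    return memberships


if __name__ == "__main__":
    # Two small, well-separated groups of 2-D samples (made-up data).
    pts = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (0.05, 0.05),
           (5, 5), (5.1, 5), (5, 5.1), (5.1, 5.1), (5.05, 5.05)]
    dens = local_density(pts, bandwidth=0.5)
    idx = pick_exemplars(pts, dens, radius=1.0)
    soft = fuzzy_memberships(pts, [pts[i] for i in idx])
    print(idx)
    print(soft[0])
```

Because the exemplars are actual samples, each cluster center can be inspected directly, which is the interpretability property the abstract attributes to exemplar clustering. Note that under this simple rule an isolated outlier would also become an exemplar, so a real method needs a more careful selection step.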
Keywords/Search Tags:Supervised/Unsupervised learning, Ensemble learning, Imbalanced classification, Clustering, Region of interest marked for image