Research On Local Classification Methods Based On Bayesian Decision Theory And The Applications

Posted on: 2017-11-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C S Mao
GTID: 1318330533451481
Subject: Computer Science and Technology
Abstract/Summary:
Classification is an important research area in machine learning and data mining. In classification problems, a collection of samples with known labels is usually assembled as the training set, and each new sample is classified, i.e. the label of an unlabeled sample is predicted, using the evidence of the samples in the training set. Local learning is an important approach in machine learning. It learns from a subset of the training set and builds a dedicated local model for the corresponding local region. Local classification applies the idea of local learning to classification problems. Because the local classification model is built from samples in the local region that are highly relevant to the query sample(s), it can reflect the information of the query sample(s) and achieve good classification performance. K-Nearest Neighbors (kNN) classification, a specific local classification algorithm, has been extensively studied and applied in machine learning, pattern recognition and data mining owing to its simplicity, comprehensibility and ease of implementation.

Current research on local classification is mainly based on kNN and lacks a systematic treatment. Taking Bayesian decision theory and local probabilistic models as the core idea, we study local classification thoroughly and propose a general form of local classification that can output membership probabilities. In addition, this thesis investigates in depth the two key problems of local classification, i.e. local region selection and the corresponding local model selection. We theoretically analyze the relationship between the two, which provides guidance for the specialization of local classification. Finally, the local classification method is applied to an electroencephalogram (EEG)-based individual identification system and achieves a competitive recognition result.

The main work of this thesis is as follows:

1. The local model selection problem in local classification is specialized to the neighborhood information organization problem in kNN classification. We propose a local distribution-based kNN (LD-kNN) classification algorithm, based on Bayesian theory, for organizing neighborhood information in kNN. LD-kNN constructs a neighborhood for a query sample and estimates the local distribution of the neighborhood from the training samples it contains. The estimated local distribution is then used to calculate, through Bayes' theorem, the membership probability of the query sample for each class, and the query sample is assigned to the class with the greatest membership probability. Through the local distribution, LD-kNN comprehensively takes into account the quantity information, distance information and sample position information contained in the neighborhood, which improves and refines the existing kNN method. We have conducted a series of experiments on a number of real and synthetic datasets to study the performance characteristics of LD-kNN; the experimental results demonstrate its dimensional scalability, efficiency, effectiveness and robustness compared with some other state-of-the-art classifiers.

2. For the estimation of the local distribution, we redefine the local probability distribution and propose a Local Probabilistic Model (LPM)-based probability Density Estimation (LPM-DE) method. Because true probability distributions are complex in practice, a common parametric probabilistic model usually has difficulty modeling the true distribution effectively, while a nonparametric probabilistic model usually needs many more training samples and is less efficient. LPM-DE is a compromise between a parametric probabilistic model and a nonparametric probabilistic model: it estimates a nonparametric model globally and a parametric model locally. By selecting an appropriate local region and the corresponding local probabilistic model, LPM-DE can overcome the defects of both parametric and nonparametric models to some extent and can estimate the global probability density effectively. A series of experiments on simulated datasets demonstrates the effectiveness of LPM-DE.

3. Based on Bayesian decision theory, we use local probabilistic models to solve the probability estimation problem in Bayesian classification and propose an LPM-based Bayesian classification (LPM-BC) method. LPM-BC is a general form of local classification methods. By selecting different local regions and the corresponding local probabilistic models, LPM-BC can be specialized into various local classification algorithms; both the traditional kNN classification and the proposed LD-kNN are specializations of LPM-BC. LPM-BC makes local classification probabilistic and can output the membership probability of the query sample for each class for subsequent probabilistic inference, which is a significant advantage over classifiers that output only class labels. In addition, we analyze and discuss local region selection and local probabilistic model selection for LPM-BC, and summarize the relationship between the two in local classification. Experimental results on a series of simulated and real datasets show that LPM-BC achieves good classification performance given an appropriate local region and the corresponding local probabilistic model.

4. The local classification method is applied to the field of electroencephalogram (EEG)-based biometrics. We have designed and implemented an EEG-based pervasive individual identification system. The system receives and analyzes the EEG signal of the subject in real time, extracts identity-related features from the preprocessed EEG signal, and then employs local classification methods to establish an appropriate local probabilistic model from the extracted features in order to identify the subject. In the experiments, we implement a local classification algorithm based on local probabilistic centers (LPC) and use LPC to assign each session of EEG signals to a subject. Compared with other state-of-the-art classification algorithms, the local classification method LPC achieves better recognition performance.

In this thesis, we study local classification in depth and propose a general method of local classification based on Bayesian decision theory. This method can output classification results in probabilistic form. By selecting different parameters, the method can be specialized to most of the existing classification algorithms, and it has important theoretical value and a wide range of applications.
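
The local Bayesian classification idea described in the abstract (build a neighborhood, estimate a local distribution per class, apply Bayes' theorem, output membership probabilities) can be illustrated with a minimal Python sketch. This is not the thesis's LD-kNN or LPM-BC implementation: the function name `local_bayes_knn_proba`, the Euclidean neighborhood, the choice of a Gaussian with diagonal covariance as the local model, and the `var_floor` regularizer are all assumptions made for illustration.

```python
import numpy as np

def local_bayes_knn_proba(X_train, y_train, query, k=10, var_floor=1e-3):
    """Hypothetical sketch: class membership probabilities from a local model.

    Assumptions (not taken from the thesis): Euclidean k-nearest-neighbor
    region, one diagonal-covariance Gaussian per class inside that region,
    and a small variance floor to keep the density finite.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    query = np.asarray(query, dtype=float)
    classes = np.unique(y_train)

    # Local region: the k training samples closest to the query.
    dists = np.linalg.norm(X_train - query, axis=1)
    idx = np.argsort(dists)[:k]
    X_nb, y_nb = X_train[idx], y_train[idx]

    # Unnormalized log posterior per class; -inf marks classes absent locally.
    log_post = np.full(len(classes), -np.inf)
    for j, c in enumerate(classes):
        X_c = X_nb[y_nb == c]
        if len(X_c) == 0:
            continue
        # Local prior: share of the neighborhood belonging to class c.
        prior = len(X_c) / len(idx)
        # Local class-conditional density: diagonal Gaussian fitted to X_c.
        mean = X_c.mean(axis=0)
        var = X_c.var(axis=0) + var_floor
        log_lik = -0.5 * np.sum(np.log(2 * np.pi * var) + (query - mean) ** 2 / var)
        log_post[j] = np.log(prior) + log_lik

    # Bayes' theorem: normalize in log space for numerical stability.
    log_post -= log_post.max()
    post = np.exp(log_post)
    return classes, post / post.sum()

# Usage on synthetic data: two well-separated clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
classes, proba = local_bayes_knn_proba(X, y, [5.0, 5.0], k=10)
predicted = classes[np.argmax(proba)]
```

Under these assumptions, the local prior carries the quantity information of the neighborhood, while the fitted mean and variance carry its position and spread. Replacing the Gaussian with a constant density would reduce the rule to majority-vote kNN, which is one way to read the abstract's claim that kNN is a specialization of the general form.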
Keywords/Search Tags: local classification, local probabilistic model, Bayesian decision, k-nearest neighbors, neighborhood, machine learning, data mining