
The Comparison And Optimization Method Of Linear Discriminants

Posted on: 2016-07-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Li
GTID: 2298330467477376
Subject: Computer Science and Technology
Abstract/Summary:
Although the linear classifier is one of the simplest classifiers in pattern recognition, it can achieve good results in many applications. Because it is simple, easy to implement, and undemanding of computing resources, it is widely used. The Fisher linear discriminant (FLD) provides a solution for finding the weight vector, but it gives no clear prescription for choosing the threshold that ultimately determines the location of the separating hyperplane. Commonly used thresholds tend to be biased toward one class on imbalanced problems, which degrades classification performance. This thesis shows that the main factor influencing the FLD is the imbalance of the sample distribution regions, and puts forward several empirical thresholds aimed at imbalanced problems. Each threshold may achieve the best result under a specific distribution or a specific evaluation criterion; by studying the performance of the different thresholds under different evaluation criteria, we determine the scope of application of each.

The pseudo-inverse linear discriminant (PILD) is another widely used linear classifier. This thesis proves that the commonly used assumption about the expected output in the pseudo-inverse method is unreasonable, shows that the FLD and the PILD are not necessarily equivalent even under certain conditions, and studies the influence of the input data on the final result.

Compared with decision trees, neural networks, and other complex classifiers, linear classifiers are less likely to overfit because of their simple assumption that the samples can be roughly separated into two groups by a hyperplane. This thesis argues that the performance of linear classifiers such as the FLD and the pseudo-inverse method can be further improved by combining them with the AdaBoost algorithm; we analyze the characteristics of AdaBoost and use it to improve the performance of the FLD and the PILD.

Finally, the thesis studies the effect of feature representation on classifier performance. It suggests performing dimensionality reduction, rather than adding a tiny perturbation, when the matrix to be inverted is singular, and it proposes a binary-decimal coding method that improves classifier performance while preserving the internal structure of the original data. Experiments show that choosing the right threshold, adopting the proposed feature representation, and combining with AdaBoost together improve the performance of the FLD and the PILD.
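
As a concrete illustration of the threshold question above, the following minimal sketch (in NumPy, assuming two-class training arrays X0 and X1) computes the Fisher direction w from Sw^(-1)(m1 - m0) and then lists a few candidate thresholds on the projected data. The candidates shown (mean midpoint, standard-deviation-weighted midpoint, overlap midpoint) are generic illustrations only, not the specific empirical thresholds proposed in the thesis.

import numpy as np

def fisher_direction(X0, X1, reg=1e-6):
    """Fisher linear discriminant direction w ~ Sw^(-1)(m1 - m0).

    X0, X1: (n0, d) and (n1, d) sample arrays for the two classes.
    reg: small ridge term in case the within-class scatter is singular.
    """
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    S0 = (X0 - m0).T @ (X0 - m0)
    S1 = (X1 - m1).T @ (X1 - m1)
    Sw = S0 + S1 + reg * np.eye(X0.shape[1])
    return np.linalg.solve(Sw, m1 - m0)

def candidate_thresholds(w, X0, X1):
    """A few illustrative thresholds on the projected data."""
    p0, p1 = X0 @ w, X1 @ w
    return {
        # midpoint of the projected class means (the textbook default)
        "mean_midpoint": 0.5 * (p0.mean() + p1.mean()),
        # midpoint weighted by projected standard deviations; the cut
        # moves toward the more compact class, leaving more room for
        # the class with the wider distribution region
        "std_weighted": (p0.std() * p1.mean() + p1.std() * p0.mean())
                        / (p0.std() + p1.std()),
        # midpoint between class 0's largest and class 1's smallest
        # projection (an overlap-region heuristic)
        "range_midpoint": 0.5 * (p0.max() + p1.min()),
    }

def predict(X, w, b):
    """Assign class 1 when the projection exceeds the threshold b."""
    return (X @ w > b).astype(int)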
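The pseudo-inverse discriminant discussed above is conventionally obtained as a least-squares fit of the augmented data matrix to coded class targets. The sketch below uses the common +/-1 target coding; this is precisely the kind of "expected output" assumption the thesis questions, so it should be read as the conventional baseline rather than the thesis's corrected formulation.

def pild_fit(X0, X1):
    """Pseudo-inverse linear discriminant in its conventional form:
    a least-squares fit of [X, 1] to +/-1 class targets.
    Returns (w, b) such that sign(X @ w + b) gives the predicted class.
    """
    X = np.vstack([X0, X1])
    # append a constant column so the bias is learned jointly with w
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    # conventional +/-1 target coding (the baseline assumption)
    t = np.concatenate([-np.ones(len(X0)), np.ones(len(X1))])
    # np.linalg.pinv handles rank-deficient A, e.g. when d > n
    wb = np.linalg.pinv(A) @ t
    return wb[:-1], wb[-1]

def pild_predict(X, w, b):
    return (X @ w + b > 0).astype(int)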
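The combination with AdaBoost can be sketched with the standard AdaBoost.M1 weight-update loop over a generic weak learner. The interface fit_weak(X, y, sample_weight) is a hypothetical placeholder for a sample-weighted FLD or PILD; the abstract does not specify how the base classifiers incorporate the boosting weights, so this shows only the boosting side of the combination.

def adaboost(X, y, fit_weak, n_rounds=20):
    """AdaBoost.M1 over a generic weak learner.

    y: labels in {-1, +1}.
    fit_weak(X, y, sample_weight): returns a callable h with h(X) in {-1, +1}.
    """
    n = len(y)
    w = np.full(n, 1.0 / n)              # uniform initial sample weights
    learners, alphas = [], []
    for _ in range(n_rounds):
        h = fit_weak(X, y, w)
        pred = h(X)
        err = np.sum(w * (pred != y))    # weighted training error
        if err >= 0.5:                   # no better than chance: stop boosting
            break
        err = max(err, 1e-12)            # guard against a perfect weak learner
        alpha = 0.5 * np.log((1.0 - err) / err)
        w *= np.exp(-alpha * y * pred)   # up-weight the misclassified samples
        w /= w.sum()
        learners.append(h)
        alphas.append(alpha)

    def strong(Xq):
        scores = np.zeros(len(Xq))
        for a, h in zip(alphas, learners):
            scores += a * h(Xq)
        return np.sign(scores)           # weighted vote of the weak learners

    return strong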
Keywords/Search Tags:FLD, PILD, Adaboost, Thresholds, Imbalanced datasets