Research And Application Of Classifier With Confidence

Posted on: 2010-07-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Z Wang
Full Text: PDF
GTID: 1118360275998256
Subject: Control theory and control engineering
Abstract/Summary:
Researchers working on classification in high-risk areas face three challenges: 1) Can we develop a classification algorithm that outputs predictions coupled with a confidence level? 2) Are these confidences really valid, i.e., is the accuracy rate guaranteed by the confidence level? 3) Can the algorithm give a prediction with a confidence level tailored to each individual instance; in other words, can it provide a prediction that corresponds to a predefined confidence level?

To address these challenges, we have introduced a method that uses transductive inference and the randomness test of i.i.d. sequences to develop our solution. The recently emerged Conformal Predictor (CP) is an alternative solution that can output predictions with valid confidence. However, the CP framework still has certain disadvantages, such as its inherent computational cost and the lack of guidance for designing the example nonconformity measure. We have focused on improving and enhancing CP, and have proposed a new Hybrid-Compression Conformal Predictor (HCCP) that performs better in practice.

HCCP aims to strike a good balance between predictive performance and computational efficiency. It maintains relatively high predictive performance while greatly improving computational efficiency on large data sets. HCCP divides the training examples into two subsets (called the training set and the validation set, respectively) and carries out the prediction process in two stages. First, it derives a compression model M from the training set. Second, for each example in the validation set, it assigns the new features generated by M, which the classical CP algorithm then uses to output a prediction with a confidence level.
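How a conformal predictor turns nonconformity scores into a prediction at a predefined confidence level can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes the nonconformity scores have already been computed, it keeps the reference scores fixed instead of recomputing them for every candidate label as a fully transductive CP would, and the names `p_value`, `prediction_set`, and the numeric scores are hypothetical.

```python
def p_value(reference_scores, candidate_score):
    """Share of examples at least as nonconforming as the candidate.

    The +1 terms count the new example itself, as is usual for conformal
    p-values computed against a fixed set of reference scores.
    """
    at_least = sum(1 for s in reference_scores if s >= candidate_score)
    return (at_least + 1) / (len(reference_scores) + 1)


def prediction_set(reference_scores, candidate_scores, confidence):
    """Keep every label whose p-value exceeds the significance level."""
    epsilon = 1.0 - confidence  # allowed error rate
    return {
        label
        for label, score in candidate_scores.items()
        if p_value(reference_scores, score) > epsilon
    }


# Hypothetical scores for nine validation examples (assumed, not from the source).
reference_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
# Hypothetical scores of one new example under each candidate label.
candidate_scores = {"normal": 0.25, "fault": 0.95}

print(prediction_set(reference_scores, candidate_scores, confidence=0.85))
print(prediction_set(reference_scores, candidate_scores, confidence=0.95))
```

Raising the confidence level lowers the significance level, so more labels survive and the prediction set can only grow; this is the sense in which the prediction is tied to the confidence level chosen in advance.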
We have proposed a method based on supervised metric learning to transfer useful information from the first stage to the second. Specifically, we have incorporated an adaptive kernel-based distance metric learning method (in HCCP-KerNN) and the random forest algorithm (in HCCP-RF) to realize the supervised metric learning and the example nonconformity measure. The approach is evaluated in simulation on the Tennessee Eastman Process (TEP), a standard large data set. The applicability and effectiveness of the proposed HCCP-RF algorithm are illustrated on this online fault-detection task for a large-scale industrial process.

To deal with small-sample classification, we have also put forward the non-partition HCCP-RF algorithm, which dispenses with partitioning the learning set. It is evaluated in simulation on the traditional Chinese chronic gastritis data set, a typical small-sample problem. The experiment shows that the predictions of the non-partition HCCP-RF algorithm are both informative and effective.

Finally, we summarize our work and outline future research.
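One way a random forest can supply both a learned similarity and a nonconformity measure is through forest proximity, sketched below. This is an assumption for illustration, not the HCCP-RF definition from the dissertation: the abstract does not give the formula, so the ratio used here, the function names, and the leaf indices are all hypothetical. The leaf indices stand in for what a trained forest would report (which leaf each example reaches in each tree).

```python
def proximity(leaves_a, leaves_b):
    """Fraction of trees in which two examples fall into the same leaf."""
    shared = sum(1 for a, b in zip(leaves_a, leaves_b) if a == b)
    return shared / len(leaves_a)


def _mean_top(values, k):
    """Mean of the k largest values, or 0.0 when there are none."""
    top = sorted(values, reverse=True)[:k]
    return sum(top) / len(top) if top else 0.0


def nonconformity(idx, label, leaves, labels, k=1):
    """Assumed measure: closeness to other classes over closeness to `label`.

    A large value means that example `idx`, if given `label`, sits nearer
    to examples of other classes than to examples of its supposed class.
    """
    same, other = [], []
    for j, (leaf_row, y) in enumerate(zip(leaves, labels)):
        if j == idx:
            continue  # never compare an example with itself
        p = proximity(leaves[idx], leaf_row)
        (same if y == label else other).append(p)
    tiny = 1e-9  # avoids division by zero when no same-class neighbour is close
    return (_mean_top(other, k) + tiny) / (_mean_top(same, k) + tiny)


# Hypothetical leaf indices for four examples in a three-tree forest.
leaves = [[1, 4, 2], [1, 4, 3], [2, 5, 7], [2, 5, 6]]
labels = ["normal", "normal", "fault", "fault"]

print(nonconformity(0, "normal", leaves, labels))  # small: fits its class
print(nonconformity(0, "fault", leaves, labels))   # large: does not fit
```

Scores of this kind would be the input to the conformal p-value computation, so the forest plays the role of the compression model whose output the CP stage consumes.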
Keywords/Search Tags: classification problem, prediction with confidence, conformal predictor