Research and Application of Binary Classification Based on Score Function

Posted on: 2024-07-27
Degree: Master
Type: Thesis
Country: China
Candidate: J J Li
Full Text: PDF
GTID: 2530307106978269
Subject: Applied statistics
Abstract/Summary:
With the continuing development of information technology and artificial intelligence, and with the growing capacity to collect massive data, processing big data has become a top priority. Classification in particular has a wide range of applications in social science, economic management and medicine, so classifying and predicting data efficiently across these fields, and extracting valuable statistical information from it, is of great significance. Starting from the idea that a classifier is better the higher its probability of a correct positive judgment and the lower its probability of a false positive judgment, this thesis proposes a new classification method and then extends it to other settings to make classification more accurate and efficient. The specific research content is as follows.

Chapter 1 introduces the research background and significance of the thesis, reviews the development of classification methods in China and abroad and the state of research on binary classification in variable selection and semi-supervised learning, and sets out the main research content and innovations of the thesis.

Chapter 2 proposes two probabilistic classification methods based on score functions: a method that maximizes a statistic in likelihood-ratio form (MLR) and a method that maximizes a statistic in the form of the Kullback-Leibler divergence (MKL). It proves the consistency of the proposed MKL estimator theoretically. Extensive simulation studies and an application to a heart failure dataset show that the MKL method classifies better than the MLR method and has advantages over some existing classification methods in predictive ability, computational complexity and practical interpretability.

Chapter 3 addresses response variables that comprise two target variables, together with heterogeneity in the data. It uses the alternating direction method of multipliers (ADMM) to perform subgroup analysis of the data and then applies MKL classification to the new data. Multiple numerical simulations and an application to cervical cancer risk factors show that classification after subgroup analysis performs better.

Chapter 4 evaluates the predictive performance of the MKL classification model in a semi-supervised setting. A large number of numerical simulations and an application to obesity estimation show that the semi-supervised estimator is more effective than the supervised estimator.

Chapter 5 summarizes the research content of the thesis, points out some of its shortcomings, and discusses how they might be addressed in further work.
Keywords/Search Tags: classification, scoring function, ADMM algorithm, semi-supervised