Research on Performance Evaluation of Classifiers Based on AUC

Posted on: 2017-01-03
Degree: Master
Type: Thesis
Country: China
Candidate: S Jiang
GTID: 2308330482489805
Subject: Computer technology
Abstract/Summary:
In recent years, with the development of computer technology, artificial intelligence has attracted wide attention and research, and machine learning has penetrated many fields. Classification is one of the important research topics in machine learning, and the performance of a classification model is often described by accuracy and recall. However, these measures cannot reliably reflect performance when the data set is imbalanced. The ROC (receiver operating characteristic) curve is a method for evaluating a classification model regardless of how the data are distributed, and AUC (area under the curve) is the measure computed from the ROC curve. Because the class proportions of a data set are often uncertain, the ROC curve has become increasingly important in the performance evaluation of classification models, and AUC and ROC are now widely used for this purpose. However, ordinary AUC and ROC have some defects. The first, which is widely recognized, is that the AUC calculation uses only the ranking of the scores and ignores their values. The second is that AUC is insensitive to misclassification cost, which strongly influences the performance of a classification model. This thesis therefore consists of two parts. The first part proposes a cutting point and a cutting function to address the problem of score values; the resulting measures are called sor ROC and sor AUC. The cutting point and cutting function are set according to the positive-negative (P-N) pairs to obtain the sor ROC curve, and sor AUC is the area under that curve. The second part, based on false positives and false negatives, proposes v AUC from a different view of error cost. v AUC is viewed through a 1x1 board of uniform thickness but varying density: the v ROC curve cuts the board, and the mass of the remaining board is v AUC. After proposing the new algorithms, this thesis derives some of their properties and uses a simple example to illustrate their advantages over ordinary AUC.
Finally, the UCI data set is used as the experimental data to compare the new algorithms with AUC. The experiments are divided into two parts. The first compares sor AUC with AUC, s AUC and p AUC. It shows that sor AUC is more accurate than AUC and has the same performance-evaluation ability as s AUC and p AUC. Moreover, the sor ROC curve is simpler and more reliable than p ROC. The second experiment shows that v AUC makes full use of the error cost of the samples, so it gives a more accurate and targeted assessment of the classification model. When error costs are present, v AUC performs better in performance evaluation and is closer to the real results.
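The first defect named in the abstract, that ordinary AUC uses only the ranking of the scores, can be illustrated with a minimal sketch. It computes ordinary AUC as the fraction of positive-negative pairs ranked correctly, which is the standard pairwise formulation and not the thesis's sor AUC or v AUC. The function name `pairwise_auc` and the score lists are hypothetical, chosen only for illustration.

```python
def pairwise_auc(pos_scores, neg_scores):
    """Ordinary AUC: share of positive-negative pairs in which the
    positive sample scores higher; a tie counts as half a pair."""
    correct = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos_scores) * len(neg_scores))


# Hypothetical scores from two classifiers on the same six samples.
# Both produce the same ranking, but the first separates the classes
# with wide score margins and the second with very narrow ones.
confident_pos, confident_neg = [0.9, 0.8, 0.4], [0.6, 0.2, 0.1]
hesitant_pos, hesitant_neg = [0.53, 0.52, 0.50], [0.51, 0.49, 0.48]

auc_confident = pairwise_auc(confident_pos, confident_neg)
auc_hesitant = pairwise_auc(hesitant_pos, hesitant_neg)

# Only the order of the scores matters, so both values are identical.
print(auc_confident, auc_hesitant)
```

Both classifiers rank 8 of the 9 pairs correctly, so ordinary AUC cannot tell them apart even though their score margins differ greatly. This is the kind of information that a score-aware measure is meant to recover.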
Keywords/Search Tags:machine learning, AUC, classification model, error cost, score