
C5.0 Optimization Algorithm Based on the Cost Matrix and Its Application in Customer Relationship Management for Hospitals

Posted on: 2015-06-13  Degree: Master  Type: Thesis
Country: China  Candidate: R Q Zhang  Full Text: PDF
GTID: 2298330434459102  Subject: Computer Science and Technology
Abstract/Summary:
Customer relationship management (CRM), as a relatively new concept, has seen a certain amount of application and development in all walks of life in China, but its use in the medical industry is still very limited. With the further development of health care reform, CRM in the health care industry is becoming a hot topic. Among the classification applications of data mining, the decision tree is the most widely used classification algorithm; it is simple, efficient, and achieves high classification accuracy. This work therefore selects the C5.0 decision tree algorithm to classify hospital patients and introduces a cost matrix to optimize C5.0, thereby establishing a patient classification model with a lower misclassification cost.

This thesis surveys the techniques commonly used in data mining and analyzes decision tree classification algorithms in depth. It studies the optimization of the C5.0 decision tree algorithm with a cost matrix and its application to hospital patient classification and, based on the practical task of classifying hospital patients, analyzes the cost matrix, the degree of pruning, and the Boosting algorithm of the data mining model. For the cost matrix optimization, three cost values are introduced: COST(high) for misclassifications with a high cost, COST(middle) for misclassifications with an ordinary cost, and COST(low) for misclassifications with a low cost. The conditions for choosing these values are analyzed, and comparative analysis yields the final values COST(high) = 3, COST(middle) = 2, and COST(low) = 1.
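The role of the cost matrix can be illustrated with a minimal sketch. It uses the three cost levels given in the abstract, but the class names (`high`, `medium`, `low`) and the assignment of cost levels to particular errors are assumptions made for illustration only; the abstract does not state which misclassification receives which cost. The sketch shows how two sets of predictions with the same error rate can carry different total misclassification costs.

```python
# Cost levels reported in the abstract.
COST_HIGH, COST_MIDDLE, COST_LOW = 3, 2, 1

# HYPOTHETICAL cost matrix keyed by (actual class, predicted class).
# The class names and the mapping of errors to cost levels are
# illustrative assumptions, not taken from the thesis.
COST_MATRIX = {
    ("high", "low"): COST_HIGH,
    ("high", "medium"): COST_MIDDLE,
    ("medium", "low"): COST_MIDDLE,
    ("medium", "high"): COST_LOW,
    ("low", "medium"): COST_LOW,
    ("low", "high"): COST_LOW,
}


def misclassification_cost(actual, predicted, cost_matrix=COST_MATRIX):
    """Sum the cost of every wrong prediction; correct ones cost nothing."""
    return sum(cost_matrix[(a, p)] for a, p in zip(actual, predicted) if a != p)


def error_rate(actual, predicted):
    """Fraction of predictions that differ from the actual class."""
    errors = sum(1 for a, p in zip(actual, predicted) if a != p)
    return errors / len(actual)


# Toy labels, invented for the example.
actual = ["high", "high", "low", "low"]
model_a = ["low", "high", "low", "low"]    # one error: high predicted as low
model_b = ["high", "high", "high", "low"]  # one error: low predicted as high

for name, predicted in (("model_a", model_a), ("model_b", model_b)):
    print(name, error_rate(actual, predicted),
          misclassification_cost(actual, predicted))
```

Under these assumed costs both toy models make one error in four, yet the first incurs a cost of 3 and the second a cost of 1, which is the kind of difference a cost-guided model is meant to exploit.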
For the degree of pruning, two reference values are used: the complexity and the classification accuracy of the tree model. The optimal degree of pruning is obtained by experimentally comparing these two reference values. For the Boosting algorithm, the number of iterations and the overfitting problem are analyzed. Comparison on the test samples reveals overfitting, so Boosting iterations are not used in this modeling. On this basis, data on the hospital's inpatients are sampled, preprocessed, and extracted for modeling; a classification model for hospitalized patients is built with the C5.0 decision tree algorithm; and the model is tested and analyzed with test data. The model is also applied in the inpatient classification module of a hospital CRM system, implementing the system's data management module, which can classify newly admitted inpatients by value.

The innovation of this thesis lies in its study of the C5.0 decision tree algorithm: predictive classification takes the cost of misclassification into account, conditions for choosing the misclassification cost values are given, and a cost matrix is established to guide modeling, so that misclassification cost is minimized while the model's overall prediction error rate changes little. The analysis of the Boosting algorithm also finds that Boosting iterations lead to overfitting of the modeling data.

The resulting patient classification model has a low degree of risk and good stability, and it enables the hospital's CRM system to classify new patients by customer value. However, the model's classification error rates on the modeling data and the test data are 8.29% and 8.17%, respectively, so classification accuracy can still be improved.
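The comparison between error on the modeling data and error on the test data can be sketched as a simple generalization-gap check. The function names and the one-percentage-point tolerance below are hypothetical choices for illustration; the abstract does not say what criterion the thesis used to detect overfitting. Only the two error rates are taken from the abstract.

```python
def generalization_gap(train_error, test_error):
    """Test error minus training error, in percentage points."""
    return test_error - train_error


def shows_overfitting(train_error, test_error, tolerance=1.0):
    """Flag overfitting when test error exceeds training error by more
    than `tolerance` points. The threshold is a hypothetical choice."""
    return generalization_gap(train_error, test_error) > tolerance


# Error rates (percent) reported in the abstract for the final model.
final_train_error, final_test_error = 8.29, 8.17

print(shows_overfitting(final_train_error, final_test_error))
```

With the reported rates the test error is slightly below the error on the modeling data, so this check does not flag the final model, which is consistent with the abstract's description of it as stable.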
Keywords/Search Tags: Data mining, decision tree, Customer Relationship Management (CRM), Bipartition of hospital customers, C5.0 algorithm