Chronic diseases have long been one of the major threats to residents' health in China. Compared with other diseases, they incur extremely high medical costs, require long treatment, and are less likely to be cured. Chronic diseases are generally caused by the accumulation of long-term unhealthy habits, and chronic kidney disease accounts for a large share of them. Receiving medical treatment at early onset can therefore greatly reduce prevalence. Among the various prediction methods currently available, the decision tree algorithm is one of the more accurate and more widely applied. This paper proposes an improvement to the existing decision tree model and, on that basis, designs a chronic kidney disease prediction system to help doctors make better decisions. An empirical analysis shows that the improved decision model achieves higher accuracy than the traditional decision model. The main work of this paper is as follows. First, it analyzes the shortcomings of the C4.5 decision tree algorithm in discretizing continuous attributes: when searching for the best segmentation point, the training set must be traversed and sorted several times, and a large number of calculations must be performed. For a training set with numerous continuous attributes and many records, this increases the run time of model generation and lowers the efficiency of the process. To address this, the paper proposes a class-attribute-based weighted mean method: after the values of a continuous attribute are divided according to the possible values of the target class, the weighted mean of each subset is calculated and taken as the best segmentation point of the continuous attribute. This method has lower computational complexity and lower time complexity, which noticeably improves the efficiency of discretization. Second, the paper optimizes the original pruning method of the C4.5 algorithm by applying a misjudgment evaluation to prune the decision model, which further improves classification accuracy. To verify the improvement, the paper uses the chronic_kidney_disease dataset from the UCI standard database to compare accuracy before and after optimization. Finally, it designs a chronic kidney disease prediction system that generates a decision model with the optimized algorithm and predicts a patient's disease by extracting information records from the database.
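
The class-attribute-based weighted mean discretization summarized above could be sketched in Python roughly as follows. This is only one possible reading of the abstract, not the paper's implementation: the function names, the optional `weights` parameter, and the default of weighting each class mean by that class's share of records are all assumptions, since the abstract does not state the weighting scheme. The sample values are hypothetical.

```python
from collections import defaultdict


def class_weighted_mean_split(values, labels, weights=None):
    """Return a candidate segmentation point for one continuous attribute.

    values  : numeric attribute values, one per training record
    labels  : target-class value of each record
    weights : optional {class_label: weight}. ASSUMPTION: when omitted,
              each class is weighted by its share of the records.
    """
    if not values or len(values) != len(labels):
        raise ValueError("values and labels must be non-empty and equal in length")

    # Divide the attribute values by the value of the target class.
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    # Mean of the attribute inside each class subset.
    class_means = {c: sum(vs) / len(vs) for c, vs in groups.items()}

    if weights is None:
        weights = {c: len(vs) / len(values) for c, vs in groups.items()}

    # Weighted mean of the per-class means: a single pass, no sorting.
    total_weight = sum(weights[c] for c in class_means)
    return sum(weights[c] * class_means[c] for c in class_means) / total_weight


def discretize(values, split):
    """Map each value to 0 (<= split) or 1 (> split)."""
    return [0 if v <= split else 1 for v in values]


# Hypothetical attribute values and class labels, for illustration only.
sample_values = [1.0, 2.0, 3.0, 7.0, 9.0]
sample_labels = ["ckd", "ckd", "ckd", "notckd", "notckd"]
split = class_weighted_mean_split(sample_values, sample_labels)
binned = discretize(sample_values, split)
```

Unlike the threshold search in C4.5, this sketch needs one pass over the records and no sorting, which is the efficiency argument made in the abstract. With the default class-share weights the result equals the overall mean of the attribute, so the weighting actually used in the paper may well differ; a different scheme can be supplied through `weights`.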