Study On Medical Data Classification Based On Fuzzy Decision Tree

Posted on: 2019-05-04
Degree: Master
Type: Thesis
Country: China
Candidate: T Wang
Full Text: PDF
GTID: 2394330545952249
Subject: Engineering
Abstract/Summary:
With the rapid development of information technology and the widespread use of hospital information systems and digital medical devices, information on patient cases, tests, diagnoses, and treatments is steadily accumulating in medical databases. The need to discover the knowledge contained in this data effectively is increasingly urgent. Data mining technology can analyze and process these medical data resources to find valuable information, which can support scientific decisions in the diagnosis and treatment of diseases. The fuzzy decision tree combines fuzzy theory with the decision tree. It can deal with the ambiguity and uncertainty present in medical data, broaden the applicability of decision tree algorithms, and classify medical data more accurately and effectively. The main work of the thesis is as follows:

(1) Two fuzzy decision tree algorithms, the fuzzy ID3 algorithm and the min-ambiguity algorithm, are studied in detail. A fuzzy decision tree is induced and its rules are extracted on a fuzzy dataset, and the fuzzy rules are used to classify new instances. The differences between the crisp decision tree and the fuzzy decision tree are summarized.

(2) For continuous-valued attributes in medical datasets, the Kohonen feature mapping algorithm and triangular membership functions are used for fuzzification. Continuous-valued attributes can thus be partitioned smoothly, and the characteristics of the attributes in the dataset can be described naturally and reasonably.

(3) The fuzzy ID3 algorithm and the min-ambiguity algorithm are implemented in MATLAB. Together with implementations of the C4.5 and CART algorithms, they are used to construct classification models on publicly available medical datasets. The four decision tree algorithms are compared in terms of classification accuracy and number of rules. The fuzzy ID3 algorithm is found to have higher accuracy and fewer rules on each medical dataset, reflecting the advantages of the fuzzy decision tree algorithm in handling the numeric attributes of medical data and the ambiguity present in it.

(4) In fuzzy decision tree algorithms, some important parameters are usually set according to common practice or expert opinion. The thesis proposes using an improved particle swarm optimization algorithm to search intelligently for the parameter combination that improves the performance of the fuzzy decision tree. The fitness function combines training accuracy, test accuracy, generalization ability, and the scale of the tree. Classification models are constructed on the publicly available medical datasets using the optimized fuzzy decision tree algorithm, which verifies the necessity and effectiveness of the method. The method not only produces more accurate predictions but also explains each prediction in the form of a decision tree.
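
To make the fuzzification step and the fuzzy ID3 split criterion more concrete, here is a minimal Python sketch. It is not the thesis's MATLAB implementation. The function names `triangular_memberships` and `fuzzy_entropy` are hypothetical, the centers are supplied by hand rather than learned by Kohonen feature mapping, and the entropy uses the min operator for fuzzy intersection, which is one common choice for fuzzy ID3 and may differ from the formulation the thesis uses.

```python
import math


def triangular_memberships(x, centers):
    """Membership degrees of x in triangular fuzzy sets peaked at sorted centers.

    Assumption: neighbouring triangles overlap so that the degrees sum to 1,
    and values beyond the outermost centers belong fully to the edge set.
    """
    degrees = [0.0] * len(centers)
    if x <= centers[0]:
        degrees[0] = 1.0
        return degrees
    if x >= centers[-1]:
        degrees[-1] = 1.0
        return degrees
    for i in range(len(centers) - 1):
        left, right = centers[i], centers[i + 1]
        if left <= x <= right:
            w = (x - left) / (right - left)  # position between the two peaks
            degrees[i] = 1.0 - w
            degrees[i + 1] = w
            break
    return degrees


def fuzzy_entropy(term_degrees, class_degrees):
    """Fuzzy entropy of the class distribution inside one linguistic term.

    term_degrees: membership of each example in the term.
    class_degrees: one list per class, membership of each example in that class.
    Assumption: fuzzy intersection is taken with min.
    """
    total = sum(term_degrees)
    if total == 0:
        return 0.0
    entropy = 0.0
    for cls in class_degrees:
        p = sum(min(t, c) for t, c in zip(term_degrees, cls)) / total
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical attribute with three linguistic terms peaked at 20, 40 and 60.
print(triangular_memberships(30, [20, 40, 60]))
```

In a fuzzy ID3 style induction, an entropy of this kind would be computed for every linguistic term of a candidate attribute, weighted by each term's total membership, and the attribute with the lowest weighted entropy would be chosen for the split.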
Keywords/Search Tags: Fuzzy decision tree, Improved particle swarm optimization, Data mining, Medical data classification