An Improved ID3 Algorithm And Its Application In Credit Card Default Assessment

Posted on: 2020-06-21
Degree: Master
Type: Thesis
Country: China
Candidate: L Li
GTID: 2370330599459138
Subject: Statistics
Abstract/Summary:
Data mining is a new data analysis technology. Its algorithms can be divided into supervised learning (neural networks, support vector machines, decision trees, and regression), unsupervised learning (cluster analysis and association rule analysis), and data dimensionality reduction (principal component analysis, factor analysis, and so on). Common decision tree algorithms include ID3, C4.5, and CART. As the earliest decision tree algorithm, ID3 is widely used. This thesis studies the ID3 algorithm in depth and solves two of its problems: multi-value bias and the logarithmic operations required by information entropy. To address multi-value bias, the thesis proposes an improved ID3 algorithm with a modified information gain function, into which the correlation coefficient between an attribute and the class, together with the number of values the attribute takes, is introduced. The improved algorithm reduces the information gain of attributes that have many values but little correlation with the class, which solves the multi-value bias problem. To address the complex logarithmic operations in information entropy, the thesis simplifies the information entropy formula using the Taylor formula, transforming the logarithmic operations into non-logarithmic ones. The improvements are verified numerically on four classical data sets from UCI. The improved ID3 algorithm raises classification accuracy and, by simplifying the information entropy formula, reduces time complexity. Finally, the thesis applies the improved ID3 algorithm to credit card default assessment for banks and puts forward specific solutions to problems in the data set such as missing attribute values, discretization of attribute values, and attribute selection. In comparison, the improved ID3 algorithm achieves higher classification accuracy and lower algorithmic complexity. This application also provides decision support for bank staff.
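
The two problems named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's method: the abstract does not state the modified information gain function or the exact Taylor expansion used, so the log-free entropy below assumes the first-order expansion ln(p) ≈ p - 1, and the function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels, as used by classic ID3."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def taylor_entropy(labels):
    """Log-free stand-in for entropy.

    Assumption (not taken from the thesis): ln(p) is replaced by its
    first-order Taylor expansion around p = 1, ln(p) ~ p - 1, so that
    -p * log2(p) ~ p * (1 - p) / ln(2).
    """
    n = len(labels)
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values()) / math.log(2)


def information_gain(rows, labels, attr, entropy_fn=entropy):
    """Classic ID3 information gain for splitting on column index `attr`."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    # Weighted entropy of the subsets produced by the split.
    remainder = sum(len(g) / n * entropy_fn(g) for g in groups.values())
    return entropy_fn(labels) - remainder


# Hypothetical toy data: column 0 is a unique record ID (six distinct values),
# column 1 is a two-valued attribute that is only partly related to the class.
rows = [("id1", "high"), ("id2", "high"), ("id3", "low"),
        ("id4", "low"), ("id5", "high"), ("id6", "low")]
labels = ["yes", "yes", "no", "no", "no", "no"]

gain_id = information_gain(rows, labels, 0)
gain_level = information_gain(rows, labels, 1)
print("gain(record id):", round(gain_id, 3))
print("gain(level):    ", round(gain_level, 3))
print("log-free gain(level):", round(information_gain(rows, labels, 1, taylor_entropy), 3))
```

In this toy data the record ID obtains the larger classic gain even though it cannot generalize to new records, which is the multi-value bias the thesis sets out to correct. The log-free variant only shows how a Taylor expansion removes the logarithm from the split criterion; the thesis's own simplified formula may differ.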
Keywords/Search Tags:Data mining, Decision tree, ID3 algorithm, Simplified information entropy, Credit card default evaluation