
Self Adaptive Cost-sensitive Decision Tree Learning Methods

Posted on: 2017-06-04
Degree: Master
Type: Thesis
Country: China
Candidate: X J Li
GTID: 2348330485456512
Subject: Applied Mathematics
Abstract/Summary:
Cost-sensitive decision tree learning is an important research topic in data mining for classification problems, and it has received extensive attention from scholars both at home and abroad in recent years. More and more classification algorithms have been proposed. With the rapid development of internet technologies, modern applications usually produce large-scale data. Traditional classification methods struggle with data stream classification because such data have high-dimensional feature spaces, so the traditional algorithms need to be improved. This dissertation therefore proposes an attribute selection algorithm and three self-adaptive cost-sensitive learning algorithms based on C4.5. The main research work is as follows.

First, we propose an exponent-weighted algorithm for minimal cost attribute selection. Fan Min recently developed a backtracking algorithm to tackle this problem, but its efficiency on large data sets is often unacceptable. We therefore propose a heuristic algorithm instead. Compared with the backtracking algorithm, our algorithm significantly increases efficiency without being influenced by the misclassification cost setting.

Secondly, we build a cost-sensitive algorithm with an adaptive cut point selection mechanism. It selects cut points adaptively when building the classifier rather than evaluating every possible cut point of an attribute, which significantly improves the efficiency of evaluating numeric attributes for cut point selection. The effectiveness of the proposed algorithm is demonstrated in our experiments.

Thirdly, we propose a cost-sensitive algorithm with an adaptive attribute deletion mechanism. While selecting the attribute for each node, it removes some redundant attributes according to the values of the heuristic function. Compared with the CS-C4.5 and CS-Gain Ratio algorithms, the proposed algorithm significantly increases efficiency.

Finally, we propose a new cost-sensitive decision tree algorithm with an adaptive probabilistic pruning mechanism. The pruning probability depends on the change in cost before and after pruning. Experimental results show the efficiency and effectiveness of the probabilistic pruning mechanism for cost-sensitive decision trees.
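
As an illustration of the last idea only, the following is a minimal Python sketch of cost-driven probabilistic pruning. The abstract states that the pruning probability is related to the change in cost, but it does not give the formula. The logistic mapping, the `scale` parameter, and the function names `leaf_cost`, `prune_probability`, and `should_prune` are therefore assumptions made for this example, not the thesis's actual method.

```python
import math
import random


def leaf_cost(counts, cost_matrix):
    """Lowest total misclassification cost if the node predicts a single class.

    counts[i] is the number of samples whose true class is i.
    cost_matrix[i][j] is the cost of predicting j when the truth is i.
    """
    n_classes = len(counts)
    return min(
        sum(counts[i] * cost_matrix[i][j] for i in range(n_classes))
        for j in range(n_classes)
    )


def prune_probability(cost_before, cost_after, scale=1.0):
    """Hypothetical mapping from cost change to a pruning probability.

    Assumption: a logistic curve, so the probability is 0.5 when pruning
    leaves the cost unchanged and rises as pruning lowers the cost.
    """
    delta = cost_before - cost_after  # positive means pruning lowers cost
    z = max(-700.0, min(700.0, delta / scale))  # avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def should_prune(children_counts, cost_matrix, rng, scale=1.0):
    """Decide at random whether to collapse a subtree into one leaf."""
    # Cost of keeping the split: each child predicts its own cheapest class.
    cost_before = sum(leaf_cost(c, cost_matrix) for c in children_counts)
    # Cost after pruning: all samples fall into one merged leaf.
    merged = [sum(column) for column in zip(*children_counts)]
    cost_after = leaf_cost(merged, cost_matrix)
    p = prune_probability(cost_before, cost_after, scale)
    return rng.random() < p, p


if __name__ == "__main__":
    # Two classes; missing class 1 costs five times more than a false alarm.
    costs = [[0, 1], [5, 0]]
    children = [[8, 0], [1, 3]]  # per-child class counts
    decision, p = should_prune(children, costs, random.Random(0))
    print(decision, p)
```

In this toy case the split costs 1 and the merged leaf costs 9, so the sketch assigns a very small pruning probability. A deterministic pruner would never prune here; the random draw only leaves a small chance of doing so.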
Keywords/Search Tags: cost-sensitive learning, decision tree, rough sets, attribute selection, self-adaptive, C4.5 algorithm