
Tree Classifier In Variant Space

Posted on: 2009-10-13
Degree: Master
Type: Thesis
Country: China
Candidate: P He
Full Text: PDF
GTID: 2178360242993652
Subject: Computer application technology
Abstract/Summary:
Classification is an important research area in machine learning. Among existing classifiers, the decision tree is one of the most useful and effective tools. However, traditional decision tree algorithms typically sacrificed runtime performance to cope with the limited main memory of their era, and the univariate test at each tree node can only produce axis-parallel hyper-rectangle decision boundaries, which generalize poorly on complex datasets. This thesis addresses these limitations with the following contributions.

First, we propose Fast C4.5, a fast main-memory implementation of the C4.5 algorithm. Fast C4.5 uses a preprocessing step to extract the ordering of all tuples on every attribute, optimizes the evaluation of continuous-attribute splits with an indirect bucket sort combined with a bit-parallel technique, and accelerates the search for the cutoff point by binary search within the narrowest possible range. Some structural integration is also employed to reduce redundancy and improve efficiency. Fast C4.5 reduces part of the time complexity of C4.5, and experiments show that it greatly accelerates tree construction.

Second, we present the Latent Attribute Space Tree (LAST) classifier framework, which transforms the original attribute space into a more separable or more classifier-oriented latent attribute space, so that the hyper-rectangle decision boundary of traditional tree classifiers can be extended and their generalization ability improved. Under the LAST framework, we design two Singular-vector-space Oblique Decision Tree (SODT) algorithms. SODT constructs a singular vector space for the global and/or local data, builds a traditional decision tree (or individual tree nodes) in the new space, and thereby indirectly obtains an approximately optimal oblique decision tree for the original attribute space. Experimental results show that, compared with traditional univariate decision trees and other oblique decision tree algorithms, SODT achieves higher classification accuracy and more stable tree size, with tree-construction time comparable to that of univariate decision trees and much lower than that of other oblique decision trees.

Third, we propose the Nonlinear Manifold Mapping Classifier (NMMC) framework, which integrates three variable components, namely manifold mapping, the classifier, and test-data extension, into a unified framework for nonlinear classifier design. Under the NMMC framework, we propose the Spectral Space Decision Tree (SSDT) algorithm. SSDT implements the manifold mapping of NMMC as a spectral space transformation on the Laplacian matrix, which simplifies the relationship between the new conditional attributes and the class attribute, and it selects the decision tree as its classifier for simplicity. By integrating the class information of the training data into the originally unsupervised spectral space transformation, we also present a spectral space decision tree algorithm based on supervised manifold mapping, which maps the different classes of data into a new space with better separation. Experiments show that SSDT has a clear advantage over traditional decision tree algorithms in classification accuracy, decision tree size, and robustness.
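The split-evaluation idea behind Fast C4.5 described above, scanning a presorted continuous attribute once while maintaining running class counts, can be sketched as follows. This is a minimal Python illustration under our own assumptions, not the thesis's implementation: the indirect bucket sort, the bit-parallel technique, and the narrowest-range binary search are all omitted, and the function names are ours.

```python
import numpy as np

def entropy(counts):
    """Shannon entropy of a class-count vector."""
    total = counts.sum()
    if total == 0:
        return 0.0
    p = counts[counts > 0] / total
    return float(-(p * np.log2(p)).sum())

def best_split(x, y, n_classes):
    """Scan one presorted continuous attribute, maintaining running
    class counts, and return (best_gain, best_cutoff)."""
    order = np.argsort(x)          # ordering extracted once per attribute
    xs, ys = x[order], y[order]
    total = np.bincount(ys, minlength=n_classes).astype(float)
    left = np.zeros(n_classes)
    base = entropy(total)
    n = len(xs)
    best_gain, best_cut = 0.0, None
    for i in range(n - 1):
        left[ys[i]] += 1
        if xs[i] == xs[i + 1]:     # cutoffs only between distinct values
            continue
        right = total - left
        w = (i + 1) / n
        gain = base - w * entropy(left) - (1 - w) * entropy(right)
        if gain > best_gain:
            best_gain, best_cut = gain, (xs[i] + xs[i + 1]) / 2
    return best_gain, best_cut
```

Cutoffs are only placed between distinct adjacent values; the midpoint choice here is illustrative rather than the algorithm's exact cutoff rule.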
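The key observation behind SODT, that an axis-parallel split in the singular vector space is an oblique split in the original attribute space, can be sketched with a global SVD. The function name, the centering step, and the use of numpy are our illustrative assumptions, not the thesis's exact construction:

```python
import numpy as np

def singular_vector_space(X):
    """Map data into the space spanned by its right singular vectors.
    Returns the transformed data Z and the matrix Vt whose rows are
    the right singular vectors of the centered data."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt.T, Vt
```

A univariate threshold t on the j-th new attribute, z_j = Vt[j] @ (x - mean), corresponds to the oblique hyperplane Vt[j] @ (x - mean) <= t in the original attribute space, which is precisely the kind of boundary a univariate tree cannot represent directly.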
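The unsupervised spectral transformation that SSDT builds on can be sketched as a generic normalized-Laplacian embedding. The Gaussian affinity, the bandwidth sigma, and the choice of eigenvectors below are illustrative assumptions on our part, not the thesis's exact construction:

```python
import numpy as np

def spectral_embedding(X, sigma=1.0, k=2):
    """Unsupervised spectral transform: Gaussian affinity graph,
    symmetric normalized Laplacian, and the k smallest nontrivial
    eigenvectors as the new conditional attributes."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    deg = W.sum(axis=1)
    Dinv = 1.0 / np.sqrt(deg)
    # L = I - D^{-1/2} W D^{-1/2}
    L = np.eye(len(X)) - (Dinv[:, None] * W) * Dinv[None, :]
    vals, vecs = np.linalg.eigh(L)       # eigenvalues in ascending order
    return vecs[:, 1:k + 1]              # skip the trivial eigenvector
```

A standard decision tree can then be trained on the embedded coordinates; on well-clustered data, even the first nontrivial eigenvector (the Fiedler vector) tends to separate the groups, simplifying the relationship between the new attributes and the class attribute.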
Keywords/Search Tags: machine learning, classification, decision tree, C4.5, singular vector space, manifold mapping, spectral method