Data mining in databases: An extended decision tree approach and methodology in database environment

Posted on: 2001-05-12
Degree: Ph.D.
Type: Thesis
University: University of Alberta (Canada)
Candidate: Iliskovic, Sinisa A
Full Text: PDF
GTID: 2468390014454168
Subject: Computer Science
Abstract/Summary:
The work presented in this Ph.D. thesis is in the field of data mining and knowledge discovery. The research objective was to develop a new approach to the analysis and design of decision trees within a database framework. The decision trees are implemented in the database structure in order to analyze, and make better use of, the data sets that make up the database. Several novel data mining ideas are realized in this approach, including merging of different data sets, handling of missing values, and a database approach to pruning decision trees. Further ideas and implementation findings concern using database features to manipulate the process of obtaining the decision tree structures and the rules induced from them. These findings are presented through multiattribute class identifiers, a terminal node proximity algorithm, the predictive capabilities of the obtained rules and structures, and alternate branching of decision trees. Finally, the mathematical design and algorithmic implementation of the three new entropy functions used in this research (sinusoidal, parabolic, and triangular) are given. The decision tree is implemented using SQL (Structured Query Language) database languages, a choice motivated by their intended functionality when operating in a database environment. The research was performed on databases because only the database structure and the data in it can provide real output and feedback about the usefulness and accuracy of the final results.

In the course of this research, an extensive theoretical foundation incorporating the novel ideas mentioned above was created, and the algorithm was implemented in a practical software solution. A large volume of experiments and analysis was generated to validate and compare the theoretical ideas and the practical implementation.

This research provides a unique combination of a theoretical and scientific approach with inherent database features and capabilities. It can serve as a foundation for further data mining research in databases using the decision trees and algorithms developed during the course of this thesis.
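
The abstract names sinusoidal, parabolic, and triangular entropy functions and an SQL-based implementation, but gives no formulas or queries. The following is only a minimal sketch of what such a setup could look like, not the thesis's method. The three function definitions are assumed shapes (each is 0 for a pure node and 1 at a 50/50 class split), and the table `samples`, its columns `outlook` and `label`, and the helper names are hypothetical. It uses Python's standard `sqlite3` module so that the class counts for a candidate split come from a `GROUP BY` query.

```python
import math
import sqlite3

# Assumed two-class impurity shapes; p is the fraction of positive rows.
# These are illustrative guesses, not the definitions from the thesis.
def parabolic(p):
    return 4.0 * p * (1.0 - p)

def sinusoidal(p):
    return math.sin(math.pi * p)

def triangular(p):
    return 1.0 - abs(2.0 * p - 1.0)

def demo_connection():
    """Build a tiny in-memory table (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE samples (outlook TEXT, label INTEGER)")
    conn.executemany(
        "INSERT INTO samples VALUES (?, ?)",
        [("sunny", 1), ("sunny", 0), ("rain", 1), ("rain", 1)],
    )
    return conn

def weighted_impurity(conn, table, attribute, impurity):
    """Weighted impurity of splitting `table` on `attribute`.

    The database does the counting; identifiers are interpolated
    directly, so they must come from trusted code, not user input.
    """
    rows = conn.execute(
        f"SELECT {attribute}, COUNT(*), SUM(label) "
        f"FROM {table} GROUP BY {attribute}"
    ).fetchall()
    total = sum(n for _, n, _ in rows)
    return sum((n / total) * impurity(pos / n) for _, n, pos in rows)

if __name__ == "__main__":
    conn = demo_connection()
    for fn in (parabolic, sinusoidal, triangular):
        print(fn.__name__, weighted_impurity(conn, "samples", "outlook", fn))
```

A tree builder along these lines would evaluate `weighted_impurity` for each candidate attribute and split on the one with the lowest value. Swapping the impurity function changes only how mixed nodes are penalized, not the query.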
Keywords/Search Tags: Data mining, Decision, Approach