Graph Model Of Knowledge Discovery Method

Posted on: 2002-08-17  Degree: Doctor  Type: Dissertation
Country: China  Candidate: G Li  Full Text: PDF
GTID: 1118360032451222  Subject: Computer software and theory
Abstract/Summary:
Graphical models integrate probability theory with graph theory. They provide a natural tool for dealing with two problems in applied mathematics and engineering: uncertainty and complexity. In particular, they play an increasingly important role in the field of Data Mining and Knowledge Discovery. This dissertation presents an in-depth exploration of both theoretical and practical issues related to graphical models for KDD, including discretization, structure learning, parameter learning, and model explanation.

Firstly, an unsupervised algorithm and a supervised algorithm are proposed to discretize the original data set. The unsupervised algorithm is based on mixture probabilistic models; it can automatically divide the range of a specified attribute into intervals without prior knowledge or reference attributes. A mixture probabilistic model over all the attribute values is set up first; the parameters of this model are then determined using the EM algorithm; finally, the optimal number of intervals is found using the Bayes factor. The supervised algorithm, WILD (Weighted Information Loss Discretization), can be considered an extension of the Decision Tree Discretization algorithm, but it uses a bottom-up paradigm, as in the ChiMerge algorithm.

Secondly, this dissertation formulates the structure learning of a Directed Graphical Model as determining the structure that best approximates the probability distribution indicated by the data. The Maximum Mutual Information metric is summarized and analyzed, and a new metric, the Penalized Mutual Information metric, is proposed. An evolutionary algorithm is then proposed to search for the best structure among the alternatives. Two kinds of prior knowledge are incorporated to improve efficiency.
Several repair operators are designed to ensure that each structure generated during the evolution is a valid DAG and does not violate the prior knowledge.

Thirdly, this dissertation proposes a Hybrid Computational Intelligence approach to learning the parameters of a Directed Graphical Model. First, an artificial neural network is used to represent the local conditional probability distribution between a node and its parents. This representation not only avoids the need for prior parameters during parameter learning, but also avoids the local independence assumption between parameters. Then, an evolutionary algorithm is designed to train the neural network; the parameters of the Directed Graphical Model can be induced from the trained neural network associated with each node.

Following that, the problem of model explanation is discussed. Several approaches are proposed for interpreting the probabilistic dependency relation and conditional independence relation in natural language.

Finally, the design and implementation of a prototype system, Dr. Miner, are described.
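
The abstract does not say how the repair operators work. As a rough illustration only, the sketch below shows one way such an operator could turn a candidate structure into a valid DAG that respects prior knowledge. It assumes the prior knowledge is given as a set of forbidden edges and that cycles are broken by deleting a back edge found by depth-first search. The function names `find_back_edge` and `repair_structure` and the `forbidden` parameter are hypothetical and are not taken from the dissertation.

```python
from typing import Iterable, List, Optional, Sequence, Set, Tuple

Edge = Tuple[str, str]  # (parent, child)


def find_back_edge(nodes: Sequence[str], edges: Iterable[Edge]) -> Optional[Edge]:
    """Return one edge that closes a directed cycle, or None if the graph is acyclic.

    Every edge endpoint is assumed to appear in `nodes`.
    """
    children = {n: [] for n in nodes}
    for parent, child in edges:
        children[parent].append(child)

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = {n: WHITE for n in nodes}

    def visit(u: str) -> Optional[Edge]:
        color[u] = GRAY
        for v in children[u]:
            if color[v] == GRAY:
                return (u, v)  # v is an ancestor of u, so u -> v closes a cycle
            if color[v] == WHITE:
                found = visit(v)
                if found is not None:
                    return found
        color[u] = BLACK
        return None

    for n in nodes:
        if color[n] == WHITE:
            found = visit(n)
            if found is not None:
                return found
    return None


def repair_structure(
    nodes: Sequence[str],
    edges: Iterable[Edge],
    forbidden: Set[Edge] = frozenset(),
) -> List[Edge]:
    """Hypothetical repair operator: drop forbidden edges, then break cycles.

    Assumption: prior knowledge is expressed only as forbidden edges.
    """
    # Remove duplicates (keeping order) and edges the prior knowledge rules out.
    repaired = [e for e in dict.fromkeys(edges) if e not in forbidden]
    # Each pass deletes one edge, so the loop ends after at most len(repaired) passes.
    while True:
        back = find_back_edge(nodes, repaired)
        if back is None:
            return repaired
        repaired.remove(back)


if __name__ == "__main__":
    candidate = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "C")]
    print(repair_structure(["A", "B", "C"], candidate, forbidden={("A", "C")}))
```

Deleting the back edge is only one possible design choice; a repair operator could equally reverse the edge or pick the edge to delete using the scoring metric.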
Keywords/Search Tags: Graphical Model, Directed Graphical Model, Knowledge Discovery, Probabilistic Dependency Relation, Computational Intelligence