
Research On Structural Methods Of Knowledge Discovery

Posted on: 2004-07-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: T Wu
Full Text: PDF
GTID: 1118360092486458
Subject: Computer applications
Abstract/Summary:
With the development of science and network technology, the capacity to generate and collect data has grown rapidly, so knowledge discovery from data sets with huge numbers of samples has become an important task in artificial intelligence. Decision trees, neural networks and Bayesian networks are the main tools of knowledge discovery in databases (KDD). Traditional neural networks cannot meet KDD's requirement that information be supplied promptly, because they process data slowly and their structure and parameters are difficult to define and estimate. Professor Zhang Ling et al. proposed a structural method for machine learning that designs networks from spherical domains covering the training samples. This kind of classifier is efficient for data sets with huge numbers of samples. The author studies this structural method in combination with rough set theory and SVM. The main work and results are as follows:

(1) The structural machine learning method based on covering domains designs networks directly from the sample data; its efficiency makes it suitable for data sets with many classes and huge numbers of samples. The author analyzes the algorithm and proposes strategies for improving the construction of covering domains, the power function and the distance function. Experiments show that these improvements raise the performance of domain-covering neural networks.

(2) Because the selection of samples and the learning sequence strongly affect the performance of covering networks, the thesis gives three sequence covering methods. Although these learning orders are not optimal, in the experiments the accuracy of networks designed with sequence learning is above the average accuracy of networks designed with random-order learning. Algorithms for incremental covering learning and for pruning covering domains based on ordered learning are also proposed.
These algorithms effectively reduce the number of covering domains and improve classification accuracy.

(3) Existing databases use large numbers of attributes to describe objects, and which attributes are relevant is not known in advance. Attributes therefore need to be selected for a classifier to perform well. Rough set theory provides an important tool for feature selection. An artificial neural network (RCSN) that combines rough set theory with the covering design algorithm is introduced: it reduces the condition attributes using rough set theory and designs the structure of the neural network with the covering design algorithm. An example shows that the algorithm maintains classification accuracy while cutting memory use and the cost of data collection. A framework of feature weighting for the covering algorithm is also proposed.

(4) The support vector machine (SVM), grounded in statistical learning theory, maps samples into a higher-dimensional space and constructs an optimal hyperplane to separate two classes of samples; it has strong generalization and extension ability. The resemblance between SVM and the covering algorithm is analyzed, and a structural learning algorithm based on sphere covering in the characteristic space is put forward. Experiments show that this kind of network has the strengths of both the covering design algorithm and SVM; the existence of the hyperplane is proved during the training process. Guided by quotient space theory, the author puts forward the notion and the algorithm of covering-domain fusion, which binds SVM and the covering algorithm together. The fusion algorithm not only simplifies the solution of SVM and improves the performance of the covering algorithm but also provides a theoretical foundation for the covering algorithm.

(5) A dual representation of machine learning for classification is introduced by mapping the samples and the hyperplane, respectively, into their version space.
Using the dual representation, a classification problem solved in the original feature space is transformed into one solved in the characteristic space, and the classification problem of a given sample set is transformed into one that is minimal reduc...
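
The idea of designing a classifier from spherical domains that cover the training samples can be illustrated with a minimal Python sketch. It is not the thesis's algorithm. The function names `fit_sphere_cover` and `predict`, the use of Euclidean distance in the input space, the midpoint radius rule, and visiting samples in their given order are all assumptions made for illustration only.

```python
import math

def fit_sphere_cover(samples, labels):
    """Greedily cover the training samples with single-class spheres.

    Assumed rule (illustrative, not taken from the thesis): each sphere is
    centered on an uncovered sample; its radius is the midpoint between the
    farthest same-class sample that is closer than every other-class sample
    and the nearest other-class sample.
    """
    covered = [False] * len(samples)
    spheres = []  # list of (center, radius, label)
    for i, (center, label) in enumerate(zip(samples, labels)):
        if covered[i]:
            continue
        dist = [math.dist(center, s) for s in samples]
        # Distance to the nearest sample of a different class.
        d_other = min((d for d, lab in zip(dist, labels) if lab != label),
                      default=math.inf)
        # Farthest same-class sample still closer than any other-class sample.
        d_same = max((d for d, lab in zip(dist, labels)
                      if lab == label and d < d_other), default=0.0)
        radius = d_same if math.isinf(d_other) else (d_same + d_other) / 2
        # Mark every same-class sample inside the sphere as covered.
        for j, (d, lab) in enumerate(zip(dist, labels)):
            if lab == label and d <= radius:
                covered[j] = True
        spheres.append((center, radius, label))
    return spheres

def predict(spheres, x):
    """Return the label of the sphere whose surface is nearest to x
    (the gap is negative when x lies inside the sphere)."""
    center, radius, label = min(
        spheres, key=lambda s: math.dist(s[0], x) - s[1])
    return label

# Usage on two well-separated toy clusters (made-up data).
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = [0, 0, 0, 1, 1, 1]
spheres = fit_sphere_cover(train_x, train_y)
print(len(spheres), predict(spheres, (0.2, 0.3)), predict(spheres, (5.5, 5.5)))
```

Because each sphere is built from whichever uncovered sample comes first, reordering `train_x` can change the number and size of the spheres, which is one way to see why the abstract stresses the learning sequence.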
Keywords/Search Tags: knowledge discovery, structural machine learning, covering algorithm, feature selection, kernel function, fusion, dual algorithm