Font Size: a A A

Research On Algorithms For Relational Data Classification

Posted on:2010-11-24Degree:MasterType:Thesis
Country:ChinaCandidate:Z SuFull Text:PDF
GTID:2178360302959423Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Typical data mining approaches look for patterns in a single relation. Whereas in the real world most data are stored in multiple tables and managed by relational database systems. As transferring data from multiple tables into a single one usually causes many problems, development of multi-relational mining algorithms become important and attracts many researchers'interests. Mining data that consists of complex and structured objects also falls into the scope of this field. In addition, classification is one of the most important tasks in data mining, therefore one has witnessed numerous studies targeting on solving relational data classification problems. While the existing classification algorithms are neither accurate nor efficient. In this work, we aimed at solving these problems and proposed two novel solutions to the task of classification for relational data.At first, to improve the Graph-NB algorithm's accuracy (which is a famous multi-relational classification algorithm), a new algorithm(ASNBC) is proposed,this algorithm has several innovations. Firstly, defining a new structure, namely the ESRG ( expanded semantic relationship graph ), which can not only describe the relationship between tables, but can also detailed what attributes in a table have influence on the classification task; secondly according different influence that had a table to the classification task, ASNBC algorithm defined three kinds of table and using different methods to select the table's attributes to build the ESRG. Experimental study on real world databases had shown its high efficiency and accuracy.The second, to use the advantage of the Neural Networks, such as Robustness and accuracy, a new algorithm RNC was proposed, which upgraded the Neural Networks to deal with the multi-relational 0-1 classification tasks, the RNC algorithm also has several innovations. Firstly, used the relational database's schema as an important prior knowledge to help constructing the Neural Networks, and in this way the new Neural Network's structure is very simple and easy comprehensive; Secondly given a score mechanism to weigh the degree how a target tuple belong to the positive class, Our comprehensive experiments on real databases demonstrate the accuracy of RNC.
Keywords/Search Tags:Data mining, Relational data, Classification algorithm, Bayesian classification, Neural Networks
PDF Full Text Request
Related items