Research on Classification Approaches Based on Multi-Relational Transformation

Posted on: 2013-12-01
Degree: Master
Type: Thesis
Country: China
Candidate: B Zhang
GTID: 2248330377960737
Subject: Computer application technology
Abstract/Summary:
The rapid development of information technology has brought enormous growth in data. Mining valuable, interesting, and meaningful knowledge for users is the central task of data mining, and it attracts a growing number of researchers. Classification is one of the most important topics in data mining and is applied to financial decision making, medical research, and other areas.

Structured data is the main object of classification, and it mainly comes from the relational data stored in real-world applications. Relational data obtained from a relational database is a more natural representation than “flat” data. However, traditional classifiers can only be applied to data represented in a single, “flat” relational form. A bridge is therefore needed between multi-relational data and single-relation data, so that traditional classifiers can be applied to the single-table data produced by the transformation and make effective predictions. Based on multi-relational transformation, we construct a multi-relational transformation model and design two algorithms, IWT and MRT. The algorithms use different strategies to improve efficiency, address statistical bias and related problems, and achieve better predictive performance.

The main contributions of this dissertation are as follows:

(1) Constructing a highly efficient link path. We rebuild the link path by analyzing the relationships among the attributes of the relations that link to each other, and we use breadth-first search to traverse the link path. The new link path is more efficient.

(2) Constructing a relation selection model. From the global view, multi-relational data has only one target relation, but the relational database contains a large number of background relations. However, the attributes that are meaningful for classification or for the user's decision making are not distributed across every background table. A selection model is therefore necessary in the transformation to improve mining efficiency and remove redundant relations.

(3) Constructing a feature selection function for multi-relational data. From the local view, not all attributes are of interest to the user or important for distinguishing the class labels. Feature selection based on multi-relational data therefore helps improve efficiency and predictive performance.

(4) Analyzing and handling statistical bias in the transformation process. The one-to-many and many-to-one relationships among the tuples of relations, together with the null values that arise during transformation, are the root cause of inconsistent estimates of attribute importance before and after the relations are converted. The instance weighting transformation and tuple transformation strategies are therefore proposed to keep attribute importance consistent before and after transformation. Our comprehensive experiments demonstrate that, by preserving this consistency, our methods achieve good predictive performance.
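
Contributions (1) and (4) can be made concrete with a small sketch. The abstract does not give the details of IWT or MRT, so the code below is only an illustration under stated assumptions: the schema graph, the relation names (loan, account, district, transaction), the toy tuples, and the 1/k weighting rule are all hypothetical and are not taken from the thesis. It shows a breadth-first search that finds the shortest join path from the target relation to each background relation, and how a one-to-many join skews the class distribution of the flattened table unless each joined row is down-weighted.

```python
from collections import deque
from fractions import Fraction

# Hypothetical schema graph: each relation lists the relations it can be joined with.
SCHEMA = {
    "loan": ["account"],
    "account": ["loan", "district", "transaction"],
    "district": ["account"],
    "transaction": ["account"],
}


def link_paths(schema, target):
    """Breadth-first search from the target relation.

    Returns, for every reachable relation, the shortest join path from the target.
    """
    paths = {target: [target]}
    queue = deque([target])
    while queue:
        relation = queue.popleft()
        for neighbor in schema[relation]:
            if neighbor not in paths:  # the first visit in BFS is a shortest path
                paths[neighbor] = paths[relation] + [neighbor]
                queue.append(neighbor)
    return paths


def flatten(targets, background, key):
    """Join target tuples with background tuples into one flat table.

    Each joined row gets weight 1/k, where k is the number of background
    matches of its target tuple (an assumed weighting rule, not the thesis's).
    """
    rows = []
    for t in targets:
        matches = [b for b in background if b[key] == t[key]]
        if not matches:
            rows.append((dict(t), Fraction(1)))  # unmatched target tuple is kept once
            continue
        weight = Fraction(1, len(matches))
        for b in matches:
            rows.append(({**t, **b}, weight))
    return rows


def class_share(rows, label, weighted):
    """Share of the (optionally weighted) flat rows that carries each class label."""
    totals = {}
    for row, weight in rows:
        w = weight if weighted else Fraction(1)
        totals[row[label]] = totals.get(row[label], Fraction(0)) + w
    grand_total = sum(totals.values())
    return {cls: share / grand_total for cls, share in totals.items()}


# Toy data: two loans, one per class; account A has three transactions, B has one.
loans = [
    {"account_id": "A", "status": "good"},
    {"account_id": "B", "status": "bad"},
]
transactions = [
    {"account_id": "A", "amount": 100},
    {"account_id": "A", "amount": 250},
    {"account_id": "A", "amount": 75},
    {"account_id": "B", "amount": 900},
]

paths = link_paths(SCHEMA, "loan")
flat = flatten(loans, transactions, "account_id")
unweighted = class_share(flat, "status", weighted=False)
weighted = class_share(flat, "status", weighted=True)
```

In this toy data the target relation holds one loan per class, yet after the plain join the "good" class accounts for three quarters of the flat rows, because its account has three transactions. With the assumed 1/k weights each class returns to a share of one half, which is the kind of before-and-after consistency that contribution (4) aims for.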
Keywords/Search Tags: Data Mining, Multi-Relational Data Mining, Classification, Feature Selection