Research On Dynamic Attribute Reduction Under Dominance-based Rough Sets Under Spark Framework

Posted on: 2020-11-25  Degree: Master  Type: Thesis
Country: China  Candidate: L Z Yang  Full Text: PDF
GTID: 2428330590496530  Subject: Software engineering
Abstract/Summary:
The main goal of Knowledge Discovery in Databases (KDD) is to extract valuable knowledge from an original knowledge base through a series of steps such as transformation and extraction. Attribute selection (also called feature selection or feature reduction) is an effective way to reduce data dimensionality. It plays an important role in data preprocessing and has been widely used in image processing, text classification, speech recognition, etc. In recent years, the rapid development of information technology has brought unprecedented opportunities and challenges to various fields, and data sharing has made data types more varied. How to extract valuable knowledge from such high-dimensional, complex and dynamic data has become an important research topic. This thesis takes the efficient computation of attribute reducts in dominance-based rough sets as its starting point, and then studies dynamic attribute reduction in dominance-based rough sets when the objects, attributes and attribute values of a dominance information system change dynamically. The main research contents of this thesis are as follows:

1. The case of varying objects is discussed. The rule by which the dominating sets change when a single object is added to the dominance information system is described, and a proposition for dynamically updating the attribute reduction is given. An algorithm for updating the attribute reduction in dominance-based rough sets is designed for the case where a single object is added to the information system. Experiments were executed on data sets downloaded from the UCI Machine Learning Repository. The experimental results show that the incremental algorithm improves on the non-incremental algorithm in both computational efficiency and classification accuracy. (Chapter 3)

2. The case of varying attributes is discussed. The law governing the change of an object's dominance relation when a single attribute is added to the dominance information system is described. A maintenance method based on a binary coding matrix is proposed. An algorithm for dynamically updating attribute reductions is designed for the case where a single attribute is added. Experimental results verify the effectiveness of the proposed incremental algorithm. (Chapter 4)

3. The case of varying attribute values is discussed. The influence of attribute values on the dominance relation is discussed for the case where multiple attribute values in the dominance information system change. A character combination coding method is used to encode the dominance relation of objects. A method for dynamically updating the character combination coding matrix is given. A dynamic attribute reduction algorithm based on the character combination coding matrix is designed for the case where attribute values change. Experiments on ten data sets are carried out to verify the effectiveness of the proposed algorithm. The experimental results show that the proposed incremental algorithm is superior to the non-incremental algorithm in both time consumption and classification performance. (Chapter 5)

4. The problem of parallel attribute reduction for dominance-based rough sets under the Spark framework is analyzed. Then, a parallel strategy for heuristic attribute reduction on big data is given. The experimental results show that Spark can effectively handle big data when using the heuristic attribute reduction strategy. (Chapter 6)...
Keywords/Search Tags:Attribute reduction, Dominance-based rough sets, Incremental learning, Parallel computing, Spark
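
To make the object-addition case in item 1 more concrete, the following is a minimal Python sketch of how dominating sets might be maintained incrementally when a single object is added. It is not the thesis's algorithm: the abstract gives no formal definitions, so the sketch assumes the standard dominance relation (object y dominates x when y is at least as large as x on every chosen attribute), and the names `dominates`, `dominating_sets` and `add_object` are hypothetical. It only updates the dominating sets and does not compute a reduct.

```python
from typing import Dict, List, Sequence, Set


def dominates(y: Sequence[float], x: Sequence[float], attrs: Sequence[int]) -> bool:
    """Assumed dominance relation: y is at least as large as x on every attribute in attrs."""
    return all(y[a] >= x[a] for a in attrs)


def dominating_sets(table: List[Sequence[float]], attrs: Sequence[int]) -> Dict[int, Set[int]]:
    """Non-incremental baseline: for each object, collect the objects that dominate it."""
    return {
        i: {j for j, y in enumerate(table) if dominates(y, x, attrs)}
        for i, x in enumerate(table)
    }


def add_object(
    table: List[Sequence[float]],
    dom: Dict[int, Set[int]],
    new_obj: Sequence[float],
    attrs: Sequence[int],
) -> Dict[int, Set[int]]:
    """Incremental update (hypothetical helper): append new_obj and patch dom in place.

    Existing dominating sets can only gain the new object, so each old object
    needs one comparison instead of a full recomputation.
    """
    new_id = len(table)
    table.append(new_obj)
    for i in range(new_id):
        if dominates(new_obj, table[i], attrs):
            dom[i].add(new_id)
    # The dominating set of the new object itself is computed from scratch.
    dom[new_id] = {j for j in range(new_id + 1) if dominates(table[j], new_obj, attrs)}
    return dom
```

Under these assumptions, adding one object costs a single pass over the existing objects, whereas `dominating_sets` compares every pair again. This is the kind of saving an incremental approach aims for, though the thesis's actual update rules may differ.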