
Research On Attribute Reduction Of Distributed Information System Based On Rough Sets

Posted on: 2018-06-24
Degree: Master
Type: Thesis
Country: China
Candidate: K Wang
Full Text: PDF
GTID: 2348330569986433
Subject: Computer Science and Technology
Abstract/Summary:
In practice, data are often distributed across multiple databases in a network. Traditional data processing methods require the data to be centralized before they can be processed efficiently, but in many real-world problems centralization is constrained by factors such as data scale and privacy preservation. How to process distributed data effectively without centralizing them is therefore an active topic in data mining. Attribute reduction is an important research topic in data mining: it removes redundant or unimportant attributes and speeds up subsequent data processing. Attribute reduction of centralized data has been studied extensively, and a relatively complete body of results exists. For distributed symbolic data, researchers have also proposed theories and methods of attribute reduction that maintain the classification performance of the system before and after reduction. However, attribute reduction of distributed continuous-valued data and of distributed incomplete data has seldom been studied. This thesis studies attribute reduction of distributed continuous-valued decision information systems and distributed incomplete decision information systems. The main contributions are as follows:

1. Attribute reduction of distributed continuous-valued data is studied. First, the neighborhood rough set is defined for distributed continuous-valued decision information systems. Then, under the precondition that the positive region of the system remains unchanged, the reducibility of attributes in distributed continuous-valued decision information systems is discussed, and an attribute reduction algorithm for such systems is proposed. The experimental results show that the algorithm effectively removes redundant attributes from distributed continuous-valued data, and that the classification ability of the reduced data is the same as, or even better than, that of the data before reduction.

2. Attribute reduction of distributed incomplete data is studied. First, the rough set is defined for distributed incomplete information systems based on the tolerance relation and the asymmetric similarity relation. Then, under the precondition that the positive region of the distributed incomplete decision information system remains unchanged, the reducibility of attributes in such systems is discussed, and an attribute reduction algorithm for distributed incomplete decision information systems is proposed. The experimental results show that the algorithm effectively removes redundant attributes from distributed incomplete data, with little difference in the integrated classification ability of the incomplete distributed data before and after reduction. In addition, changes in the data missing rate strongly affect attribute reduction under the tolerance relation but only weakly affect attribute reduction under the asymmetric similarity relation.
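The positive-region criterion that both contributions rely on can be illustrated on a single, centralized table. The following is a minimal sketch under stated assumptions, not the thesis's distributed algorithm, which the abstract does not specify. It uses the common textbook definitions: an object is in the positive region if every object in its neighborhood (continuous data, Euclidean distance at most `delta`) or in its tolerance class (incomplete data, values equal or missing) shares its decision label. The function names, the `MISSING` marker, the `delta` value, and the toy data are all hypothetical.

```python
from math import dist

MISSING = None  # hypothetical marker for a missing attribute value


def neighborhood_positive_region(rows, labels, attrs, delta):
    """Objects whose delta-neighborhood over `attrs` is label-consistent."""
    positive = set()
    for i, x in enumerate(rows):
        xi = [x[a] for a in attrs]
        if all(labels[j] == labels[i]
               for j, y in enumerate(rows)
               if dist(xi, [y[a] for a in attrs]) <= delta):
            positive.add(i)
    return positive


def tolerant(x, y, attrs):
    """Tolerance relation: values agree or at least one is missing."""
    return all(x[a] is MISSING or y[a] is MISSING or x[a] == y[a]
               for a in attrs)


def tolerance_positive_region(rows, labels, attrs):
    """Objects whose tolerance class over `attrs` is label-consistent."""
    return {i for i, x in enumerate(rows)
            if all(labels[j] == labels[i]
                   for j, y in enumerate(rows) if tolerant(x, y, attrs))}


def redundant_attributes(positive_region, all_attrs):
    """Attributes whose individual removal keeps the positive region unchanged."""
    full = positive_region(all_attrs)
    return [a for a in all_attrs
            if positive_region([b for b in all_attrs if b != a]) == full]


# Toy continuous-valued table: attribute 0 separates the classes on its own.
rows = [(0.10, 0.50), (0.15, 0.90), (0.80, 0.52), (0.85, 0.10)]
labels = ["a", "a", "b", "b"]
removable = redundant_attributes(
    lambda attrs: neighborhood_positive_region(rows, labels, attrs, 0.2),
    [0, 1],
)

# Toy incomplete table: the missing value makes rows 0 and 1 indistinguishable.
rows_inc = [("x", "p"), ("x", MISSING), ("y", "q")]
labels_inc = ["a", "b", "b"]
pos_inc = tolerance_positive_region(rows_inc, labels_inc, [0, 1])
```

In this toy data, dropping attribute 1 leaves the neighborhood positive region unchanged, so it is reported as removable, while the missing value in the incomplete table pulls two differently labeled objects into the same tolerance class and shrinks the positive region. This is consistent with the abstract's observation that the missing rate affects reduction under the tolerance relation.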
Keywords/Search Tags: distributed data, continuous-valued, incomplete information system, data missing rate, attribute reduction