Research On Concept Drift Detection And Rough Set Model Expansion

Posted on: 2019-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: K W Lu
Full Text: PDF
GTID: 2428330548987453
Subject: Computer Science and Technology
Abstract/Summary:
With the continuing development of internet technology, researchers have become increasingly interested in data exploration. "Concept drifting" is currently a popular research direction and is receiving growing attention from practitioners. Concept drift and many kinds of uncertain change occur in big data and data streams, and data heterogeneity is an inherent feature of big data. Most related research, however, is limited to homogeneous (same-structure) data. It does not consider drift in genuinely heterogeneous data (data with different condition attribute sets), and it rarely uses real experimental data to support its claims. Uncertainty analysis is one of the research focuses of rough set theory. Except for F-rough set models, the existing rough set models are almost all static or semi-static. Most of them can be applied to the study of various uncertainties, but they cannot handle the uncertainty brought about by change, and so cannot be used to detect concept drift.

The research content of this dissertation is as follows. On the one hand, it focuses on heterogeneous data, discusses the essence of attribute reduction in rough sets, and proposes a definition of concept attribute reduction that unifies Pawlak reduction, value reduction and parallel reduction. The properties of concept attribute reduction are studied, and attribute reduction methods and concept drift detection methods for heterogeneous data are proposed. Concept drift in heterogeneous data is also tested experimentally on real heterogeneous data. Theoretical analysis and experiments demonstrate the effectiveness of the proposed methods, which provides a new way for rough sets and granular computing to enter the era of big data.

On the other hand, the dissertation combines the basic ideas and properties of Rough Sets, Concept Drift, Granular Computing, Data Streams and F-Rough Sets. Based on the upper and lower approximations of Rough Sets, it gives a complete definition of Entire-Granulation Rough Sets, together with definitions of upper approximation concept drift, lower approximation concept drift, upper approximation concept coupling and lower approximation concept coupling. Their properties are discussed in detail, and the global changes of concepts in a knowledge system are analyzed. Using upper approximation clusters and lower approximation clusters as theoretical tools, the dissertation discusses the possible changes of concepts in a knowledge system, and describes graphically how concepts agree and differ in different situations. The Entire-Granulation positive region of a decision table is also proposed and its properties are studied. On the basis of Entire-Granulation rough sets, Entire-Granulation absolute reduction, Entire-Granulation reduction and Entire-Granulation Pawlak reduction are proposed and their properties are discussed. The advantages and disadvantages of attribute reduction are also discussed: attribute reduction makes the representation of a concept unusually simple, whereas redundant attributes make the representation of a concept richer and more diverse. The dissertation analyzes the local and global aspects of how humans understand the world from the perspective of cognitive theory, using rough set theory and other basic tools. Entire-Granulation rough sets can also reflect the complexity, diversity, inaccuracy, hierarchy and dynamics of human cognition in certain contexts. The proposed Entire-Granulation rough sets and their concept drift properties may offer some help in simulating intelligence.

The innovations of this dissertation can be summarized as follows:

1. It defines attribute reduction of a concept (concept cluster), unifying value reduction, Pawlak reduction and parallel reduction, so that attribute reduction can be applied not only to data with the same structure but also, easily, to heterogeneous data.
2. It proposes a concept drift detection method for heterogeneous data, defines mass concept drift and concept drift, and gives methods for calculating them.
3. It proposes the concept and representation of Entire-Granulation rough sets, depicts Entire-Granulation rough sets graphically, and studies concept drift in Entire-Granulation rough sets.
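
The upper and lower approximations that the abstract builds on can be illustrated with a small sketch. The Python below computes the standard Pawlak lower and upper approximations of a concept and then compares them across two tables that describe the same objects with different condition attribute sets. The tables, the attribute names and the Jaccard-distance comparison are assumptions made for illustration only; the abstract does not state the thesis's own definitions of upper or lower approximation concept drift, so this is not the author's method.

```python
from collections import defaultdict


def partition(table, attrs):
    """Group object ids into equivalence classes by their values on attrs."""
    blocks = defaultdict(set)
    for obj, row in table.items():
        blocks[tuple(row[a] for a in attrs)].add(obj)
    return list(blocks.values())


def approximations(table, attrs, concept):
    """Return the Pawlak (lower, upper) approximations of a concept."""
    lower, upper = set(), set()
    for block in partition(table, attrs):
        if block <= concept:   # block lies entirely inside the concept
            lower |= block
        if block & concept:    # block overlaps the concept
            upper |= block
    return lower, upper


def jaccard_distance(a, b):
    """Hypothetical drift score: 0.0 for identical sets, 1.0 for disjoint sets."""
    union = a | b
    return 0.0 if not union else 1 - len(a & b) / len(union)


# Hypothetical data: the same four objects under two different attribute sets.
table_a = {
    1: {"color": "red", "size": "S"},
    2: {"color": "red", "size": "S"},
    3: {"color": "blue", "size": "L"},
    4: {"color": "blue", "size": "S"},
}
table_b = {
    1: {"weight": "low"},
    2: {"weight": "low"},
    3: {"weight": "high"},
    4: {"weight": "high"},
}
concept = {1, 3}

lower_a, upper_a = approximations(table_a, ["color", "size"], concept)
lower_b, upper_b = approximations(table_b, ["weight"], concept)

# Assumed measure, not taken from the thesis: distance between approximations.
lower_drift = jaccard_distance(lower_a, lower_b)
upper_drift = jaccard_distance(upper_a, upper_b)
print(lower_a, upper_a, lower_b, upper_b, lower_drift, upper_drift)
```

In this toy example the coarser attribute set in `table_b` can no longer certainly classify any object, so the lower approximation shrinks to the empty set while the upper approximation grows to cover all four objects. Comparing approximations rather than raw attribute values is what lets the two tables have different condition attributes.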
Keywords/Search Tags: concept drifting, granular computing, entire-granulation rough sets, heterogeneous data, attribute reduction, upper approximation, lower approximation