
Research On Approaches For Dynamic Knowledge Acquisition From Incomplete Data

Posted on: 2016-03-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Luo
Full Text: PDF
GTID: 1228330461474259
Subject: Computer application technology
Abstract/Summary:
Emerging information technologies and application patterns in the modern information society are growing at an astonishing speed, which has brought about the era of Big Data. Exploring efficient and effective data mining and knowledge discovery methods for handling information-rich Big Data has become an important research topic in the information sciences. Data in a Big Data environment is often fraudulent and incomplete, which introduces uncertain risks into data analysis and modeling. Moreover, the collection and analysis of Big Data is a dynamic process of continuous optimization. The size of data increases at an unprecedented rate, which places increasingly high demands on the timeliness of information processing in a Big Data environment. Granular computing is a new computing paradigm in the realm of computational intelligence that offers a multi-view model for describing and processing uncertain, imprecise, and incomplete information. By selecting a suitable granule and converting between granularities in a complex problem space, granular computing provides a granule-based approximate solution for mining massive data. As an important granular computing model for approximation problems with uncertain data, rough set theory describes an unknown target concept in terms of already known concepts, without any additional information about the data. Based on the theories of granular computing and rough sets, this dissertation focuses on developing approaches for dynamic knowledge acquisition from incomplete data. The main research work and innovations are as follows:

(1) Dynamic probabilistic rough sets for mining incomplete data with lost values and "do not care" conditions are proposed. The different updating patterns of knowledge granularity and the target decision are investigated for the addition and the removal of data objects, respectively. Incremental methods for estimating the conditional probability and updating probabilistic approximations in a probabilistic rough set model are introduced. Based on the proposed updating principles, efficient incremental algorithms for dynamically updating probabilistic approximations from incomplete data are designed.

(2) Considering the addition and removal of data objects in set-valued decision systems, dynamic dominance-based rough set approaches for updating approximations are investigated. For the different kinds of newly added or removed data objects, the dynamic properties for updating the rough approximations of upward and downward unions of decision classes are analyzed. Two efficient incremental algorithms for updating rough approximations in set-valued decision systems are developed accordingly.

(3) A dominance matrix for set-valued information systems is constructed to express the preference relation between objects intuitively. On the basis of a Boolean column-matrix representation of the object sets in an information system, a matrix-based approach for constructing rough approximations is presented. Considering dynamic attribute generalization in set-valued information systems, the dynamic properties for updating the dominance matrix are first analyzed for the cases in which new attributes arrive and old attributes are forgotten. Incremental updating mechanisms for the product matrix and the induced matrix used in constructing rough approximations are then proposed. Finally, matrix-based incremental algorithms for updating rough approximations in set-valued information systems under attribute generalization are proposed.

(4) Motivated by the need to update data values as decision criteria change dynamically, a dynamic rough set approach is proposed for set-valued decision systems with varying data values. Incremental mechanisms for updating the elementary granules in dominance-based rough sets are presented, based on an analysis of the dynamic properties of the preference relations between data objects. Furthermore, two incremental algorithms for computing rough approximations are proposed, corresponding to the addition and the removal of data values, respectively.

In this dissertation, based on the theory of granular computing and incremental learning techniques, several efficient rough set approaches are introduced for mining incomplete data under different data updating situations. Theoretical analysis and experimental evaluations carried out on both artificial and real-world data sets illustrate the effectiveness and efficiency of the proposed approaches. This study extends the research field of granular computing and rough set theory and further enriches the techniques of dynamic data mining, which is valuable for the analysis of Big Data.
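
The idea behind item (1) can be illustrated with a minimal sketch. It is not the dissertation's algorithm. It assumes the characteristic-set reading of incomplete data that is common in the rough set literature, in which a lost value ("?") matches nothing and a "do not care" condition ("*") matches any value, and it uses the singleton form of the probabilistic approximation, {x : Pr(X | K(x)) >= alpha}. All names (`in_characteristic_set`, `IncrementalApproximation`, `batch_approximation`) and the toy data are hypothetical.

```python
from fractions import Fraction

LOST, DONT_CARE = "?", "*"  # assumed markers for the two kinds of missing value


def in_characteristic_set(x_row, y_row):
    """Return True if object y belongs to the characteristic set K(x)."""
    for xv, yv in zip(x_row, y_row):
        if xv in (LOST, DONT_CARE):
            continue  # x places no constraint on this attribute
        if yv == DONT_CARE:
            continue  # "do not care" is compatible with any specified value
        if yv != xv:  # also rejects a lost value in y
            return False
    return True


class IncrementalApproximation:
    """Keeps |K(x)| and |K(x) & X| per object so that adding an object
    only adjusts counters instead of recomputing every K(x)."""

    def __init__(self, alpha):
        self.alpha = Fraction(alpha)
        self.rows, self.in_target = [], []
        self.size, self.hits = [], []  # |K(x)| and |K(x) & X|

    def add(self, row, is_target):
        row = tuple(row)
        size, hits = 1, int(is_target)  # a new object is always in its own K
        for i, other in enumerate(self.rows):
            if in_characteristic_set(other, row):  # new object joins K(other)
                self.size[i] += 1
                self.hits[i] += int(is_target)
            if in_characteristic_set(row, other):  # other joins K(new)
                size += 1
                hits += int(self.in_target[i])
        self.rows.append(row)
        self.in_target.append(bool(is_target))
        self.size.append(size)
        self.hits.append(hits)

    def approximation(self):
        """Indices x with Pr(X | K(x)) >= alpha."""
        return [i for i in range(len(self.rows))
                if Fraction(self.hits[i], self.size[i]) >= self.alpha]


def batch_approximation(rows, in_target, alpha):
    """Non-incremental reference: recompute every K(x) from scratch."""
    result = []
    for i, x in enumerate(rows):
        members = [j for j, y in enumerate(rows) if in_characteristic_set(x, y)]
        hits = sum(bool(in_target[j]) for j in members)
        if Fraction(hits, len(members)) >= Fraction(alpha):
            result.append(i)
    return result


# Toy data: two condition attributes, target concept X = objects 0 and 1.
rows = [("high", "yes"), ("high", "?"), ("*", "no"), ("low", "no")]
target = [True, True, False, False]

model = IncrementalApproximation("1/2")
for r, t in zip(rows, target):
    model.add(r, t)
print(model.approximation())  # [0, 1]
```

Each addition costs one pass over the existing objects, whereas the batch version compares every pair again. Removing an object would decrement the same counters; that case is omitted here for brevity.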
Keywords/Search Tags: Knowledge Discovery, Granular Computing, Rough Set, Incremental Updating, Incomplete Data