
Research On Approaches For Knowledge Discovery From Dynamic Ordered Data

Posted on: 2020-11-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S Wang
Full Text: PDF
GTID: 1488306473972089
Subject: Computer Science and Technology
Abstract/Summary:
In the era of big data, the knowledge hidden in data of all kinds helps to solve real-world problems. Ordered data, one important kind of data, is widespread in practical applications. Massive ordered data often varies rapidly with time, which makes accurate and efficient knowledge discovery and updating highly challenging. Moreover, for rapidly varying ordered data from multiple data sources, the obtained knowledge must also be updated and fused efficiently. Knowledge discovery from dynamic ordered data is therefore a key step and an important field in data mining. Massive dynamic data is usually complex and uncertain, and granular computing theory and rough set theory are effective tools for dealing with complex and uncertain problems. Granular computing can decompose a complex problem into layer-wise steps that are easier to solve, which suits the complexity of massive dynamic ordered data. Rough set theory can process vague, inaccurate or uncertain problems using only the existing data, without any prior knowledge, which suits the uncertainty of massive dynamic ordered data. This dissertation focuses on single-source and multi-source ordered data with multi-dimensional variations. Based on granular computing and rough set theories, it studies approaches for efficiently obtaining and updating knowledge in such data by employing and improving the dominance-based rough set model and by combining set methods, matrix methods and incremental learning. The contributions are as follows:

(1) When objects and attributes increase simultaneously in an ordered information system, the computation of the rough approximations of the dominance-based rough set is simplified by improving the definition of the P-generalized decision. To avoid repeated comparisons between old attributes, the notion of the dominance feature matrix is defined. A model for dynamically updating rough approximations is then built on the improved P-generalized decision and the dominance feature matrix. An incremental approach for dynamically updating rough approximations is proposed and the corresponding algorithm is designed. In addition, for practical use of the algorithm, a storage strategy for the matrix is presented to reduce memory consumption.

(2) When the object set and attribute values vary simultaneously in an ordered information system, the properties of the P-generalized decision are analyzed and revealed. Two novel notions are then defined: the P-generalized decision upper domain and the P-generalized decision lower domain. These two notions reflect the actual dominance dependency between objects, so that unnecessary comparisons between a considerable number of objects are avoided and the efficiency of computing rough approximations is significantly improved. For the simultaneous variation of the object set and attribute values, two approaches for efficiently updating rough approximations are proposed, based on the P-generalized decision upper domain and lower domain respectively. These two approaches apply not only to the simultaneous variation of the object set and attribute values, but also to the variation of the object set alone.

(3) When attributes increase and attribute values vary simultaneously in an ordered information system, a strategy that combines the dominance feature matrix with the P-generalized decision domains is proposed. To make the dominance feature matrix obtained from the P-generalized decision domains dominance symmetrical, the definitions of the P-generalized decision upper domain and lower domain are improved. In this combined strategy, the P-generalized decision domains are used to obtain the dominance feature matrix by comparing only the limited set of objects that are actually dominance dependent. The dominance feature matrix is then used to avoid repeated comparisons between old attributes. As a result, a model for dynamically updating rough approximations is built, and an efficient approach for updating rough approximations is proposed. This approach applies not only to simultaneously increased attributes and varied attribute values, but also to increased attributes alone or varied attribute values alone.

(4) For single-source dynamic ordered data with simultaneously increased objects and varied attribute values, a parallel approach based on granularity decomposition is proposed to efficiently update rough approximations. In this approach, a single-source dynamic ordered information system is divided into multiple Basic Ordered Information Granules (BOIGs), and the P-generalized decision is combined with the BOIGs. The P-generalized decision is then updated efficiently by updating the P-generalized decision Local Information Granules (PLIGs) and the P-generalized decision Mutual Information Granules (PMIGs) in parallel and by fusing these two kinds of information granules in parallel, which yields the updated rough approximations. This parallel approach can be employed directly in a multi-source dynamic ordered information system; that is, it is also suitable for processing multi-source dynamic ordered data with simultaneously increased objects and varied attribute values.

In this dissertation, efficient mechanisms for discovering and updating knowledge in single-source and multi-source dynamic ordered data with different two-dimensional variations are systematically studied, and the corresponding approaches are proposed. Moreover, a parallel architecture for discovering and updating knowledge in single-source and multi-source dynamic ordered data with two-dimensional variation is introduced. The efficiency of the proposed approaches and their corresponding algorithms is demonstrated by a number of experiments on UCI datasets and artificial datasets. Based on the dominance-based rough set model, the work in this dissertation builds architectures for updating knowledge in massive dynamic ordered data with multi-dimensional variations and provides efficient approaches for updating that knowledge. Furthermore, a mechanism for fusing knowledge in multi-source massive dynamic ordered data is established, and an approach for fusing knowledge is provided. In addition, the improved definitions and the revealed properties in this dissertation can provide convenient methodologies and effective tools for other research and applications based on the dominance-based rough set model.
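
To make the central notions concrete, the following is a minimal sketch of the P-generalized decision and the rough approximations it induces, together with a simple in-place update when one object is added. It uses the standard textbook definition from the dominance-based rough set literature, not the improved definition proposed in the dissertation, which the abstract does not spell out. All criteria are assumed to be gain-type (larger is better), and the function names and the toy data are hypothetical.

```python
from typing import Dict, Sequence, Set, Tuple

Values = Dict[str, Sequence[int]]      # object -> values on the criteria in P
Decisions = Dict[str, int]             # object -> ordered decision class
Delta = Dict[str, Tuple[int, int]]     # object -> (l_P, u_P)


def dominates(a: Sequence[int], b: Sequence[int]) -> bool:
    """True if a is at least as good as b on every criterion (gain-type assumed)."""
    return all(x >= y for x, y in zip(a, b))


def generalized_decision(values: Values, decisions: Decisions) -> Delta:
    """Batch computation of the textbook P-generalized decision.

    l_P(x) is the lowest class among objects dominating x;
    u_P(x) is the highest class among objects dominated by x.
    """
    delta: Delta = {}
    for x, vx in values.items():
        lower = min(decisions[y] for y, vy in values.items() if dominates(vy, vx))
        upper = max(decisions[y] for y, vy in values.items() if dominates(vx, vy))
        delta[x] = (lower, upper)
    return delta


def add_object(values: Values, decisions: Decisions, delta: Delta,
               name: str, vz: Sequence[int], dz: int) -> None:
    """Insert one object and patch delta in place with a single pass."""
    low, up = dz, dz                       # the new object relates to itself
    for x, vx in values.items():
        l, u = delta[x]
        if dominates(vz, vx):              # new object dominates x
            l = min(l, dz)
            up = max(up, decisions[x])
        if dominates(vx, vz):              # x dominates the new object
            u = max(u, dz)
            low = min(low, decisions[x])
        delta[x] = (l, u)
    values[name], decisions[name], delta[name] = vz, dz, (low, up)


def upward_union_approximations(delta: Delta, t: int) -> Tuple[Set[str], Set[str]]:
    """Lower and upper approximation of the union of classes 'at least t'."""
    lower = {x for x, (l, _) in delta.items() if l >= t}
    upper = {x for x, (_, u) in delta.items() if u >= t}
    return lower, upper


# Toy data: x3 will dominate x2 but carry a worse class, an inconsistency.
values: Values = {"x1": (1, 1), "x2": (2, 2), "x4": (3, 3)}
decisions: Decisions = {"x1": 1, "x2": 3, "x4": 3}
delta = generalized_decision(values, decisions)
add_object(values, decisions, delta, "x3", (3, 2), 2)
print(delta)
print(upward_union_approximations(delta, 3))
```

In this toy example the patched result agrees with a full recomputation, while the update touches each existing object only once. The approaches summarized in the abstract go further, for instance by caching attribute comparisons in a dominance feature matrix and by restricting comparisons to dominance-dependent objects; this sketch does not attempt either.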
Keywords/Search Tags: Knowledge Discovery, Granular Computing, Rough Set, Dominance Relation, Ordered Data