
Granular Data Description For System Modeling And Data Mining

Posted on: 2019-04-29    Degree: Doctor    Type: Dissertation
Country: China    Candidate: X B Zhu    Full Text: PDF
GTID: 1368330575980683    Subject: Control theory and control engineering
Abstract/Summary:
As the data generated by the various services of the information era, such as network analysis, on-board service systems, medical services and e-commerce, keep growing, how to analyze and use these data to develop commercial service strategies, to provide users with a variety of suggestions, or to reveal the internal structure of the data has become a challenging problem. Various intelligent systems have been developed to deal with this challenge and have achieved good results. An excellent intelligent system should be able to conduct effective two-way communication with the user, seamlessly accept requests from humans, and convey the results in a transparent, meaningful format. Besides numeric data, such systems should also be able to cope with non-numeric evidence, such as users' opinions and judgments. They should therefore be built upon entities more abstract than plain numbers: information granules. The results are then conveyed to users in the form of information granules, which are more vivid than plain numbers. In response to these growing challenges, Granular Computing (GrC) emerged over a decade ago as a unified conceptual and processing framework.

This dissertation intends to establish a unified conceptual and algorithmic framework based on Granular Computing. To this end, a series of problems has to be solved, including the encoding and decoding of information granules, the construction of information granules, the optimal allocation of information granularity, and the construction of granular fuzzy models.

Information granules are generic building blocks supporting the processing realized in Granular Computing and facilitating communication with the environment. In this dissertation, we are first concerned with the fundamental problem of the encoding-decoding of information granules. The essence of the problem is as follows: given a finite collection of granular data X1, X2, ..., XN (e.g., sets, fuzzy sets), construct an optimal codebook composed of information granules A1, A2, ..., Ac, where typically c << N, so that any Xk represented in terms of the Ai's and then decoded (reconstructed) with the help of this codebook leads to the lowest decoding error. A fundamental result is established: in the proposed encoders and decoders, whenever an encoding-decoding error is present, the information granule obtained as a result of decoding is of a higher type than the original information granules (say, if Xk is an information granule of type-1, then its decoded version becomes an information granule of type-2). We develop the encoding and decoding mechanisms by engaging possibility theory and fuzzy relational calculus, and we design a performance index based on the distance between the bounds of the interval-valued membership function, which guides the codebook toward the lowest decoding error.
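To make the encoding-decoding idea concrete, a minimal Python sketch is given below. It is an illustration under simplifying assumptions (a finite discretized universe, triangular codebook elements, one particular possibility/necessity-based encoder and a Goedel-implication-style decoder; the helper names possibility, necessity, encode and decode are ours), not the dissertation's exact construction, yet it exhibits the reported elevation of type: the decoded result is an interval-valued (type-2) granule, and the gap between its bounds can serve as a simple performance index.

import numpy as np

def possibility(x, a):
    # Poss(X, A) = sup_u min(X(u), A(u)) on a finite universe
    return np.max(np.minimum(x, a))

def necessity(x, a):
    # Nec(X, A) = inf_u max(X(u), 1 - A(u))
    return np.min(np.maximum(x, 1.0 - a))

def encode(x, codebook):
    lam = np.array([possibility(x, a) for a in codebook])  # possibility degrees
    nu = np.array([necessity(x, a) for a in codebook])     # necessity degrees
    return lam, nu

def decode(lam, nu, codebook):
    # Upper envelope: largest membership function whose possibility w.r.t. each A_i
    # does not exceed lam_i (Goedel implication applied pointwise, then a minimum).
    upper = np.min(np.where(codebook <= lam[:, None], 1.0, lam[:, None]), axis=0)
    # Lower envelope: smallest membership function whose necessity w.r.t. each A_i
    # is at least nu_i.  The original granule always lies between the two envelopes.
    lower = np.max(np.where(1.0 - codebook >= nu[:, None], 0.0, nu[:, None]), axis=0)
    return lower, upper  # bounds of an interval-valued (type-2) reconstruction

# Toy universe with triangular codebook elements (illustrative values only)
u = np.linspace(0.0, 10.0, 101)
tri = lambda m, s: np.clip(1.0 - np.abs(u - m) / s, 0.0, 1.0)
codebook = np.stack([tri(2.0, 2.5), tri(5.0, 2.5), tri(8.0, 2.5)])  # A_1, A_2, A_3
x = tri(4.0, 1.5)                                                   # original granule X

lower, upper = decode(*encode(x, codebook), codebook)
# Simple performance index: average distance between the interval bounds (smaller is better)
print("mean bound gap:", np.mean(upper - lower))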
This study is also concerned with the design of a collection of meaningful, easily interpretable ellipsoidal information granules with the use of the principle of justifiable granularity, taking into consideration the reconstruction abilities of the designed information granules. The constructed information granules help us better understand the topology of the original experimental data and can also serve as generic building blocks of Granular Computing. The principle of justifiable granularity supports the design of information granules based on numeric or granular evidence and aims at a compromise between the justifiability and the specificity of the granules to be constructed. A two-stage development strategy behind the construction of justifiable information granules is considered. First, a collection of numeric prototypes v1, v2, ..., vc is determined with the use of fuzzy clustering. Second, the lengths of the semi-axes of the ellipsoidal information granules V1, V2, ..., Vc formed around these prototypes are optimized. The resulting information granules help reveal the topology of the original data set and provide richer information than numeric prototypes alone.

We also propose an alternative, augmented way of building information granules by generating hypercube-like information granules. A collection of such hypercubes is referred to as a family of ε-information granules. The family is constructed around numeric prototypes v1, v2, ..., vM generated through a modified version of Fuzzy C-Means (FCM) whose running time is linear with respect to the number of clusters. Next, by admitting a certain level of information granularity ε (0 < ε < 1), a collection of hypercubes V1, V2, ..., VM is formed around these prototypes. The quality of the information granules realized in this way is assessed by involving them in the granulation-degranulation process and determining the value of the coverage criterion. The level of information granularity and the number of granular prototypes in the family of ε-information granules form an important design asset that directly impacts the achieved coverage of the data. The computational facet of the approach is also stressed: it is demonstrated that the granular enhancement of the data description comes with a very limited computing overhead.

Among the various fuzzy models, Takagi-Sugeno (TS) fuzzy models form one of the most intensively studied and applied categories. We are concerned with the development of a granular TS fuzzy model realized on the basis of numeric evidence and completed through a combination of fuzzy subspace clustering and the principle of optimal allocation of information granularity. The TS fuzzy models are built with the use of a fuzzy subspace clustering algorithm. Information granularity is regarded as a crucial design asset whose optimal allocation gives rise to granular fuzzy models and brings the constructed models into better rapport with the experimental data. In comparison with fuzzy models, granular fuzzy models produce outputs that are information granules, which can be interpreted easily, rather than the numeric results encountered in ordinary fuzzy models.

In numerous real-world problems we face difficulties in learning from imbalanced data: the classification performance of a "standard" classifier (learning algorithm) is hindered by the imbalanced distribution of the data. In this thesis, a novel granular under-sampling method is proposed that exploits the concepts and algorithms of Granular Computing. First, information granules are built around selected patterns coming from the majority class to capture the essence of the data belonging to this class. In the sequel, the resulting information granules are evaluated in terms of their quality, and those with the highest specificity values are selected. Next, the selected numeric data are augmented with weights implied by the size of the corresponding information granules. Finally, a Support Vector Machine and a K-Nearest-Neighbor classifier, both regarded here as representative classifiers, are built on the weighted data. The experimental results quantify the performance of the Support Vector Machine and K-Nearest-Neighbor classifiers combined with the granular under-sampling method and demonstrate their superiority over the same classifiers endowed with a conventional under-sampling method; in general, the improvement expressed in terms of G-means exceeds 10% when granular under-sampling is applied instead of random under-sampling.
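The following Python sketch shows one plausible reading of the overall flow of granular under-sampling; it is not the exact algorithm of the thesis. The helper names granular_undersample and fit_weighted_svm, the hyperbox granules spanned by the k nearest majority-class neighbours, and the weight formula 1/(1 + size) are our assumptions; the weighted Support Vector Machine comes from scikit-learn, and only the SVM branch is shown.

import numpy as np
from sklearn.svm import SVC

def granular_undersample(X_maj, n_keep=50, k=10, rng=None):
    # Build hyperbox granules around candidate majority patterns, keep the most
    # specific (smallest) ones, and weight the retained patterns by granule size.
    rng = np.random.default_rng(rng)
    candidates = rng.choice(len(X_maj), size=min(4 * n_keep, len(X_maj)), replace=False)
    sizes = []
    for idx in candidates:
        d = np.linalg.norm(X_maj - X_maj[idx], axis=1)
        nbrs = X_maj[np.argsort(d)[:k]]
        # size of the hyperbox spanned by the k nearest majority-class neighbours
        sizes.append(np.sum(nbrs.max(axis=0) - nbrs.min(axis=0)))
    sizes = np.asarray(sizes)
    order = np.argsort(sizes)[:n_keep]            # highest specificity = smallest granules
    selected = X_maj[candidates[order]]
    weights = 1.0 / (1.0 + sizes[order])          # weight implied by granule size (assumed form)
    return selected, weights

def fit_weighted_svm(X_min, X_maj):
    # X_min / X_maj are assumed to hold the minority and majority class patterns.
    X_sel, w_maj = granular_undersample(X_maj)
    X = np.vstack([X_min, X_sel])
    y = np.hstack([np.ones(len(X_min)), np.zeros(len(X_sel))])
    w = np.hstack([np.ones(len(X_min)), w_maj])
    return SVC(kernel="rbf", gamma="scale").fit(X, y, sample_weight=w)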
Information granules are constructed on top of the original numeric data, at a higher level of abstraction. By combining existing results of Granular Computing, such as the principle of justifiable granularity and the optimal allocation of information granularity, this dissertation addresses a series of fundamental issues of the field: a sound encoding-decoding mechanism endows information granules with good readability; information granules constructed with the principle of justifiable granularity help reveal the characteristics of the original data; and fuzzy models built with the help of the optimal allocation of information granularity deliver good prediction results and convey them to users in a user-friendly manner. We believe that the solutions to these frontier and actively studied issues can provide new inspiration and promote the further development of Granular Computing.
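As a compact illustration of the coverage-specificity compromise underlying the principle of justifiable granularity referred to above, the following one-dimensional Python sketch builds an interval granule around a numeric prototype using one common form of the criterion (the product of coverage and specificity). The helper name justifiable_interval and the synthetic data are ours; the ellipsoidal, multivariate construction in the dissertation optimizes semi-axis lengths in the same spirit, so this is a simplification rather than the thesis algorithm.

import numpy as np

def justifiable_interval(data, prototype):
    # Choose the bounds of an interval granule around the prototype by maximizing
    # coverage (fraction of data covered) times specificity (tightness of the interval).
    span = data.max() - data.min()
    def best_bound(candidates, side):
        best, best_score = prototype, -np.inf
        for b in candidates:
            lo, hi = (b, prototype) if side == "left" else (prototype, b)
            coverage = np.mean((data >= lo) & (data <= hi))   # justifiability
            specificity = 1.0 - abs(b - prototype) / span     # tightness
            score = coverage * specificity
            if score > best_score:
                best, best_score = b, score
        return best
    a = best_bound(data[data <= prototype], "left")
    b = best_bound(data[data >= prototype], "right")
    return a, b

rng = np.random.default_rng(0)
x = rng.normal(5.0, 1.0, 500)                  # synthetic numeric evidence
print(justifiable_interval(x, np.median(x)))   # interval granule around the prototype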
Keywords/Search Tags: Granular Computing, Information Granules, Information Granularity, Encoding-decoding mechanisms, Granular Fuzzy Model