Data Analysis And Learning With Epistemic Uncertainty Modelled By Belief Functions

Posted on: 2017-05-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Y Ma
GTID: 1108330485451541
Subject: Control Science and Engineering
Abstract/Summary:
With the development of information science and technology, more and more data are becoming accessible, much of which carries epistemic uncertainty, i.e., the data may be imprecise, uncertain or unreliable. The proper expression and reasonable analysis of epistemic uncertainty have drawn increasing attention over the past decades. As a general framework for modelling and reasoning with uncertainty, belief function theory has been widely researched and applied since it was proposed. Thanks to the persistent efforts of scholars, research on belief function theory has now moved on to a new stage. The topic of evidential statistical inference has been revived in recent years, especially since 2010, opening up a new direction for the theory.

As a new research field, statistical inference with belief functions still has many research gaps to be filled. Due to the close relation between statistical inference and machine learning, decision trees, whose structures are simple, clear and easy to interpret, are chosen as the research object. To analyze the learning procedures of belief decision trees systematically, classification trees with discrete outputs, and regression trees and linear model trees with continuous outputs, are all extended to training sets with epistemic uncertainty. The resulting belief decision trees relax the precision requirements on training sets and therefore make better use of existing data. In addition, the fusion of continuous belief functions is discussed. By generating discounting factors from the reliability of information sources, belief functions are combined more reasonably in the continuous case.
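As a minimal illustration of how belief function theory represents an imprecise or uncertain observation, the sketch below encodes an uncertain class label as a mass function on a finite frame and computes belief, plausibility and non-specificity. The frame, the mass values and the function names are hypothetical and chosen only for illustration; they are not taken from the dissertation.

```python
import math

def belief(mass, subset):
    """Bel(A): total mass of focal sets entirely contained in A."""
    return sum(m for focal, m in mass.items() if focal <= subset)

def plausibility(mass, subset):
    """Pl(A): total mass of focal sets that intersect A."""
    return sum(m for focal, m in mass.items() if focal & subset)

def nonspecificity(mass):
    """Generalized Hartley measure: sum of m(A) * log2(|A|)."""
    return sum(m * math.log2(len(focal)) for focal, m in mass.items())

# Hypothetical frame of three class labels.
frame = frozenset({"a", "b", "c"})

# Hypothetical uncertain label: probably "a", possibly "a or b",
# with a little mass left on total ignorance (the whole frame).
mass = {
    frozenset({"a"}): 0.6,
    frozenset({"a", "b"}): 0.3,
    frame: 0.1,
}

if __name__ == "__main__":
    for label in sorted(frame):
        singleton = frozenset({label})
        print(label, belief(mass, singleton), plausibility(mass, singleton))
    print("non-specificity:", nonspecificity(mass))
```

A precise label would put all mass on one singleton (non-specificity 0), while a vacuous label would put all mass on the whole frame (non-specificity log2 of the frame size).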
The constructions of the belief regression tree and the belief linear model tree are proposed for the first time, as are querying while learning belief classification trees to reduce uncertainty, data quality measurement via evidential likelihood, and contextual discounting of continuous belief functions.

Decision trees are comprehensively extended to the case of uncertain training instances modelled by belief functions, with the aim of learning well-performing models from low-quality data. Considering the existence of epistemic uncertainty, for classification problems the data quality of the training set is evaluated first. Based on evidential likelihood, the inconsistency, uncertainty and scale of the data are measured by values split from non-specificity. Since the normalized evidential multinomial likelihood equals the contour function of a likelihood-based consonant belief function, the flatness of the contour function can be quantified by the non-specificity of a consonant belief function. Owing to the concavity of the evidential multinomial likelihood, two approaches to constructing the consonant belief function are proposed, along with some practical ways to calculate non-specificity and separate it into different measures. In this way, data inconsistency, data uncertainty and data scale are quantified.

For training instances whose outputs are uncertain class labels, the active belief classification tree learning method is proposed. It not only handles uncertain data, but also reduces epistemic uncertainty by querying the most valuable uncertain instances during learning. Information entropies are hard to calculate when labels are uncertain, so entropy intervals are obtained from the evidential likelihood function. By comparing the information gain ratio intervals obtained via interval operations, the best splitting attribute can be selected.
When no dominant attribute exists, a query strategy is proposed that reduces the interval widths by querying the true labels of some valuable uncertain instances, which helps identify the dominant attribute. Experiments on UCI data sets show that the active belief classification tree performs well in various uncertain cases (vacuous, imprecise, uncertain, noisy, etc.).

Decision trees become regression trees when the outputs are continuous variables. Introducing uncertainty into training instances, belief regression trees, which have consonant models at leaf nodes, and belief linear model trees, whose leaf nodes have linear regression models, are constructed respectively. The E2M algorithm is derived to estimate the parameters of the linear regression model. Two general error calculation methods for uncertain continuous variables are proposed, based on evidence distance and weighted intervals respectively. Comparing all candidate splitting attributes and all possible splitting values at a decision node, the one that maximizes the error reduction is chosen for splitting into two child nodes. Iterating this process leads to a good partition of the uncertain continuous instance space. The proposed belief regression trees and belief linear model trees can better handle common situations such as interval-valued data, uncertain data and unreliable data.

Meanwhile, the fusion procedure of continuous belief functions is also analyzed. For mass functions with finite interval-valued focal sets, a distance measure is defined that accounts for the interaction between the intervals of focal elements. Using the distances among pieces of evidence coming from different sources, discounting rates are generated that help reduce the conflict during combination. More generally, for belief functions with infinite interval-valued focal sets (i.e. basic belief densities), the procedure of contextual discounting is discussed.
Considering that information sources work better in some situations than in others, the evidence provided by an information source is discounted according to meta-knowledge of its reliability and then combined with evidence from other sources.
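The discount-then-combine idea can be sketched on a finite frame. Note that this sketch uses Shafer's classical discounting and Dempster's rule of combination, not the contextual discounting of continuous belief functions studied in the dissertation. The frame, mass values, reliability value and function names are all hypothetical.

```python
from itertools import product

def discount(mass, frame, reliability):
    """Classical (Shafer) discounting: keep a fraction `reliability`
    of each mass and transfer the remainder to the whole frame."""
    out = {focal: reliability * m for focal, m in mass.items() if focal != frame}
    out[frame] = reliability * mass.get(frame, 0.0) + (1.0 - reliability)
    return out

def dempster(m1, m2):
    """Dempster's rule: conjunctive combination followed by normalization.
    Returns the combined mass function and the conflict before normalization."""
    combined = {}
    conflict = 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + x * y
        else:
            conflict += x * y  # mass that would fall on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    normalized = {k: v / (1.0 - conflict) for k, v in combined.items()}
    return normalized, conflict

# Hypothetical two-element frame and two strongly disagreeing sources.
frame = frozenset({"a", "b"})
source1 = {frozenset({"a"}): 0.9, frame: 0.1}
source2 = {frozenset({"b"}): 0.8, frame: 0.2}

# Combination without discounting.
_, conflict_raw = dempster(source1, source2)

# Assume (hypothetically) that source 2 is only 50% reliable.
source2_discounted = discount(source2, frame, reliability=0.5)
fused, conflict_discounted = dempster(source1, source2_discounted)

if __name__ == "__main__":
    print("conflict without discounting:", conflict_raw)
    print("conflict with discounting:   ", conflict_discounted)
    print("fused mass function:", fused)
```

Discounting the less reliable source moves part of its mass to the whole frame, so less mass falls on empty intersections and the conflict during combination is lower.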
Keywords/Search Tags:epistemic uncertainty, belief function theory, decision trees, information fusion, evidential likelihood function, continuous belief function