Research On Cost-Sensitive Granular Computing Approaches

Posted on: 2019-06-14    Degree: Doctor    Type: Dissertation
Country: China    Candidate: S J Liao    Full Text: PDF
GTID: 1318330569987466    Subject: Software engineering
Abstract/Summary:
Cost-sensitive learning is one of the challenging issues in data mining and machine learning. It takes real cost information into account during data processing and therefore has strong practical significance. Granular computing is a new research method for information processing and knowledge discovery. It transforms a complex problem into several relatively simple problems, which facilitates analyzing and solving the complex problem. Combining cost-sensitive learning with granular computing is an effective way to solve practical problems. Based on the theories of neighborhoods and rough sets, this dissertation studies some key issues of cost-sensitive granular computing. The main innovative results are outlined as follows:

1. Put forward a series of approaches that simultaneously select attributes and attribute-value granularities based on measurement errors and variable costs (for simplicity, the phrase "attributes and attribute-value granularities" is sometimes denoted as "attribute & granularity" in this dissertation). Among existing papers on cost-sensitive attribute selection (attribute selection is also called attribute reduction in rough sets), only a small fraction involve the selection of attribute-value granularity, and all of them consider only a single granularity and do not address hybrid data or cost constraints. Besides, the test cost of an attribute and the misclassification cost of an object are often assumed to be fixed values in existing cost-sensitive learning. In the real world, however, the test cost usually varies with the granularity of attribute values, while the misclassification cost varies with the object being dealt with. Accordingly, this dissertation uses the confidence level of the observational errors of attribute values to measure the attribute-value granularity. The relationship among the error confidence level, the test cost and the misclassification cost is taken into consideration, and the method for computing the average total cost (namely the average of the total cost over the objects in the universe) consumed in data processing is discussed. On this basis, with the aim of minimizing the average total cost, approaches based on measurement errors and variable costs that simultaneously select attributes and attribute-value granularities are studied from both the single-granularity and the multi-granularity perspectives. The proposed approaches are as follows:

(1) An error-confidence-level-based adaptive neighborhood model is constructed for hybrid data, and several kinds of variable test cost functions and misclassification cost functions are discussed according to real situations. On this basis, effective single-granularity algorithms that simultaneously select attributes and the attribute-value granularity are proposed, both for the case where the test cost is constrained and for the case where it is not. Experiments on multiple UCI datasets validate the effectiveness of the algorithms. In particular, the influence of different cost settings on the selected optimal attribute set and the optimal attribute-value granularity is also explored, which provides feasible schemes for decision making. (Chapter 3)

(2) Considering that different attributes may have different attribute-value granularities in real applications, this dissertation makes a preliminary study of the multi-granularity cost-sensitive approach that simultaneously selects attributes and attribute-value granularities. An error-confidence-level-vector-based neighborhood rough set model is constructed, and a significance function is presented for any attribute & granularity pair (namely the ordered pair of an attribute and an attribute-value granularity). On this basis, an efficient multi-granularity algorithm that simultaneously selects attributes and attribute-value granularities is put forward. Experimental results show that the proposed multi-granularity approach significantly outperforms existing single-granularity approaches in solving practical problems. (Chapter 4)

2. Present inconsistent-neighborhood-based attribute reduction approaches in the neighborhood rough set. Firstly, the concept of the inconsistent neighborhood is introduced, and its relations to existing fundamental concepts in the neighborhood rough set are discussed. As a result, new formulations are obtained for the lower and upper approximations, the positive region, the boundary region, etc. Then, efficient cost-insensitive and test-cost-sensitive attribute reduction algorithms are designed by using the properties of the inconsistent neighborhood. The theoretical and experimental analyses show that, to some extent, using inconsistent neighborhoods is advantageous over using traditional neighborhoods in the relevant computations of the neighborhood rough set. (Chapter 5)

3. Propose a cost-sensitive attribute reduction approach in the decision-theoretic rough set (DTRS). The positive region in DTRS may contract as attributes are added, which makes attribute reduction in DTRS difficult. By considering this characteristic of DTRS together with decision costs and test costs, this dissertation constructs the theoretical model for the cost-sensitive attribute reduction approach in DTRS, in particular giving the cost-sensitive attribute-subset significance function. In addition, a backtracking attribute reduction algorithm and a heuristic one are proposed, both of which aim at minimizing the average total cost. Experimental results demonstrate the effectiveness of the proposed algorithms. (Chapter 6)

To sum up, this dissertation studies some key problems of cost-sensitive granular computing. For each problem, we build the theoretical model, propose effective algorithms, and conduct experiments for validation. The study enriches the theoretical and methodological system of cost-sensitive granular computing, and lays an important foundation for our further work.
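
The notion of an average total cost over the objects in the universe can be illustrated with a small Python sketch. It is not the dissertation's model: the abstract does not give the cost functions or the error-confidence-level neighborhood, so the sketch assumes a fixed neighborhood radius (delta), fixed per-attribute test costs, a single fixed misclassification cost, and the simplification that an object incurs the misclassification cost whenever its neighborhood contains an object of another class. All function and parameter names are hypothetical.

```python
from itertools import combinations


def neighborhood(data, i, attrs, delta):
    """Indices of objects within delta of object i on every attribute in attrs.

    Assumption: attributes are numeric and share one fixed radius delta.
    With no attributes selected, every object is a neighbor.
    """
    return [j for j, row in enumerate(data)
            if all(abs(row[a] - data[i][a]) <= delta for a in attrs)]


def average_total_cost(data, labels, attrs, delta, test_costs, mis_cost):
    """Test cost of attrs plus the average misclassification cost per object.

    Assumption: an object is charged mis_cost when its neighborhood
    contains at least one object with a different label.
    """
    test = sum(test_costs[a] for a in attrs)
    inconsistent = sum(
        1 for i in range(len(data))
        if any(labels[j] != labels[i]
               for j in neighborhood(data, i, attrs, delta))
    )
    return test + mis_cost * inconsistent / len(data)


def best_subset(data, labels, delta, test_costs, mis_cost):
    """Exhaustive search for the attribute subset with minimal average total cost.

    Exponential in the number of attributes; meant for tiny examples only.
    """
    m = len(test_costs)
    best = (average_total_cost(data, labels, (), delta, test_costs, mis_cost), ())
    for k in range(1, m + 1):
        for attrs in combinations(range(m), k):
            cost = average_total_cost(data, labels, attrs, delta,
                                      test_costs, mis_cost)
            if cost < best[0]:
                best = (cost, attrs)
    return best


# Toy universe: attribute 0 separates the classes, attribute 1 does not.
data = [[0.1, 0.9], [0.2, 0.1], [0.8, 0.8], [0.9, 0.2]]
labels = [0, 0, 1, 1]
print(best_subset(data, labels, delta=0.15, test_costs=[1.0, 5.0], mis_cost=10.0))
```

In this toy setting the cheap attribute alone already yields consistent neighborhoods, so adding the expensive attribute only raises the test cost. The exhaustive search stands in for the dissertation's algorithms, whose details the abstract does not describe.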
Keywords/Search Tags: cost-sensitive, granular computing, attribute selection, attribute-value granularity selection, rough set