
Maximum Information Gain Relief Algorithm And Its Application On Telecommunication Data Feature Selection

Posted on: 2021-05-18
Degree: Master
Type: Thesis
Country: China
Candidate: Q Zhang
Full Text: PDF
GTID: 2428330623978265
Subject: Computational Mathematics
Abstract/Summary:
With the continuous development of the information society and the arrival of the era of big data, the rise of artificial intelligence (AI) is dramatically changing how we think and live. Behind artificial intelligence lie complex data mining techniques and in-depth research on machine learning. In practical applications of machine learning, too many data features easily lead to the "curse of dimensionality", which reduces the efficiency of data analysis, makes model training too slow and the model too complicated, and weakens generalization ability. Feature selection eliminates irrelevant or redundant features, which reduces the number of features, improves model accuracy, and shortens running time. In addition, selecting the truly relevant features simplifies the model and makes the data-generating process easier to understand.

The classic Relief algorithm is a widely used filter-type feature selection method. It assigns each feature a weight according to the correlation between that feature and the class label; during feature selection, features whose weight falls below a given threshold are removed. In the classic Relief algorithm, the correlation between a feature and the class is based on the feature's ability to distinguish between nearby samples. The running time of the classic Relief algorithm grows linearly with the number of samples and the number of original features, so it is very efficient. Relief is a family of algorithms: it includes the originally proposed classic Relief and the later extensions Relief-F and RRelief-F. The classic Relief algorithm addresses two-class problems, Relief-F handles multi-class problems, and RRelief-F targets regression problems in which the target attribute is continuous. Maximum entropy Relief feature weighting, referred to as the ME-Relief method, combines margin maximization with the maximum entropy principle and has better adaptability and robustness. For data sets that grow over time, the ME-Relief algorithm has been extended to an online version that can handle multi-class data and online data.

This paper proposes a new algorithm, maximum information gain Relief, namely MIG-Relief. First, a new objective function is constructed based on information gain and margin maximization. Then, a new fuzzy difference measure with better smoothness is introduced into the objective function, which reduces the influence of parameters on the optimization objective and improves the adaptability of the new algorithm to data. In addition, the mathematical form and application of the new algorithm are studied in more detail.
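
The abstract does not state the MIG-Relief objective function, so it is not reproduced here. As a rough illustration of the classic two-class Relief weighting that the thesis builds on, a minimal Python sketch might look like the following. The names `relief_weights` and `select_features`, the parameters `n_iter` and `seed`, the scaling of features to [0, 1], and the use of Manhattan distance are all assumptions made for this example and are not taken from the thesis.

```python
import numpy as np


def relief_weights(X, y, n_iter=None, seed=0):
    """Classic two-class Relief feature weights (illustrative sketch only).

    Assumes every class contains at least two samples, so that each
    sampled point has both a nearest hit and a nearest miss.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_samples, n_features = X.shape

    # Scale each feature to [0, 1] so per-feature differences are comparable.
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # constant features: avoid division by zero
    Xs = (X - lo) / span

    rng = np.random.default_rng(seed)
    n_iter = n_samples if n_iter is None else n_iter
    w = np.zeros(n_features)

    for _ in range(n_iter):
        i = rng.integers(n_samples)            # randomly sampled instance
        dist = np.abs(Xs - Xs[i]).sum(axis=1)  # Manhattan distance to it
        dist[i] = np.inf                       # never pick the instance itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))    # nearest same-class
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class
        # Reward features that differ on the miss, penalize those that
        # differ on the hit.
        w += (np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])) / n_iter
    return w


def select_features(weights, threshold):
    """Indices of features whose weight is at least the given threshold."""
    return np.flatnonzero(np.asarray(weights) >= threshold)
```

Each iteration costs time proportional to the number of samples times the number of features, which is consistent with the linear growth in running time described above. In this sketch, a feature that separates nearby samples of different classes accumulates a large positive weight, while a feature that varies mostly within a class drifts toward zero or below and is dropped by `select_features`.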
Keywords/Search Tags:Relief algorithm, Information gain, Feature weighting, Feature selection