
Research On Accelerated Relief Algorithm Based On Information Granulation

Posted on: 2021-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: D Wang
GTID: 2428330626955478
Subject: Computer technology
Abstract/Summary:
With the rapid development of Internet technology, the diversification of information and the speed at which it is generated have led to explosive growth in data. Such large volumes of data are bound to contain a great deal of valuable information, and data mining is the process of "sifting sand for gold" in these data. Within data mining, classification is a widely studied problem. Feature selection is an important preprocessing step: by eliminating redundant or irrelevant features, it can improve model accuracy, reduce the number of features, and improve computational efficiency, making it easier for researchers to obtain useful information. The Relief algorithm and its derivatives have been shown to be a successful class of feature selectors. Unlike global search and heuristic search methods, Relief measures the ability of each feature to distinguish samples based on the classification margin, which makes it a simple and effective feature weighting method. However, its computational cost is still high when processing large amounts of data. How to build a model that makes the Relief algorithm applicable to data of all sizes is therefore a research focus. Based on information granulation, and drawing on the inherent feature weighting mechanism of the Relief algorithm, this thesis studies how to improve the efficiency of Relief from two perspectives: sample granulation and support vector granulation. The main work is summarized as follows:

(1) From the perspective of sample granulation, and based on the potential relationship between the feature weighting mechanism of the Relief algorithm and the distribution of samples in the sample space, a fast Relief algorithm based on sample granulation is proposed. The algorithm overcomes the limitation that traditional Relief algorithms rely on all of the data. Using knowledge granularity and Shannon entropy as evaluation indices, the original data are compressed from the perspective of information granulation so as to narrow the sampling range. Experiments show that, compared with the existing Relief algorithm, the proposed algorithm significantly reduces running time while maintaining the performance of the subsequent classification algorithm.

(2) From the perspective of support vector granulation, a fast Relief algorithm based on support vector granulation is proposed. Taking random sampling as its entry point, the algorithm examines the relationship between support vectors and the classification decision plane, takes all support vectors as the sampling range, granulates the support vectors, and extracts a small number of samples for subsequent operations. Experiments show that the time efficiency of this algorithm is clearly better than that of the existing Relief algorithm.
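For context on the cost the thesis aims to reduce, the following is a minimal sketch of the classic two-class Relief weighting scheme, not the accelerated variants proposed in the thesis. The function name `relief_weights`, its parameters, and the choice of a range-scaled Manhattan distance are assumptions made for illustration. Each sampled instance is compared against every other instance to find its nearest hit (same class) and nearest miss (other class), so the cost grows with the number of sampled instances times the dataset size times the number of features.

```python
import random


def relief_weights(X, y, n_samples, seed=0):
    """Classic Relief feature weights (illustrative sketch, numeric features)."""
    n, d = len(X), len(X[0])

    # Per-feature range, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(j, a, b):
        # Scaled difference between two instances on feature j.
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        # Manhattan distance over scaled features (an assumed choice).
        return sum(diff(j, a, b) for j in range(d))

    rng = random.Random(seed)
    w = [0.0] * d
    for _ in range(n_samples):
        i = rng.randrange(n)
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # no same-class or other-class neighbour available
        # Full scan over the data: this is the expensive step.
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            # Reward features that differ on the miss, penalise those that differ on the hit.
            w[j] += (diff(j, X[i], X[miss]) - diff(j, X[i], X[hit])) / n_samples
    return w


# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
X_demo = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.8], [1.0, 0.2], [0.8, 0.5]]
y_demo = [0, 0, 0, 1, 1, 1]
print(relief_weights(X_demo, y_demo, n_samples=20))
```

On the toy data, the weight of the class-separating feature comes out higher than that of the uninformative one. Restricting the pool of candidate instances, as the granulation-based approaches in the abstract do, would shrink the neighbour scans in this loop.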
Keywords/Search Tags: Feature selection, Relief, Information granulation, SVM, Information entropy