
Research And Application Of Dimension Reduction Method Based On Feature Selection

Posted on: 2019-02-14
Degree: Master
Type: Thesis
Country: China
Candidate: J Zeng
GTID: 2428330563991153
Subject: Mechanical Manufacturing and Automation
Abstract/Summary:
With the rapid development of the Internet and information science, intelligent terminal devices of many kinds are widely used in daily life, and the data they generate has grown explosively. Data mining on such data often runs into the curse of dimensionality. The data contains a large amount of irrelevant or noisy features, and training machine learning models on it leads to low computational efficiency, overfitting, and other problems. Dimensionality reduction addresses these problems. This thesis studies different types of dimensionality reduction methods and discusses the running efficiency and application scenarios of the algorithms.

Firstly, rough set theory and its attribute reduction principle are introduced and combined with a particle swarm search over feature subsets, giving an attribute reduction algorithm based on particle swarm optimization (PSOFS). In the particle swarm algorithm, the position information of the population becomes too concentrated during the iterative search, so the algorithm easily falls into a local optimum. The particle update rules are therefore improved, and an attribute reduction algorithm for quantum particle swarms based on gene crossover and mutation update rules (QPSOFS) is implemented. This expands the search domain and helps avoid the tendency of the PSO algorithm to become trapped in local optima. Experiments were performed on classical UCI data sets, the performance of the algorithms was compared, and the effectiveness of the quantum particle swarm attribute reduction algorithm was verified.

Secondly, the theory of random forests and the definitions of feature importance are introduced. The sensitivity of feature importance to algorithm parameters, noise features, and highly correlated features is discussed, and the stability of the random forest importance evaluation is verified. A heuristic feature selection method based on random forest importance (RFFS) is implemented, and its validity is verified on UCI data sets.

Finally, applications of the algorithms are discussed. In facial semantics recognition, semantic features were extracted using the PSOFS, QPSOFS, and RFFS algorithms respectively, and a simple semantic recognition rule was established from the best result, which was obtained by RFFS. In face image classification, RFFS addresses the low classification accuracy obtained with features from PCA dimensionality reduction. These applications verify the effectiveness of the proposed algorithms.
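
The abstract does not state which fitness function PSOFS and QPSOFS optimize. As an illustration only, the sketch below computes the standard rough-set dependency degree, gamma_B(D) = |POS_B(D)| / |U|, which is a common way to score a candidate attribute subset B against a decision attribute D. The function name `dependency_degree` and the toy decision table are assumptions made for this example and are not taken from the thesis.

```python
from collections import defaultdict


def dependency_degree(rows, cond_idx, decision_idx):
    """Rough-set dependency degree gamma_B(D) = |POS_B(D)| / |U|.

    rows         -- list of tuples, one per object in the universe U
    cond_idx     -- column indices of the candidate attribute subset B
    decision_idx -- column index of the decision attribute D
    """
    if not rows:
        return 0.0
    # Group objects into equivalence classes by their values on B.
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[i] for i in cond_idx)
        classes[key].append(row[decision_idx])
    # A class lies in the positive region when all of its objects
    # share the same decision value.
    positive = sum(len(d) for d in classes.values() if len(set(d)) == 1)
    return positive / len(rows)


# Toy decision table (made up for illustration):
# columns 0-2 are condition attributes, column 3 is the decision.
table = [
    (0, 0, 1, "no"),
    (0, 1, 1, "yes"),
    (1, 0, 0, "no"),
    (1, 1, 0, "yes"),
    (0, 0, 0, "no"),
]

full = dependency_degree(table, [0, 1, 2], 3)  # all condition attributes
only_b = dependency_degree(table, [1], 3)      # attribute 1 alone
only_a = dependency_degree(table, [0], 3)      # attribute 0 alone
```

In this toy table, attribute 1 alone reaches the same dependency degree as the full attribute set, so it would be a valid reduct. A search procedure such as PSO would look for small subsets that keep this score as high as the full set.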
Keywords/Search Tags: Data reduction, Feature selection, Rough set, Particle Swarm Optimization, Random forest