
Research On Data Dimension Reduction Algorithm Based On Feature Selection

Posted on: 2018-09-27
Degree: Master
Type: Thesis
Country: China
Candidate: D L Yu
Full Text: PDF
GTID: 2348330515479800
Subject: Signal and Information Processing
Abstract/Summary:
The rapid development of computer technology has led to explosive growth in the information we obtain. One survey reports that the amount of data acquired over the past half century equals the total obtained over all of earlier human history. We are surrounded by big data. These data are usually high-dimensional, and the growth in dimensionality places a heavy burden on subsequent computation, leading to the curse of dimensionality. To extract valuable information from such data, feature selection and data dimensionality reduction have become active research topics. The basic idea of dimensionality reduction is to map high-dimensional samples in the input space into a low-dimensional space, yielding a low-dimensional representation of the original data. Dimensionality reduction is now an important tool in machine learning, data mining, artificial intelligence and computer vision. Building on the ReliefF feature selection algorithm, this thesis combines two data reduction algorithms with the properties of submodular optimization, and studies the application of feature-selection-based dimensionality reduction to text and image feature selection. The main contributions of this thesis are as follows.

(1) A feature selection method based on PCA-ICA and ReliefF is proposed for face images. Because PCA cannot preserve the high-order information of face images, the ReliefF algorithm is first used to select the optimal feature subset. PCA then reduces the dimensionality, and ICA is applied to the PCA-reduced data. The image feature set produced by the two reduction algorithms is sent to a classifier for training. Experimental results show that the selected image feature subset gives better classification results than the feature selection algorithms provided by ASU.

(2) A feature selection method based on submodular optimization is proposed for text feature selection. The text feature set is first preprocessed; the selection problem is then solved by exploiting the properties of submodular function maximization, and the final feature subset is selected by a greedy algorithm. Experimental results show that the resulting feature subset is better than the feature subsets extracted by the feature selection algorithms provided by ASU.
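The greedy step in (2) can be illustrated with a minimal sketch. The abstract does not state which submodular objective the thesis uses, so the code below assumes a simple document-coverage objective (the number of documents containing at least one selected term), which is monotone and submodular. The names `coverage`, `greedy_select` and `toy_term_docs`, and the toy data, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def coverage(selected: List[str], term_docs: Dict[str, Set[int]]) -> int:
    """Assumed objective: number of documents containing at least one selected term."""
    covered: Set[int] = set()
    for term in selected:
        covered |= term_docs[term]
    return len(covered)


def greedy_select(term_docs: Dict[str, Set[int]], k: int) -> List[str]:
    """Pick up to k terms, each round adding the term with the largest marginal gain."""
    selected: List[str] = []
    candidates = sorted(term_docs)  # sorted so that ties are broken deterministically
    for _ in range(k):
        base = coverage(selected, term_docs)
        best_term, best_gain = None, 0
        for term in candidates:
            if term in selected:
                continue
            gain = coverage(selected + [term], term_docs) - base
            if gain > best_gain:
                best_term, best_gain = term, gain
        if best_term is None:  # no remaining term adds coverage, so stop early
            break
        selected.append(best_term)
    return selected


# Toy, made-up corpus (not from the thesis): term -> ids of documents containing it.
toy_term_docs = {
    "model": {0, 1, 2, 3},
    "image": {2, 3, 4},
    "text": {4, 5},
    "data": {0, 1},
}
chosen = greedy_select(toy_term_docs, k=2)
```

For a monotone submodular objective under a size limit k, this kind of greedy selection is known to reach at least a (1 - 1/e) fraction of the optimal objective value, which is one common reason for pairing submodular objectives with a greedy algorithm. The thesis may use a different objective or constraint.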
Keywords/Search Tags: Feature Selection, ReliefF Algorithm, Submodular Optimization, Image Classification