With advances in science and technology, techniques such as fingerprint recognition, face recognition, iris recognition, DNA analysis, and remote sensing are developing rapidly. The volume of collected data keeps growing, yet the share of it that carries target information keeps shrinking. Conventional data analysis and processing algorithms often fall short of the desired results on such data, or fail altogether. The usual remedy is to extract or select features from the dataset before analysis and processing: irrelevant and noisy features are eliminated, the most informative and most relevant features are retained, and the later stages of the workflow become easier. Building on an analysis of commonly used feature extraction methods, this paper proposes algorithms that address the problems of small sample size and high computational complexity. The main contents are as follows:

1. An FPCR-ISODATA algorithm (forward principal component rotation combined with the iterative self-organizing data analysis technique algorithm), based on PCA, is proposed. The core of the algorithm is to apply a forward principal component rotation and then run the iterative self-organizing data analysis algorithm on the result. It is successfully applied to remote-sensing data processing to extract objects of interest. Because adjacent spectral bands (for example, those in the visible range) are correlated, the principal component (PC) rotation is introduced to produce uncorrelated components, and ISODATA is then used to distinguish different types of ground cover, so that different ground objects are extracted automatically. The approach has broad application prospects in real-time environmental monitoring by remote sensing.

2. Insufficient sample size is a widespread problem in practical data collection. Based on singular value decomposition (SVD) and independent component analysis (ICA), a new feature extraction method, decision-variable analysis (DVA), is proposed. After the influence of the phenotype of interest is removed, independent component analysis is performed on the residual data to extract key feature information. Applied to gene data in which the number of features far exceeds the sample size, the proposed algorithm effectively extracts important features and helps identify target genes (pathogenic or cancer-related genes).
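
As a rough illustration of the first contribution, the following minimal Python sketch chains a principal component rotation with a simplified ISODATA-style clustering loop. It is not the implementation evaluated in this work. The function names (`pc_rotation`, `nearest_center`, `isodata_like`), the seeding rule, the thresholds `min_size` and `merge_dist`, and the synthetic four-band pixels are all assumptions made for illustration, and the split step of full ISODATA is omitted.

```python
import numpy as np


def pc_rotation(pixels, n_components):
    """Project mean-centered spectra onto their leading principal axes."""
    centered = pixels - pixels.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def nearest_center(points, centers):
    """Index of the closest center for every point (Euclidean distance)."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def isodata_like(points, k_init=6, min_size=5, merge_dist=10.0, n_iter=20):
    """Simplified ISODATA-style loop: assign, discard, update, merge (no split)."""
    # Assumed seeding rule: points at evenly spaced ranks along the first component.
    order = np.argsort(points[:, 0])
    ranks = np.linspace(0, len(points) - 1, k_init).astype(int)
    centers = points[order[ranks]]
    for _ in range(n_iter):
        # Discard clusters with fewer than min_size members.
        labels = nearest_center(points, centers)
        counts = np.bincount(labels, minlength=len(centers))
        keep = counts >= min_size
        if not keep.any():
            keep[counts.argmax()] = True  # never discard every cluster
        centers = centers[keep]
        # Reassign the points and move each center to the mean of its members.
        labels = nearest_center(points, centers)
        counts = np.bincount(labels, minlength=len(centers))
        centers = np.stack([points[labels == j].mean(axis=0)
                            for j in range(len(centers))])
        # Merge the closest pair of centers while it is nearer than merge_dist.
        while len(centers) > 1:
            gaps = np.linalg.norm(centers[:, None] - centers[None, :], axis=2)
            np.fill_diagonal(gaps, np.inf)
            i, j = np.unravel_index(gaps.argmin(), gaps.shape)
            if gaps[i, j] >= merge_dist:
                break
            total = counts[i] + counts[j]
            merged = (counts[i] * centers[i] + counts[j] * centers[j]) / total
            centers = np.vstack([np.delete(centers, [i, j], axis=0), merged])
            counts = np.append(np.delete(counts, [i, j]), total)
    return nearest_center(points, centers), centers


# Synthetic stand-in for remote-sensing pixels: three ground-cover types,
# four strongly correlated bands, unit-variance noise (illustrative values only).
rng = np.random.default_rng(0)
band_means = np.array([[10, 12, 11, 9],
                       [40, 42, 41, 39],
                       [70, 72, 71, 69]], dtype=float)
pixels = np.vstack([m + rng.normal(size=(50, 4)) for m in band_means])

components = pc_rotation(pixels, n_components=2)  # uncorrelated components
labels, centers = isodata_like(components)        # one label per pixel
print(len(centers), np.bincount(labels))
```

On this well-separated synthetic input the loop starts from six seeds and should settle on three clusters, one per simulated ground-cover type. Real imagery would need thresholds tuned to the scene, and a faithful ISODATA would also split clusters whose spread is too large.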