| High dimensionality is an important problem in pattern recognition and machine learning. Severe feature redundancy and high noise are the fundamental reasons why high-dimensional data are difficult to analyze. A large number of redundant features and noise not only dramatically increases computational cost but can also degrade the generalization performance of data analysis methods. In addition, the collinearity that arises among redundant features may lead to model selection errors in high-dimensional data analysis. Feature selection and feature extraction deal with these problems effectively and have become an indispensable part of high-dimensional data analysis. With the rapid growth of high-dimensional data in different fields, research on feature selection methods, especially more efficient ones, is receiving increasing attention. In this paper, we use several models based on correlation analysis to develop new, effective feature selection methods. The main work and innovations of this paper are as follows: 1. A feature selection method based on maximum correlation information (MCI-RFE) is proposed. This method evaluates the importance of each feature by maximizing the correlation between the feature space and the class coding space: the greater a feature's contribution to this correlation, the more important the feature. MCI-RFE quickly removes irrelevant and redundant features, which improves the recognition performance of the classifier at low time complexity. 2. A feature selection method based on the importance of projections onto several orthogonal components of the feature space is proposed. This method derives feature importance from several orthogonal components extracted from the feature space according to the correlation between the feature space and the class coding space. Extracting multiple components is designed to improve the robustness of the feature selection algorithm and increase its resistance to noise. 3. The recursive feature elimination (RFE) strategy is introduced into the partial maximum correlation information (PMCI) method, giving a feature selection method named PMCI-RFE. Experiments show that PMCI-RFE has better computational efficiency on multi-class high-dimensional data. RFE effectively eliminates redundant features, so the best recognition performance is reached with a smaller feature subset. Statistical tests also show that PMCI-RFE has good robustness. The proposed methods are validated on protein structure identification and microarray data identification. They can be used for high-dimensional biological data analysis and to assist biomedical information mining, and they can also be applied to high-dimensional data analysis in other areas. |
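
The general idea of ranking features by their contribution to the correlation between the feature space and the class coding space, combined with recursive elimination, can be illustrated with a minimal sketch. This is not the MCI-RFE or PMCI-RFE algorithm itself, whose details are not given in this abstract. It assumes a one-hot class coding, takes the leading singular vector of the cross-covariance matrix as the maximally correlated direction, and uses the squared weight of each feature in that direction as its importance. The function names `one_hot`, `correlation_scores`, and `correlation_rfe` and the parameter `drop_frac` are hypothetical.

```python
import numpy as np

def one_hot(labels):
    """Encode class labels as a one-hot class coding matrix (assumed coding)."""
    classes = np.unique(labels)
    return (labels[:, None] == classes[None, :]).astype(float)

def correlation_scores(X, Y):
    """Score each feature by its squared weight in the direction of X
    that has maximal covariance with the class coding Y.
    This criterion is an illustrative assumption, not the paper's definition."""
    Xc = X - X.mean(axis=0)
    std = Xc.std(axis=0)
    std[std == 0] = 1.0          # guard against constant features
    Xc = Xc / std
    Yc = Y - Y.mean(axis=0)
    # Leading left singular vector of the feature/class cross-covariance.
    U, _, _ = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    return U[:, 0] ** 2

def correlation_rfe(X, labels, n_keep, drop_frac=0.5):
    """Recursively drop the lowest-scoring features until n_keep remain."""
    Y = one_hot(np.asarray(labels))
    active = np.arange(X.shape[1])
    while active.size > n_keep:
        scores = correlation_scores(X[:, active], Y)
        n_drop = max(1, int(drop_frac * active.size))
        n_drop = min(n_drop, active.size - n_keep)
        order = np.argsort(scores)            # ascending importance
        active = np.sort(active[order[n_drop:]])
    return active

# Synthetic check: only features 3 and 7 carry class information.
rng = np.random.default_rng(0)
labels = np.arange(200) % 2
X = rng.normal(size=(200, 30))
X[:, 3] += 3 * labels
X[:, 7] -= 3 * labels
selected = correlation_rfe(X, labels, n_keep=2)
```

Re-scoring the surviving features at every round is what distinguishes recursive elimination from a single-pass ranking: a feature's weight can change once redundant competitors have been removed.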