With the rapid development of information acquisition and storage technology, multi-view data in many fields is growing exponentially. Multi-view data usually refers to data with heterogeneous features from multiple sources, which describe objects from different perspectives and carry more accurate and comprehensive semantic information. However, the high-dimensional features of multi-view data inevitably increase computing and storage costs during multi-view learning. In addition, high-dimensional features may contain outliers, noise, and irrelevant or redundant features, which adversely affect subsequent learning tasks such as classification and clustering. Therefore, unsupervised multi-view feature selection has attracted wide attention for its effectiveness and better interpretability in processing high-dimensional features. It aims to select important feature subsets from the original feature space to improve the performance of downstream tasks without access to reliable label information. Although existing unsupervised multi-view feature selection methods have achieved rich results, several problems remain. For example, existing methods often ignore the higher-order complementary and consistent information between views during learning, which leads to sub-optimal performance. In addition, most existing methods construct the similarity matrix in the original feature space to obtain pseudo-label information that guides feature selection, yet the large amount of noise in the original data inevitably hinders the exploration of a reliable latent similarity structure. To address these issues, this thesis designs two general unsupervised multi-view feature selection frameworks. The main research contents are summarized as follows: (1) In this thesis, we present an unsupervised multi-view feature selection method
based on consensus-guided low-rank tensor learning, which integrates graph learning and feature selection into a unified framework. Specifically, our approach first learns a pseudo-label matrix for each view by preserving the local manifold structure, and then stacks these matrices into a third-order tensor with a low-rank constraint to explore the higher-order correlations between views. Then, to exploit the consistent information among views, the view-specific pseudo cluster label matrices are used to reconstruct the consensus graph matrix, on which a rank constraint is imposed so that it has an optimal clustering structure. Meanwhile, the model uses the consensus graph with this optimal cluster structure to obtain reliable pseudo-label information and combines it with a sparse regression model to learn a high-quality feature selection matrix. Two types of multi-view datasets, machine learning datasets and single-cell multi-omics datasets, are used to verify the superiority of the proposed method. In addition, a case study on an ovarian cancer dataset further confirms the applicability of the method in identifying non-redundant and representative features. (2) In this thesis, we propose an unsupervised multi-view feature selection model based on tensor robust principal component analysis and consensus graph learning. First, the model uses tensor robust principal component analysis to learn a set of denoised view-specific similarity matrices, and the resulting low-rank tensor captures the high-order correlations between views. Meanwhile, in a unified framework, a high-quality consensus similarity matrix is adaptively learned from the representation of each view to capture the local manifold structure shared among views. To enhance the discriminative ability of the feature selection matrix, we further impose a rank constraint on the consensus similarity matrix to obtain a reliable pseudo cluster label matrix. In this thesis, an effective
alternating optimization algorithm based on the alternating direction method of multipliers (ADMM) is designed to solve the objective function. Experimental results on six multi-view datasets confirm the superiority of our method.
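
Both frameworks combine pseudo-label information with a sparse regression model to learn a feature selection matrix. The sketch below illustrates only that generic step for a single view, under stated assumptions: the pseudo-label matrix F is taken as given (in the thesis it comes from the learned consensus graph), row sparsity is assumed to be enforced with an l2,1-norm penalty, and the solver is a common iteratively reweighted least squares scheme rather than the ADMM-based algorithm of the thesis. The names `l21_sparse_regression`, `rank_features`, `lam`, and `eps` are hypothetical.

```python
import numpy as np


def l21_sparse_regression(X, F, lam=1.0, n_iter=50, eps=1e-8):
    """Approximately solve min_W ||X W - F||_F^2 + lam * ||W||_{2,1}.

    X: (n_samples, n_features) feature matrix of one view.
    F: (n_samples, n_clusters) pseudo-label matrix, assumed given here.
    Uses iteratively reweighted least squares (an assumed solver).
    """
    d = X.shape[1]
    D = np.eye(d)  # diagonal reweighting matrix, refreshed every iteration
    XtX = X.T @ X
    XtF = X.T @ F
    for _ in range(n_iter):
        # Closed-form update of W for the current weights D
        W = np.linalg.solve(XtX + lam * D, XtF)
        # Rows with small norm get a large weight and are shrunk further
        row_norms = np.sqrt(np.sum(W ** 2, axis=1) + eps)
        D = np.diag(1.0 / (2.0 * row_norms))
    return W


def rank_features(W):
    """Return feature indices sorted by decreasing row l2-norm of W."""
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(-scores)
```

Features whose rows in W have the largest norms contribute most to reconstructing the pseudo-labels, so keeping the first k indices returned by `rank_features` gives the selected subset.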