Research Based On Joint Embedded Learning And Regression Methods And Its Application In Cancer Omics Data

Posted on: 2020-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: S S Wu
GTID: 2430330575957144
Subject: Communication and Information System
Abstract/Summary:
Cancer data are usually high-dimensional with small sample sizes, which makes them difficult to mine, and key information about cancer is hidden in these high-dimensional data. To mine this information, effective dimensionality reduction is important and has become an active research topic. In bioinformatics, feature selection is a widely used dimensionality reduction approach; one example is Joint Embedding Learning and Sparse Regression (JELSR). However, traditional feature selection methods have several drawbacks in cancer data analysis. First, genomic data contain a large amount of noise and many redundant values, so the sparsity achieved by the algorithm is far from satisfactory. Second, the error term is constrained by a squared term, which makes the algorithm extremely sensitive to noise and outliers and degrades its performance. Third, the methods usually use data from a single view, ignoring the effect of other views. Moreover, the regularization term uses only a sparse regression constraint, which ignores the inherent structure of the data. To address these problems, this thesis improves the JELSR model with the aims of increasing the sparsity and robustness of the algorithm and selecting more effective pathogenic genes. The main contributions are as follows:
(1) A JELSR model based on a joint constraint (LJELSR) is proposed. The L1-norm and the L2,1-norm are applied simultaneously to the original model to form a joint constraint, which enhances the correlation between the rows and columns of the matrix and improves the sparsity of the algorithm. Based on the characteristics of the model, a new iterative algorithm is given to obtain a convergent solution. The experimental results show that LJELSR performs well compared with previous methods in identifying differentially expressed genes and in sample clustering, and the differentially expressed genes unique to it may be valuable in medical research.
(2) A JELSR model based on an Lp-norm (RJELSR) is proposed. The Lp-norm constraint replaces the original squared constraint, which reduces the sensitivity of the algorithm to noise and outliers and makes it more robust. An effective optimization strategy is given based on the augmented Lagrange multiplier method. Different cancer data are preprocessed to obtain integrated data, and the new method is then applied to the integrated data for feature selection and cluster analysis, which makes the selected characteristic genes more biologically significant.
(3) A multi-view joint sparse low-rank regression and embedding learning model (MJSLRE) is proposed. The multi-view model fully considers different types of cancer information. A sparse low-rank regression constraint is introduced into the objective function, which preserves the inherent structure of the data and improves both the learning efficiency of the subspace and the robustness of the algorithm. The experimental results show that MJSLRE uncovers more oncogenic genes with different medical reference values in different multi-view genomic data, and its clustering performance is better than that of the other methods compared.
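
The norms named in the abstract can be illustrated with a small NumPy sketch. This is not the thesis's implementation: the function names, the toy matrix `W`, and the residual `R` are hypothetical. The entrywise form of the L1-norm and the sum-of-powers form of the Lp loss are assumptions, since the abstract does not state the exact objective functions. The sketch only shows how an L2,1-norm groups sparsity by row (one row per gene), how row norms can rank features, and why an Lp loss with p below 2 weights an outlier less heavily than a squared loss.

```python
import numpy as np

def l1_norm(W):
    # Entrywise L1-norm: sum of absolute values (assumed form).
    return float(np.abs(W).sum())

def l21_norm(W):
    # L2,1-norm: sum of the Euclidean norms of the rows.
    return float(np.linalg.norm(W, axis=1).sum())

def lp_loss(R, p):
    # Sum of |r|^p over residual entries (assumed form of the Lp loss).
    return float((np.abs(R) ** p).sum())

def top_features(W, k):
    # Rank features (rows) by their row norm and keep the k largest.
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(-scores, kind="stable")[:k]

# Toy projection matrix: 3 features (rows), 2 embedding dimensions.
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])

# Toy residual with one outlier entry.
R = np.array([[1.0, -4.0]])

print(l1_norm(W), l21_norm(W))         # joint-constraint ingredients
print(top_features(W, 2))              # indices of the selected features
print(lp_loss(R, 2), lp_loss(R, 1))    # squared loss vs. p = 1
```

In this toy case the all-zero second row contributes nothing to either norm, so that feature is never selected, and the outlier entry contributes 16 to the squared loss but only 4 when p = 1.
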
Keywords/Search Tags:Cancer omics data, Feature selection, Clustering, Norm constraint, Multi-view model, Sparse low-rank regression