Feature Extraction Method For Gene Expression Profiles Mining

Posted on:2016-08-18

Degree:Master

Type:Thesis

Country:China

Candidate:T L Yao

GTID:2284330461992495

Subject:Signal and Information Processing

Abstract/Summary:

With the rapid development of new molecular biology techniques and DNA microarray technology, the expression levels of thousands of genes can be measured quantitatively from biological samples, and the resulting gene expression data can reveal implicit and previously unknown biological knowledge. In recent years, researchers have used statistical and pattern recognition techniques to analyze microarray gene expression data and identify pathogenic tumor genes, enabling correct diagnosis and classification of tumor types. However, tumor gene expression data are high-dimensional with small sample sizes. Before analyzing such data, traditional processing methods therefore generally project the high-dimensional data into a low-dimensional subspace, which not only maintains the accuracy of classification and recognition but also improves the performance and computational efficiency of the learning method. By combining knowledge from bioinformatics and pattern recognition, this thesis extracts dominant feature subsets from high-dimensional, small-sample tumor data and analyzes the effectiveness of the corresponding experimental results. The main contributions are summarized as follows:

1. A feature selection algorithm based on the property of submodularity is proposed. First, to account for the correlation among genes in tumor gene expression data, the individual gene attributes are converted into an adjacency graph carrying structural information. Second, a submodular feature selection objective function is constructed from the resulting adjacency matrix, and a greedy algorithm is used to extract the feature subset. Finally, KNN and SVM classifiers are used to classify the test samples using the selected feature subset, and the experimental results illustrate the effectiveness of the method.

2. To address the high dimensionality and small sample size of tumor gene expression data, a feature selection method based on locality preserving projections (LPP) is applied. The method first uses principal component analysis (PCA) to remove noise and reduce the dimension of the original data, retaining 99% of the principal components of the processed data to characterize the original data. LPP is then used to reduce the dimension further while preserving local information, and finally KNN and SVM classifiers are used to classify the tumor data. To demonstrate the effectiveness of the method, experiments are conducted on three real data sets and the results are analyzed.
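
The abstract does not state the submodular objective or how the adjacency graph is built, so the following is only a minimal sketch of the submodularity-based selection step under stated assumptions: gene similarity is taken to be the absolute Pearson correlation, and the objective is the facility-location function, a standard monotone submodular function. The names gene_similarity, facility_location, and greedy_select are hypothetical, and the data are synthetic.

```python
import numpy as np


def gene_similarity(X):
    """Absolute Pearson correlation between genes (columns of X).

    X has shape (n_samples, n_genes). This similarity is an assumption;
    the abstract does not say how the adjacency matrix is built.
    """
    W = np.abs(np.corrcoef(X, rowvar=False))
    return np.nan_to_num(W)  # constant genes would give NaN correlations


def facility_location(W, selected):
    """F(S) = sum_i max_{j in S} W[i, j]; monotone submodular for W >= 0."""
    if len(selected) == 0:
        return 0.0
    return float(W[:, selected].max(axis=1).sum())


def greedy_select(W, k):
    """Pick k genes, each time adding the one with the largest marginal gain."""
    n = W.shape[0]
    selected = []
    coverage = np.zeros(n)  # max similarity of each gene to the chosen set
    for _ in range(k):
        # Marginal gain of gene j: sum_i max(W[i, j] - coverage[i], 0)
        gains = np.maximum(W - coverage[:, None], 0.0).sum(axis=0)
        if selected:
            gains[selected] = -np.inf  # never pick the same gene twice
        best = int(np.argmax(gains))
        selected.append(best)
        coverage = np.maximum(coverage, W[:, best])
    return selected


# Toy usage on synthetic data: 40 samples, 200 genes.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 200))
W = gene_similarity(X)
selected = greedy_select(W, k=10)
X_reduced = X[:, selected]  # input for a downstream KNN or SVM classifier
```

For a monotone submodular objective, greedy selection of this kind is guaranteed to reach at least a (1 - 1/e) fraction of the best value attainable with the same subset size, which is the usual reason for pairing submodularity with a greedy algorithm.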

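The PCA, LPP, and classification steps of the second contribution can be chained as in the minimal NumPy sketch below. Several choices here are assumptions that the abstract does not confirm: the 99% figure is read as a cumulative explained-variance threshold, the LPP graph is a symmetric k-nearest-neighbour graph with heat-kernel weights, and the data are synthetic. The function names pca_fit, lpp_fit, and knn_predict and all parameter values are hypothetical.

```python
import numpy as np


def pca_fit(X, var_ratio=0.99):
    """Return (mean, components) keeping `var_ratio` of the total variance."""
    mean = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    r = min(int(np.searchsorted(explained, var_ratio)) + 1, len(s))
    return mean, Vt[:r].T  # shapes: (n_genes,), (n_genes, r)


def lpp_fit(Z, n_components=2, n_neighbors=5, reg=1e-6):
    """Return an (r, n_components) LPP projection for the rows of Z."""
    n, r = Z.shape
    sq = np.sum(Z**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    t = d2.mean()  # heat-kernel width; a heuristic, not from the thesis
    # Symmetric k-nearest-neighbour graph with heat-kernel weights.
    masked = d2.copy()
    np.fill_diagonal(masked, np.inf)  # a sample is not its own neighbour
    k = min(n_neighbors, n - 1)
    nbrs = np.argsort(masked, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    W = np.zeros((n, n))
    W[rows, nbrs.ravel()] = np.exp(-d2[rows, nbrs.ravel()] / t)
    W = np.maximum(W, W.T)
    D = np.diag(W.sum(axis=1))
    A = Z.T @ (D - W) @ Z  # Z^T L Z with graph Laplacian L = D - W
    B = Z.T @ D @ Z
    B += reg * np.trace(B) / r * np.eye(r)  # small ridge for stability
    # Solve A a = lambda B a by reducing to a standard symmetric problem.
    C_inv = np.linalg.inv(np.linalg.cholesky(B))
    M = C_inv @ A @ C_inv.T
    _, U = np.linalg.eigh((M + M.T) / 2.0)  # eigenvalues in ascending order
    return C_inv.T @ U[:, :n_components]  # directions with smallest eigenvalues


def knn_predict(train, labels, test, k=3):
    """Majority vote among the k nearest training samples (integer labels)."""
    d2 = ((test[:, None, :] - train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return np.array([np.bincount(v).argmax() for v in labels[nearest]])


# Synthetic stand-in for a tumor data set: 40 samples, 500 genes, 2 classes.
rng = np.random.default_rng(0)
n_per_class, n_genes = 20, 500
X = rng.normal(size=(2 * n_per_class, n_genes))
y = np.repeat([0, 1], n_per_class)
X[y == 1, :20] += 3.0  # 20 hypothetical informative genes

train = np.arange(len(y)) % 2 == 0  # alternate samples into train and test
mean, P = pca_fit(X[train])
A = lpp_fit((X[train] - mean) @ P, n_components=2)


def embed(data):
    """Apply the PCA projection followed by the LPP projection."""
    return (data - mean) @ P @ A


pred = knn_predict(embed(X[train]), y[train], embed(X[~train]))
accuracy = float(np.mean(pred == y[~train]))
print(f"kept {P.shape[1]} principal components, test accuracy {accuracy:.2f}")
```

Both projections are fitted on the training samples only and then applied to the test samples, so no test information leaks into the embedding. An SVM could replace the nearest-neighbour vote in the last step.
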
Keywords/Search Tags:

Gene expression profiling, Submodularity, Feature extraction, Locality preserving projections

Related items

1	Research On Feature Extraction And Classification In Gene Expression Profiling
2	The Discrimination And Subtype Identification Of Colorectal Cancer Based On Gene Expression Profiling
3	Gene Feature Extraction Based On Multi-regularized Constraints And Low-rank Matrix Factor
4	Research Of The Method For Extracting The Tumor Gene Expression Profiles' Informative Gene
5	Gene-expression profiling of lymphoid malignancies: Identification of deregulated pathways and response prediction to therapy
6	Study The Effect On Expression Profiling Of Hepatocellular Carcinoma (HCC) By Trichostatin A, A HDAC Inhibitor
7	1. A Study On Expression Signature Prognostic For Survival Of Lung Adenocarcinoma From mRNA Profiling In Human Lung Development 2. Power Of Deep Sequencing And Agilent Microarray For Gene Expression Profiling Study
8	The Research Of The Theory Of Matrix Decomposition In Gene Expression Profiling
9	Research On Feature Extraction And Selection Algorithm Of Emotional EEG
10	Investigating HIV-1 DNA Vaccine Adjuvant And Gene Expression Profiling Of Two Viral Vectors