Research On Dimensionality Reduction In Medical Big Data

Posted on:2021-03-03

Degree:Master

Type:Thesis

Country:China

Candidate:A N Yu

Full Text:PDF

GTID:2504306512987599

Subject:Computer technology

Abstract/Summary:


With the arrival of the big data era, enormous volumes of data are being generated at an unprecedented rate. Such data have not only a huge sample size but also a very large number of features. The medical field is no exception: examples include microarray data, which contain thousands of genetic probes, and high-resolution medical images such as X-rays and MRI scans. High-dimensional data of this kind inevitably contain redundant features, which pose severe challenges for classification and clustering algorithms. This dissertation therefore focuses on dimensionality reduction in medical big data and proposes three new methods for feature selection and extraction, each aimed at a different classification or clustering task:

(1) A supervised feature selection method based on global mutual information. Previous methods maximize the mutual information between features and labels and search in a heuristic, greedy manner, so the result is easily affected by the features selected earlier. We model relevance maximization as a quadratic programming problem and account for the redundancy between features at the same time, in order to find a globally optimal feature subset.

(2) A supervised dimensionality reduction method based on the l2,1-norm. Linear discriminant analysis can effectively reveal global intra-class and inter-class discriminant information, and the Laplacian matrix can reflect the local "smoothness" of the data samples. We combine these two concepts in a single framework and use an l2,1-norm to enforce row sparsity for feature selection. The objective is to find a low-dimensional linear transformation that best extracts the global discriminative information while preserving the local geometric structure.

(3) An adaptive unsupervised feature selection method. Instead of constructing a similarity matrix from K nearest neighbors and RBF kernel functions as in previous work, we adaptively assign neighbors to each data point according to local distances. Feature selection is then embedded into the clustering process to ensure that the manifold structure of the data is well preserved. Sparse learning is used to ensure the efficiency of the feature selection algorithm.
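
As a small illustration of the row sparsity mentioned in method (2), the sketch below computes the l2,1-norm of a projection matrix (the sum of the Euclidean norms of its rows) and ranks features by row norm. It is only a minimal sketch of the norm itself, not the dissertation's optimization procedure, which the abstract does not spell out. The matrix W, its values, and the function names are hypothetical and chosen for illustration.

```python
import math


def row_norms(W):
    """Euclidean (l2) norm of each row of W; one row per original feature."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]


def l21_norm(W):
    """l2,1-norm: the sum of the l2 norms of the rows of W."""
    return sum(row_norms(W))


def rank_features_by_row_norm(W):
    """Feature indices sorted by decreasing row norm (largest first)."""
    norms = row_norms(W)
    return sorted(range(len(W)), key=lambda i: norms[i], reverse=True)


# Hypothetical projection matrix: 4 features mapped to 2 dimensions.
# The values are illustrative only and do not come from the dissertation.
W = [
    [3.0, 4.0],  # feature 0: row norm 5
    [0.0, 0.0],  # feature 1: all-zero row, so this feature is unused
    [1.0, 0.0],  # feature 2: row norm 1
    [0.0, 2.0],  # feature 3: row norm 2
]

print(l21_norm(W))                   # 8.0
print(rank_features_by_row_norm(W))  # [0, 3, 2, 1]
```

Penalizing this norm pushes whole rows of W toward zero, so a feature whose row vanishes (feature 1 above) contributes to none of the projected dimensions and can be dropped.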

Keywords/Search Tags:

Feature Selection and Extraction, Information Theory, Linear Discriminant Analysis (LDA), Spectral Graph


Related items

1	Research On Uncertain Graph Mining And Feature Selection For Brain Network
2	Prediction Of Linear B Cell Epitopes Based On Feature Selection
3	Fundamental Theory And Application Study On Large For Gestational Age Infants Using Machine Learning Techniques
4	Research On EEG Feature Selection And Feature Extraction Algorithm Based On Motor Imagery
5	Research On Feature Extraction And Selection Algorithm Of Emotional EEG
6	Simulation Of Spatiotemporal Pattern Embedded EEG And Its Analysis
7	Research On Feature Extraction And Classification Of Neurons Based On Multivariate Statistical Analysis
8	The Methods Of Detecting Biomarkers Of Hepatocellular Carcinoma Based On Feature Selection
9	Research On Feature Selection Of Tumor Genes
10	EEG-based Emotion Recognition Using Spectral Graph Convolutional Network