Research On Visual Dimensionality Reduction And Semi-supervised Classification Methods Of Key Proteins In Down Syndrome Mice

Posted on:2022-06-09

Degree:Master

Type:Thesis

Country:China

Candidate:X Chen

GTID:2544306323970959

Subject:Electronics and Communications Engineering

Abstract/Summary:

Down syndrome (DS) is a relatively common chromosomal disorder caused by an extra copy of human chromosome 21. The extra copy disrupts normal protein expression and impairs functions such as learning and memory in DS patients. DS has a high incidence among newborns, and no effective drug treatment is currently available. Studying DS-related protein expression is therefore important for identifying effective drug targets and for guiding the direction of drug treatment. This thesis works with a public mouse protein expression data set. Its main contents are as follows:

(1) Data preprocessing and key protein extraction. Missing values in the mouse protein data are filled in, and the data range is normalized with Min-Max scaling. The Mann-Whitney U test is used for pairwise comparisons of groups within the normal mice, within the trisomic mice, and between normal and trisomic mice, which yields the key proteins whose expression levels differ significantly under different stimulation conditions. The significance level is adjusted with the Bonferroni correction to control false positives arising from multiple comparisons.

(2) ET-tSNE visual dimensionality reduction. By combining extremely randomized trees (ET) with t-SNE, the thesis proposes the ET-tSNE algorithm for visualizing high-dimensional protein data. Because the distribution structure and internal relationships of high-dimensional data are hard to understand directly, the thesis uses dimensionality reduction to visualize the protein data and make it more interpretable. Compared with other dimensionality reduction methods, ET-tSNE achieves better visualization results. The thesis further explores the biological significance of the protein data in two-dimensional space, which also serves to verify the extracted key proteins.

(3) Semi-supervised classification of key proteins. The thesis proposes a semi-supervised stacking ensemble learning classification (SSSELC) algorithm for classifying the key proteins. Semi-supervised learning is better suited to scenarios where labels are scarce, such as computer-aided diagnosis and drug discovery. Ensemble learning is introduced to further improve classification and is complementary to the semi-supervised approach. The SSSELC algorithm therefore combines semi-supervised learning with a Stacking ensemble model. Compared with other methods, SSSELC significantly improves classification performance, and it also achieves better experimental results on multi-class data.
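The preprocessing and screening steps in item (1) can be illustrated with a minimal, standard-library-only sketch. It is not the thesis's implementation: the protein names, the expression values, and the helper functions `min_max_scale`, `mann_whitney_u`, and `bonferroni_select` are all hypothetical, and the p-value uses the normal approximation with tie correction rather than an exact test.

```python
import math
from collections import Counter


def min_max_scale(values):
    """Rescale numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic of sample x, approximate p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    counts = Counter(list(x) + list(y))
    # Assign each distinct value the average rank of its tie group.
    rank_of, start = {}, 1
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = start + (t - 1) / 2
        start += t
    u1 = sum(rank_of[v] for v in x) - n1 * (n1 + 1) / 2
    tie_term = sum(t**3 - t for t in counts.values()) / (n * (n - 1))
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    if sigma == 0:  # every observation is identical
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / sigma
    return u1, math.erfc(abs(z) / math.sqrt(2))


def bonferroni_select(p_values, alpha=0.05):
    """Keep names whose p-value is below alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    return [name for name, p in p_values.items() if p < threshold]


# Hypothetical expression levels for two made-up proteins (not real data).
control = {
    "protein_A": [0.21, 0.25, 0.22, 0.27, 0.24, 0.23, 0.26, 0.20],
    "protein_B": [0.50, 0.52, 0.48, 0.55, 0.51, 0.49, 0.53, 0.47],
}
trisomy = {
    "protein_A": [0.41, 0.45, 0.39, 0.47, 0.44, 0.43, 0.46, 0.40],
    "protein_B": [0.51, 0.50, 0.49, 0.54, 0.52, 0.48, 0.53, 0.46],
}

p_values = {}
for name in control:
    # Scale each protein over all samples, then split back into groups.
    scaled = min_max_scale(control[name] + trisomy[name])
    k = len(control[name])
    _, p_values[name] = mann_whitney_u(scaled[:k], scaled[k:])

selected = bonferroni_select(p_values, alpha=0.05)
print(selected)
```

Because the Mann-Whitney U test depends only on ranks, Min-Max scaling does not change its outcome; the scaling matters for the later dimensionality reduction and classification stages. In practice a library routine such as `scipy.stats.mannwhitneyu` would likely be preferable to the hand-written approximation used here.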

Keywords/Search Tags:

Mouse Protein, Visual Dimensionality Reduction, Semi-supervised Learning, Ensemble Learning

Related items

1	Research On Assistant Diagnosis Of Breast Cancer Combining Cost-Sensitive And Semi-supervised Learning
2	Automatic Detection Of Epileptic EEG Signals Based On Semi-supervised Learning
3	Epilepsy EEG Diagnosis Classification Based On Ensemble Learning Algorithm
4	Automatic Segmentation Of Vessels In Retinal OCTA Images Based On Semi-Supervised Learning
5	The Research On Drug Target Protein Prediction Algorithm Based On Semi-supervised Learning
6	Study On Automatic Analysis Methods For Digital Pathology Images Based On Semi-Supervised Deep Learning
7	Semi-supervised Extreme Learning Machine Algorithm For EEG Classification
8	WBC Analysis Under Incomplete Supervised Learning
9	Research On Arrhythmia Classification Algorithm Based On Integrated Deep Learning
10	Automatic Classification And Recognition Of Peripheral Blood Leukocytes Based On Semi-supervised Learning