The Biological Data Classification Based On Neural Networks And Support Vector Machine

Posted on:2018-01-23

Degree:Master

Type:Thesis

Country:China

Candidate:X Bai

GTID:2310330536460966

Subject:Computational Mathematics

Abstract/Summary:

Over the past few years, artificial intelligence has come to play an irreplaceable role in the biological sciences, medicine, and related fields. Computational analysis techniques are most widely used in sequence analysis, which is the focus of this thesis, and many important issues in this direction remain to be addressed. With the development of direct DNA sequencing, the number of known protein and DNA sequences is increasing exponentially. Protein structure prediction has promoted the application of machine learning methods to sequence analysis, and the research on protein secondary structure prediction in this thesis was conducted against this background. The aim is to predict the structure of proteins whose structure is unknown. We designed an integrated network that performs better in predicting the coil (C) structure, while its performance on the other structures is only moderate.

Besides introducing the application of neural networks and SVM to protein secondary structure prediction, this thesis also presents early breast cancer screening based on these algorithms. Several evaluation indexes of healthy people and breast cancer patients are used to make diagnoses. Breast cancer (BC) is a malignant tumor arising in the epithelium of the mammary gland. Establishing a simple, accurate, and rapid screening method for breast cancer is clinically meaningful. A new model based on an artificial neural network (ANN) and a support vector machine (SVM) is proposed to distinguish BC from non-BC. In this method, principal component analysis (PCA) is first used to reduce the data, and one of the two intelligent algorithms is then used to classify the reduced data, which yields the final prediction. The study included 258 newly diagnosed BC patients and 159 benign mammary gland disease control patients, the latter including 78 healthy people. The targeted metabolomics analytes measured in blood spots comprised 23 amino acids and 26 acylcarnitines.

Tested on a subset of the BC and non-BC samples, the model reached its highest sensitivity of 97.1% and specificity of 93.9% with the NN, and its highest sensitivity of 93.5% and specificity of 93.8% with the SVM. The accuracies of the two algorithms are 91.5% and 93.6%, respectively. The characteristics of the two algorithms are summarized from the experimental results, and each has its merits: when the number of samples in the training set is equal to or slightly larger than that in the testing set, the ANN performs better; when the training set is significantly larger than the testing set, the SVM performs better. The sensitivity and specificity of the traditional protein markers in [42] are 92.2% and 84.4%, respectively. Compared with the protein marker method in the references, our model has its own advantages and higher accuracy for the prognosis and diagnosis of breast cancer.
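
The sensitivity, specificity, and accuracy figures quoted above follow the standard confusion-matrix definitions. As a minimal sketch (not the thesis's own code), the following Python function computes them from true and predicted labels. The function name `screening_metrics`, the label encoding (1 = BC, 0 = non-BC), and the toy label lists are assumptions made purely for illustration; they are not data from the study.

```python
def screening_metrics(y_true, y_pred, positive=1):
    """Compute sensitivity, specificity and accuracy for a binary screen.

    Hypothetical helper: labels equal to `positive` are treated as BC,
    everything else as non-BC.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # BC sample correctly flagged
            else:
                fn += 1  # BC sample missed
        else:
            if pred == positive:
                fp += 1  # non-BC sample wrongly flagged
            else:
                tn += 1  # non-BC sample correctly cleared

    total = tp + fn + tn + fp
    return {
        # Fraction of true BC samples that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Fraction of true non-BC samples that were cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # Fraction of all samples classified correctly.
        "accuracy": (tp + tn) / total if total else float("nan"),
    }


# Toy labels, invented for illustration only (1 = BC, 0 = non-BC).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = screening_metrics(y_true, y_pred)
print(metrics)
```

In a workflow like the one described, `y_pred` would come from the ANN or SVM applied to the PCA-reduced test samples, so the same function could be used to compare the two classifiers on identical splits.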

Keywords/Search Tags:

Protein structure, Breast cancer screening, Principal component analysis (PCA), Machine learning, Metabonomics

Related items

1	Research On Principal Component Analysis Method And Its Application In Cancer Omics Data
2	Expression Of STAT1 In Breast Cancer,Bioinformatics Analysis And Experimental Study Of Biological Function
3	Screening Of LncRNA In Plasma Exosome Of Breast Cancer And Bioinformatics Analysis Of Candidate Molecular Markers
4	Screening And Functional Analysis Of CD40LG As A Prognostic Marker In Breast Cancer Based On Bioinformatics
5	Analysis And Functional Study Of Immune-Related Prognostic Factors Based On Public Database In Breast Cancer
6	Development Of A Clear Sky Hyperspectral Fast Radiative Transfer Algorithm Based On Principal Component Analysis And Machine Learnin
7	Screening The Differentially Expressed Gene UBE2C In Breast Cancer Based On Bioinformatics And Its Functional Verification
8	Screening And Bioinformatics Analysis Of Genes In Triple Negative Breast Cancer
9	Protein Secondary Structure Prediction Based On SVM
10	Classification And Analysis Of Metabonomics Data Based On Machine Learning