The Biological Data Classification Based On Neural Networks And Support Vector Machine

Posted on: 2018-01-23
Degree: Master
Type: Thesis
Country: China
Candidate: X Bai
GTID: 2310330536460966
Subject: Computational Mathematics
Abstract/Summary:
Over the past few years, artificial intelligence has played an irreplaceable role in the biological sciences, medicine, and related fields. Computational analysis has been applied most widely in sequence analysis, which is the focus of this thesis. Many important problems in this area remain unresolved. With the development of direct DNA sequencing, the number of known protein and DNA sequences is increasing exponentially. Protein structure prediction has in turn driven the application of machine learning methods to sequence analysis. The work on protein secondary structure prediction in this thesis was carried out against this background. The aim is to predict the structure of proteins whose structure is unknown. We designed an integrated network. It performs better on the coil (C) class, while its performance on the other structure classes is only moderate.

Besides the application of neural networks and SVM to protein secondary structure prediction, this thesis also presents early breast cancer screening based on these algorithms. Several evaluation indexes measured in healthy people and breast cancer patients are used to make diagnoses. Breast cancer (BC) is a malignant tumor arising in the epithelium of the mammary gland. A simple, accurate, and rapid screening method for breast cancer would be clinically valuable. A new model based on an ANN (artificial neural network) and an SVM (support vector machine) is proposed to distinguish BC from non-BC. In this method, PCA (principal component analysis) is first used to reduce the dimensionality of the data, and one of the two algorithms then classifies the reduced data to produce the prediction. The study included 258 newly diagnosed BC patients and 159 control subjects with benign mammary gland disease, including 78 healthy people. The targeted metabolomic analysis of blood spots covered 23 amino acids and 26 acylcarnitines.

Tested on a subset of BC and non-BC samples, the model reached its highest sensitivity of 97.1% and specificity of 93.9% with the NN, and its highest sensitivity of 93.5% and specificity of 93.8% with the SVM. The accuracies of the two algorithms are 91.5% and 93.6%, respectively. The characteristics of the two algorithms are summarized from the experimental results, and each has its merits. When the training set is equal to or slightly larger than the test set, the ANN performs better. When the training set is significantly larger than the test set, the SVM performs better. The sensitivity and specificity of the traditional protein markers in [42] are 92.2% and 84.4%, respectively. Compared with the protein marker method in the references, our model has its own advantages and higher accuracy for the prognosis and diagnosis of breast cancer.
Keywords/Search Tags: Protein structure, Breast cancer screening, Principal component analysis (PCA), Machine learning, Metabonomics
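
The abstract reports sensitivity, specificity, and accuracy for the NN and SVM classifiers. As a minimal sketch of how such figures are conventionally computed from true and predicted labels (this is not the thesis's own code; the function names `confusion_counts` and `screening_metrics` and the toy labels are assumptions made purely for illustration), with 1 denoting BC and 0 denoting non-BC:

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[int, int, int, int]:
    """Return (tp, fn, tn, fp) for binary labels: 1 = BC, 0 = non-BC."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # BC correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # BC missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-BC correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-BC wrongly flagged
    return tp, fn, tn, fp


def screening_metrics(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Dict[str, float]:
    """Compute sensitivity, specificity, and accuracy from binary labels."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    total = tp + fn + tn + fp
    return {
        # Sensitivity: fraction of true BC cases the classifier detects.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: fraction of true non-BC cases the classifier clears.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # Accuracy: fraction of all samples classified correctly.
        "accuracy": (tp + tn) / total if total else float("nan"),
    }


if __name__ == "__main__":
    # Toy labels invented for illustration only; not data from the thesis.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
    for name, value in screening_metrics(y_true, y_pred).items():
        print(f"{name}: {value:.3f}")
```

Because sensitivity and specificity are computed on the BC and non-BC groups separately, they do not depend on the ratio of the two groups in the test set, whereas accuracy does. This is one reason a screening study would report all three.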