Protein-Protein Interaction Prediction Method Based on Text Mining Techniques

Posted on: 2010-07-24
Degree: Master
Type: Thesis
Country: China
Candidate: J Huang
GTID: 2120360278470447
Subject: Computer application technology
Abstract/Summary:
In the post-genomic era, the rapid development of high-throughput biotechnology has greatly changed the tools and experimental methods of biomedical research, and the volume of biomedical literature has grown exponentially. Applying text mining to extract protein-related information from this large body of literature, and using it to build protein-protein interaction networks, has become a hot topic in bioinformatics and proteomics.

This thesis first introduces text mining technology and its applications in general, and then its applications in the biomedical field. After reviewing the basic principles of the support vector machine (SVM) algorithm, it proposes an SVM-based algorithm for predicting protein-protein interactions. The SVM algorithm is first applied to extract protein names from the literature, and context clues are introduced to improve the extraction algorithm. Experiments show that with context clues all three evaluation measures of the extraction improve markedly. Interaction-word features, word features, entity features, and link-grammar distance features are then selected to form a feature vector, and the SVM algorithm is applied to predict protein-protein interactions. At the cost of a small loss in recall, precision improves considerably, and the overall classification performance improves as a result.

Building on the proposed algorithm, a distributed computing system is used to handle the large-scale data processing that bioinformatics requires. The system breaks a job into several sub-tasks and assigns them to client computers on the network to complete, which to some extent improves the efficiency of large-scale bioinformatics computation.
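
The feature-vector step mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. The interaction-word lexicon, the function names, the toy sentence, and the placeholder protein names are all assumptions made for illustration, and plain token distance stands in for the link-grammar distance, which would require a link-grammar parser.

```python
from itertools import combinations

# Hypothetical lexicon of interaction words; the thesis does not list its own.
INTERACTION_WORDS = {"binds", "interacts", "activates", "inhibits"}


def pair_features(tokens, i, j, lexicon=INTERACTION_WORDS):
    """Build a feature dict for the protein mentions at token positions i and j."""
    lo, hi = sorted((i, j))
    # Words strictly between the two mentions, lower-cased for lexicon lookup.
    between = [t.lower() for t in tokens[lo + 1:hi]]
    hits = [t for t in between if t in lexicon]
    return {
        "has_interaction_word": int(bool(hits)),  # interaction-word feature
        "interaction_word": hits[0] if hits else "",
        "words_between": tuple(between),          # word features
        # Assumption: token distance as a crude proxy for link-grammar distance.
        "token_distance": hi - lo,
    }


def candidate_pairs(tokens, protein_positions):
    """Yield (i, j, features) for every unordered pair of protein mentions."""
    for i, j in combinations(sorted(protein_positions), 2):
        yield i, j, pair_features(tokens, i, j)


# Toy sentence with placeholder protein names (not real data).
tokens = "ProtA interacts with ProtB and not with ProtC".split()
proteins = [0, 3, 7]  # token positions of the protein mentions
rows = list(candidate_pairs(tokens, proteins))

for i, j, feats in rows:
    print(tokens[i], tokens[j], feats)
```

Each resulting dict would still need to be encoded numerically (for example, one-hot encoding the word features) before being passed to an SVM classifier. That encoding and the training step are left out here.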
Keywords/Search Tags: Text mining, protein interaction, support vector machines, distributed systems