Research On Neuropeptides Extraction Based On Text Mining

Posted on: 2016-11-21  Degree: Master  Type: Thesis
Country: China  Candidate: R M Di  Full Text: PDF
GTID: 2348330479454327  Subject: Software engineering
Abstract/Summary:
Neuropeptides are signaling molecules that neurons use to communicate with each other. They are involved in a wide range of brain functions, including analgesia, reward, food intake, metabolism, reproduction, social behaviors, learning, and memory. In this work, we propose a hybrid approach, based on text mining, for extracting neuropeptide name-sequence pairs. The hybrid approach combines shallow parsing from natural language processing, pattern searching, and machine learning. First, we recognize neuropeptide sequences in the text using rules and dictionaries. Then the Stanford Parser is used to parse each sentence and obtain its part-of-speech (POS) tags, phrase structures, and word dependencies. The POS tags and word dependencies are used to recognize nouns that stand in particular relations to a sequence, and the phrase structures are used to return the enclosing noun phrase, which is the candidate neuropeptide name. Apart from shallow parsing, we also use patterns to recognize the neuropeptide name, since name-sequence pairs sometimes appear in particular textual forms. Finally, based on the results of shallow parsing, we select 12 features describing the relation between the sequence and every noun in the sentence, and a KStar classifier is applied to these candidate pairs to filter out the likely neuropeptide name-sequence pairs. We evaluate the approaches on a manually checked data set. For the task of recognizing peptide sequences, the F-score reaches 99.6%. For the individual approaches based on shallow parsing, pattern searching, and machine learning, the F-scores are 55.5%, 55.9%, and 53.8%, respectively. For the hybrid approach, the F-score reaches 61.0%. The results imply that the method is capable of recognizing neuropeptide names and sequences in biomedical text.
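The abstract does not give the actual rules, dictionaries, or patterns, so the following is only a minimal sketch of what the rule-based sequence recognition and the pattern-searching step might look like. Everything in it is an assumption made for illustration: the minimum length of five residues, the small exclusion dictionary `NOT_SEQUENCES`, the single "NAME (SEQUENCE)" pattern, and the function names `find_sequences` and `find_pairs`. It does not cover the parsing or classification steps.

```python
import re

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical rule: a sequence is a run of at least 5 one-letter codes.
SEQ_RE = re.compile(r"\b[%s]{5,}\b" % AMINO_ACIDS)

# Hypothetical exclusion dictionary: capitalized English words that
# happen to consist only of amino-acid letters.
NOT_SEQUENCES = {"WHICH", "THESE", "WHILE"}

# Hypothetical pattern: a single-token name directly followed by the
# sequence in parentheses, e.g. "Met-enkephalin (YGGFM)".
PAIR_RE = re.compile(r"([A-Za-z][\w-]*)\s*\(\s*([%s]{5,})\s*\)" % AMINO_ACIDS)


def find_sequences(text):
    """Return candidate peptide sequences found by the length rule."""
    return [m.group(0) for m in SEQ_RE.finditer(text)
            if m.group(0) not in NOT_SEQUENCES]


def find_pairs(text):
    """Return (name, sequence) pairs that match the parenthesis pattern."""
    return [(m.group(1), m.group(2)) for m in PAIR_RE.finditer(text)
            if m.group(2) not in NOT_SEQUENCES]


if __name__ == "__main__":
    sentence = ("Met-enkephalin (YGGFM) and Leu-enkephalin (YGGFL) are "
                "opioid peptides, WHICH bind to opioid receptors.")
    print(find_sequences(sentence))
    print(find_pairs(sentence))
```

A pattern this narrow only catches pairs written in one fixed form, which is consistent with the abstract's motivation for combining pattern searching with parsing and a classifier rather than relying on patterns alone.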
Keywords/Search Tags: text mining, neuropeptide, shallow parsing, pattern searching