
Research And Implementation Of Bio-Text Events Extraction Based On BIONLP'09

Posted on: 2016-04-26
Degree: Master
Type: Thesis
Country: China
Candidate: Z M Fu
GTID: 2348330485451937
Subject: Software engineering
Abstract/Summary:
With the rapid development of modern science, the volume of biological text is growing exponentially, and obtaining relevant biological knowledge from this mass of text has become an active research problem. The information in the literature lacks strict organization and management; if it can be converted into an organized, structured collection, access to it becomes much easier. Information extraction technology emerged in this context. Information extraction and mining have improved significantly in recent years, and event extraction is among the more advanced of these technologies.
Event extraction is a form of information extraction, and this thesis studies event extraction from biological text. Based on the BIONLP'09 shared task and targeting unstructured biological text, it describes how existing tools and methods are used to analyze and process the text, identify protein entities, and identify clue words (also called trigger words). Methods and rules are then applied to identify the relationships between them and to determine the relationship types. Obtaining these related pairs is what is referred to here as event extraction. The experimental pipeline can be roughly divided into four modules: pre-processing, syntactic parsing, rule analysis and relation extraction, and results processing. This study focuses primarily on the last two. The pre-processing module unifies the text format, replaces all protein names, and performs an initial replacement of the clue words. The rule analysis and relation extraction module mainly includes the push-on process, replacement of all the clue words, and rule-assisted parsing.
The results processing module mainly includes duplicate removal, reordering, decomposition, and back-replacement. The experimental results indicate that, with the processing methods and procedures in this study, the recall, precision, and F-score of event extraction all reach roughly 50%, which the author considers a good event extraction result.
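The replacement and back-replacement steps, together with the evaluation measures named above, can be illustrated with a minimal Python sketch. It is not the thesis implementation: the placeholder scheme (`PROT1`, `PROT2`, ...), the event tuple layout `(type, trigger, theme)`, and the function names `mask_proteins`, `postprocess`, and `f_score` are all assumptions made for illustration. The decomposition step is omitted because the abstract does not describe it.

```python
import re


def mask_proteins(text, proteins):
    """Replace each protein mention with a placeholder (hypothetical scheme)."""
    mapping = {}
    # Longest names first, so a short name does not split a longer one.
    ordered = sorted(proteins, key=len, reverse=True)
    for i, name in enumerate(ordered, start=1):
        placeholder = f"PROT{i}"
        mapping[placeholder] = name
        text = re.sub(re.escape(name), placeholder, text)
    return text, mapping


def postprocess(events, mapping):
    """Remove duplicates, reorder, and restore the original protein names."""
    unique = list(dict.fromkeys(events))  # duplicate removal, order kept
    unique.sort()                         # reordering (here: plain tuple order)
    # Back-replacement: map each placeholder theme to its protein name.
    return [(etype, trigger, mapping.get(theme, theme))
            for etype, trigger, theme in unique]


def f_score(tp, fp, fn):
    """Precision, recall, and balanced F-score from match counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    sentence = "IL-2 induces phosphorylation of STAT5."
    masked, mapping = mask_proteins(sentence, ["IL-2", "STAT5"])
    print(masked)
    # Two identical candidate events, as a rule-based extractor might emit.
    raw = [("Phosphorylation", "phosphorylation", "PROT1")] * 2
    print(postprocess(raw, mapping))
    print(f_score(tp=5, fp=5, fn=5))
```

Masking proteins before parsing keeps multi-token or unusually punctuated names from disturbing the parser, and the stored mapping is what makes the later back-replacement possible.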
Keywords/Search Tags: BIONLP'09 shared task, bio-text, Protein, Trigger words, Relation, Events extraction