
The Research Of Biomedical Event Extraction Based On Semi-Supervised

Posted on: 2014-01-26
Degree: Master
Type: Thesis
Country: China
Candidate: Q Xu
Full Text: PDF
GTID: 2248330398450479
Subject: Computer application technology
Abstract/Summary:
With the growth of the biomedical literature, automatic extraction of information from text has become an urgent need for medical experts, and obtaining structured information from vast amounts of unstructured text has become an active area of research. Information extraction technologies such as named entity recognition and event extraction have developed greatly in recent years. Biomedical events occur at the molecular level, and event extraction can reveal changes in proteins and the relationships between them.
This thesis studies event extraction from the biomedical literature. We follow the classic pipeline of preprocessing, trigger detection, event argument detection, and post-processing, and we focus on trigger detection and event argument detection. For trigger detection, we borrow ideas from named entity recognition. We first build a dictionary of candidate words, then draw a variety of effective features from labeled and unlabeled data to train a model, and use this model to decide whether each candidate word is a trigger. For event argument detection, we handle simple arguments and nested arguments separately. Simple arguments are detected from protein-trigger candidate pairs, and nested arguments are detected from trigger-trigger candidate pairs.
Because the labeled corpus is limited, data sparseness is a serious problem. We apply a semi-supervised learning method to the biomedical event element detection model to mitigate it. The method exploits features that are sparse in the labeled data but have strong discriminative power: using their co-occurrence with certain special features in the unlabeled corpus, we compute weights for the new features. On the BioNLP 2011 corpus, we draw effective features from the training data and from unlabeled data such as PubMed to build an event extraction model. This semi-supervised method obtains good extraction results, with a significant improvement in the extraction of simple events.
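The candidate-generation steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the sentence, the trigger dictionary, the protein positions, and all function names are hypothetical, and the trained classifier that would accept or reject each candidate is left out. It only shows how a dictionary lookup yields trigger candidates, and how protein-trigger and trigger-trigger pairs are enumerated for simple and nested arguments.

```python
from itertools import permutations


def find_trigger_candidates(tokens, trigger_dictionary):
    """Return indices of tokens whose lowercase form is in the dictionary."""
    return [i for i, tok in enumerate(tokens) if tok.lower() in trigger_dictionary]


def simple_argument_pairs(trigger_idxs, protein_idxs):
    """Enumerate (trigger, protein) candidate pairs for simple arguments."""
    return [(t, p) for t in trigger_idxs for p in protein_idxs]


def nested_argument_pairs(trigger_idxs):
    """Enumerate ordered (parent trigger, child trigger) pairs for nested arguments."""
    return list(permutations(trigger_idxs, 2))


# Hypothetical example sentence and annotations (assumptions, not thesis data).
tokens = ["IL-4", "inhibits", "expression", "of", "STAT6"]
trigger_dictionary = {"inhibits", "expression"}
protein_idxs = [0, 4]  # positions of annotated proteins

triggers = find_trigger_candidates(tokens, trigger_dictionary)
simple_pairs = simple_argument_pairs(triggers, protein_idxs)
nested_pairs = nested_argument_pairs(triggers)

# In a full system, each pair would be passed to a trained classifier
# that decides whether it is a real event argument.
print(triggers)
print(simple_pairs)
print(nested_pairs)
```

Enumerating every pair keeps recall high at the candidate stage, which leaves the precision decision to the classifier and its features.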
Keywords/Search Tags: Biomedical, Event Extraction, Semi-supervised, Feature