
The Research Of Fusion Methods For Hedge Detection

Posted on: 2013-01-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H W Zhou
Full Text: PDF
GTID: 1118330371496682
Subject: Computer software and theory
Abstract/Summary:
Hedge detection, which distinguishes factual from uncertain information in biological texts, is extremely important for biomedical information extraction because it prevents speculative statements from being extracted as facts. With the large-scale annotated BioScope corpus now available, research on hedge scope detection has advanced. However, hedge scope detection performance is still below 60%, leaving a considerable gap between academic research and practical applications. Hedge scope detection is complicated because it is a form of sentence-level semantic analysis that exploits syntactic patterns, and no simple, reliable method achieves satisfactory performance on it. Every kind of feature, every method, and every model has its own advantages and limitations, and they complement one another. How to combine the strengths of various features, methods, and models while avoiding the one-sidedness of any single model, so as to build highly accurate fusion-based hedge detection systems, has therefore become an important topic in natural language processing. This dissertation focuses on fusion methods for hedge detection.
The main contributions are as follows:

1. An approach to hedge scope detection using a composite kernel that combines structured and flat features. Four phrase-based structured features over a parse tree are explored so that a convolution tree kernel can capture the syntactic structure critical to hedge scope learning. The convolution tree kernel, which exploits the structured syntactic features, and the polynomial kernel, which exploits the flat features, are combined into a composite kernel. The composite kernel outperforms either of the two individual kernels.

2. A hybrid approach to hedge scope detection based on rules and statistics, which also combines phrase structures and dependency structures. First, phrase structures and dependency structures are each used for hedge scope detection separately: phrase structures serve as important features in a Support Vector Machine (SVM)-based model, while dependency structures are used to detect hedge scope with a rule-based method. Then the phrase-based and dependency-based systems are combined by a Conditional Random Field (CRF)-based model, which simply extends its feature vectors with the scope tags generated by the two individual systems. The combination of rule-based and statistical approaches, of phrase structures and dependency structures, and of SVM and CRF all contribute to effective scope detection in the fusion system.
Experimental results show that phrase structures and dependency structures are both effective for hedge scope detection and that combining them improves scope detection performance further.

3. A voting technique for detecting hedge scope. First, eight classifiers are constructed from four approaches, namely CRF, SVM, Max-Margin Markov Network (M3N), and the combined rule-based and statistical approach, each run in two directions (forward and backward). Then three voting schemes are applied to hedge scope detection: (1) majority voting; (2) voting weighted by the accuracy of each component classifier; and (3) POS-weighted voting, weighted by the accuracy of each component classifier on all tokens that share the same part-of-speech (POS) tag. The experimental results show that voting can improve on the component classifiers by combining their individual advantages.

This dissertation explores fusion methods for hedge detection, including the combination of structured and flat features, the hybrid of rule-based and statistical approaches, and multiple-classifier fusion. Its major contributions are a phrase-based approach to hedge scope detection using a composite kernel that combines structured and flat features; a hybrid approach to hedge scope detection based on rules and statistics, which also combines phrase structures and dependency structures; and a voting scheme for hedge scope detection that combines many individual classifiers to exploit the unique advantages of each. This work improves hedge detection performance significantly and provides a reference for future research on fusion methods in natural language processing.
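The composite kernel of the first contribution can be illustrated with a minimal sketch. The abstract does not state how the two kernels are combined, so the normalized linear combination below, the weight `alpha`, the decay factor `lam`, the polynomial degree, and the toy trees and feature vectors are all assumptions made for illustration. The tree kernel shown is the standard subset-tree convolution kernel, which may differ from the variant used in the dissertation.

```python
import math

# A parse tree is a nested tuple (label, child, child, ...); a leaf word is a str.

def production(node):
    """Grammar production at a node: its label plus its children's labels or words."""
    kids = tuple(("N", c[0]) if isinstance(c, tuple) else ("W", c) for c in node[1:])
    return (node[0], kids)

def internal_nodes(tree):
    """All non-leaf nodes of the tree, root first."""
    out = [tree]
    for child in tree[1:]:
        if isinstance(child, tuple):
            out.extend(internal_nodes(child))
    return out

def delta(n1, n2, lam):
    """Decay-weighted count of common subtrees rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        if isinstance(c1, tuple):  # productions match, so c2 is a node as well
            score *= 1.0 + delta(c1, c2, lam)
    return score

def tree_kernel(t1, t2, lam=0.4):
    """Convolution (subset-tree) kernel: sum delta over all node pairs."""
    return sum(delta(a, b, lam) for a in internal_nodes(t1) for b in internal_nodes(t2))

def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel over flat feature vectors."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree

def normalized(kernel, a, b):
    """Scale a kernel so that every example has self-similarity 1."""
    return kernel(a, b) / math.sqrt(kernel(a, a) * kernel(b, b))

def composite_kernel(ex1, ex2, alpha=0.5, lam=0.4, degree=2):
    """Assumed combination: alpha * tree kernel + (1 - alpha) * polynomial kernel."""
    (t1, x1), (t2, x2) = ex1, ex2
    k_tree = normalized(lambda a, b: tree_kernel(a, b, lam), t1, t2)
    k_poly = normalized(lambda a, b: poly_kernel(a, b, degree), x1, x2)
    return alpha * k_tree + (1.0 - alpha) * k_poly

# Hypothetical examples: each pairs a parse tree with a flat feature vector.
tree_a = ("S", ("NP", ("PRP", "it")), ("VP", ("MD", "may"), ("VB", "bind")))
tree_b = ("S", ("NP", ("PRP", "it")), ("VP", ("MD", "might"), ("VB", "bind")))
ex_a = (tree_a, [1.0, 0.0, 1.0])
ex_b = (tree_b, [1.0, 1.0, 0.0])

print(composite_kernel(ex_a, ex_b))
```

A function of this kind could be used to fill a precomputed Gram matrix for a kernel-based learner. Setting `alpha` to 1 or 0 recovers the tree kernel or the polynomial kernel alone, which is one way to compare the composite kernel against its components.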
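The three voting schemes of the third contribution can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the label set (`"I"` for inside the scope, `"O"` for outside), the tie-breaking rule (the label seen first wins), the fallback to overall accuracy for an unseen POS tag, and all accuracy values are hypothetical. Only three component classifiers are shown rather than eight.

```python
def weighted_vote(labels, weights):
    """Return the label with the largest total weight (ties: label seen first)."""
    totals = {}
    for label, weight in zip(labels, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)

def vote_sequence(predictions, pos_tags, scheme, accuracy=None, pos_accuracy=None):
    """Fuse per-token labels from several classifiers.

    predictions  : one label sequence per classifier
    pos_tags     : POS tag of each token
    scheme       : "majority", "accuracy", or "pos"
    accuracy     : overall accuracy of each classifier (held-out data assumed)
    pos_accuracy : per-classifier dict mapping POS tag -> accuracy on that tag
    """
    n_classifiers = len(predictions)
    fused = []
    for i, pos in enumerate(pos_tags):
        labels = [pred[i] for pred in predictions]
        if scheme == "majority":
            weights = [1.0] * n_classifiers
        elif scheme == "accuracy":
            weights = list(accuracy)
        elif scheme == "pos":
            # Assumed fallback: use overall accuracy when a POS tag was not seen.
            weights = [pos_accuracy[k].get(pos, accuracy[k]) for k in range(n_classifiers)]
        else:
            raise ValueError("unknown voting scheme: " + scheme)
        fused.append(weighted_vote(labels, weights))
    return fused

# Hypothetical outputs of three component classifiers on a four-token sentence.
pos_tags = ["NN", "MD", "VB", "NN"]
predictions = [
    ["O", "I", "I", "I"],
    ["O", "I", "I", "O"],
    ["O", "O", "I", "O"],
]
accuracy = [0.9, 0.5, 0.3]
pos_accuracy = [
    {"NN": 0.4, "MD": 0.3, "VB": 0.9},
    {"NN": 0.8, "MD": 0.5, "VB": 0.5},
    {"NN": 0.8, "MD": 0.95, "VB": 0.3},
]

majority = vote_sequence(predictions, pos_tags, "majority")
by_accuracy = vote_sequence(predictions, pos_tags, "accuracy", accuracy)
by_pos = vote_sequence(predictions, pos_tags, "pos", accuracy, pos_accuracy)
print(majority, by_accuracy, by_pos)
```

With these made-up numbers the three schemes disagree on the second and fourth tokens, which shows how POS-specific weights can overrule both a simple majority and a globally strong classifier.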
Keywords/Search Tags: Natural Language Processing, Hedge Scope Detection, Fusion, Composite Kernel, Multiple Classifier Systems