Research On Fast Exact Structured Learning

Posted on:2011-05-19

Degree:Doctor

Type:Dissertation

Country:China

Candidate:X Qian

Full Text:PDF

GTID:1118360305997279

Subject:Computer application technology

Abstract/Summary:

PDF Full Text Request

Structured learning models owe a great part of their success to the ability in using structured information. However, these methods are more time consuming than non-structured learning models. Though approximate algorithms reduce the computational complexity, they degrade the accuracy to some extent. Therefore, exploring fast extract algorithms has important role in structured learning.In this paper, we improve two aspects of structured learning:inference and feature extraction. First, for sequence labeling tasks, we proposed sparse higher order Conditional Random Fields(SHO-CRFs) based on the characteristics of sparseness of higher order features in many real applications, together with a novel extract tractable inference algorithm which is able to deal with local and sparse higher order features. SHO-CRFs are practically very efficient due to fea-ture sparseness. In optical character recognition task, we use word affixes as higher order features, SHO-CRFs achieve the highest reported accuracy. In Chi-nese organization name recognition task, we achieve the second highest F1 score on Microsoft Research Asia corpus. Both experimental results show that, with the same feature set, SHO-CRFs significantly outperform other state of the art methods.Second, we propose a novel feature string indexing structure for fast feature extraction to reduce decoding time. Many modern structured learning methods adopt templates to generate millions features. Complicated templates bring about abundant features which lead to higher accuracy but more feature extraction time. We proposed two-dimensional Trie(2D Trie), a novel data structure which takes advantages of relationship between templates for fast feature extraction:feature strings generated by a template are prefixes of the features from its extended templates. Experimental results on Chinese Tree Bank 6 corpus show that,2D Trie is about 5 times faster than traditional Trie, leading parsing time 4.3 times faster.

Keywords/Search Tags:

Structured learning, Conditional Random Fields, Sparse Higher Order Conditional Random Fields, 2D Trie, sequence labeling, dependency parsing

PDF Full Text Request

Related items

1	Multi-Task Learning In Conditional Random Fields For Chunking In Shallow Semantic Parsing
2	Research On Dependency-based Chinese Semantic Role Labeling
3	The Research Of Applying Conditional Random Fields To Chinese Lexical Analysis And Chunk Parsing
4	An Self-adaptive BLP Optimal Model Employing Conditional Random Fields
5	The Research On Conditional Random Fields Models For Image Labelling
6	SAR Image Change Detection Based On Conditional Random Fields
7	Research On Chinese Syntactic Parsing Based On Cascaded Conditional Random Fields
8	Research On Online Detection Method Of Reputation Fraud Campaign Based On Conditional Random Fields
9	Research Of Chinese Phrase Identification Based On Conditional Random Fields
10	Research Of Multiview Sequence Data Modeling Based On Conditional Random Fields