
Research on Sequence Labeling Techniques Based on Graph Models

Posted on: 2020-10-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y N Shao
Full Text: PDF
GTID: 2428330590974190
Subject: Computer technology
Abstract/Summary:
In recent years, information technology and data science have developed rapidly and are now used in a wide range of applications. Among these technologies, natural language processing (NLP) is a key one: it aims to understand complex language and discourse. With the development of science, NLP has become one of the most important technologies of the information age and has received extensive attention for decades, with researchers carrying out a large body of work in the area. NLP covers many sub-directions. Machine translation, for example, automatically translates text from a source language into a target language, making it easier for users to read and understand. Question answering (QA) systems aim to build algorithmic systems that understand human language and interact with humans, helping users solve many specific problems at low labor cost. In this thesis, we focus on sequence labeling: the task of building a system that automatically analyzes a sequence or text by assigning a categorical label to each part of the given input. Sequence labeling helps computers understand natural language.

We first propose a hierarchical attention neural semi-Markov conditional random fields (semi-CRF) model. The model uses a hierarchical structure to incorporate character-level and word-level information and applies an attention mechanism at both levels, enabling it to attend differentially to more or less important content when constructing segmental representations. We evaluate the model on three sequence labeling tasks, and experimental results show that it benefits from the hierarchical structure, achieving competitive and robust performance on all three tasks.

We then study the influence of the encoding schema on sequence labeling and design two latent-variable conditional random field models that treat the encoding schema as a latent variable. The first model automatically chooses the best encoding schema for each input sequence; the second automatically chooses the best encoding schema for each word in the input sequence. We evaluate both models on three sequence labeling tasks, and the results show that they benefit from the latent variable and achieve competitive and robust performance on all three tasks.

Finally, we focus on mention extraction and classification and propose a neural encoded mention hypergraph model. The hypergraph model effectively captures overlapping mentions of unbounded length. The proposed model combines a mention hypergraph with an encoding schema and neural networks, and it is highly scalable, with a time complexity linear in both the number of words in the input sentence and the number of possible mention classes. Extensive experiments on standard datasets demonstrate the effectiveness of our models.
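To make the segmental formulation behind the first contribution concrete, the following is a minimal sketch of the standard semi-Markov CRF objective that such a model typically builds on; the notation (segments s_j, score function \phi, partition function Z) is an illustrative assumption and is not taken verbatim from the thesis.

P(s \mid x) = \frac{\exp\!\big(\sum_{j=1}^{|s|} \phi(s_j, x)\big)}{Z(x)},
\qquad
Z(x) = \sum_{s' \in \mathcal{S}(x)} \exp\!\Big(\sum_{j} \phi(s'_j, x)\Big)

Here each s_j = (b_j, e_j, y_j) is a labeled segment spanning word positions b_j to e_j, \phi is a neural segment score (in a hierarchical attention model, one that could be built from attention-weighted character- and word-level representations), and \mathcal{S}(x) ranges over all valid labeled segmentations of the input x. For the latent-variable variant described above, the encoding schema z would additionally be marginalized out, e.g. P(y \mid x) = \sum_{z} P(y, z \mid x).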
Keywords/Search Tags:natural language processing technology, sequence labeling technology, hypergraph model, conditional random field, encoding schema