Research on Sequence Labeling Models Based on Fine-grained Word Representations

Posted on: 2019-12-14
Degree: Master
Type: Thesis
Country: China
Candidate: G H Lin
Full Text: PDF
GTID: 2428330566484201
Subject: Computer Science and Technology
Abstract/Summary:
Sequence labeling is a fundamental task in natural language processing, and its performance has a marked impact on downstream tasks such as machine translation and community question answering. Traditional statistical models achieve state-of-the-art performance through feature engineering, but designing features requires substantial resources, and the resulting features transfer poorly across tasks. Neural networks that introduce pretrained word embeddings, which encode syntactic and semantic information, come close to state-of-the-art results through a series of non-linear transformations, but the lack of morphological information about words magnifies the unknown-word problem.

Extensive research and analysis of morphological features show that manually designed features lack portability, and that character-level neural models such as the BiLSTM and the CNN each have limitations: the former has many parameters and is difficult to parallelize, while the latter captures only local features. Based on these findings, we propose Finger, a character-level representation model based on the attention mechanism, for encoding intra-word information. Independent of both the LSTM and the CNN, the model captures long-range as well as local dependencies in the input sequence, inheriting the merits of the vanilla attention mechanism: highly parallelizable matrix computation, global dependency modeling, and fewer parameters.

In addition, we propose a neural network architecture for sequence labeling, Finger-BiLSTM-CRF, designed for reuse across other tasks. It consists of the character-level model Finger, which learns the morphological information of tokens to support boundary decisions; a BiLSTM, which models the context of each word; and a linear-chain CRF, which jointly decodes the labels for the whole sequence. To verify its effectiveness, we apply the model to the Penn Treebank WSJ corpus and the English data from the CoNLL 2003 shared task. The model achieves near state-of-the-art results in the end-to-end setting, obtaining 97.37% accuracy on part-of-speech tagging and 91.09% F1 on named entity recognition.
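To make the character-level attention idea concrete, below is a minimal PyTorch sketch of an attention-based character encoder in the spirit of Finger. The class name, dimensions, head count, and mean-pooling step are illustrative assumptions, not the thesis's exact configuration; the thesis's Finger model may differ in its attention formulation and pooling.

    # Illustrative sketch only: an attention-based character encoder in the
    # spirit of Finger. All hyperparameters here are assumptions.
    import torch
    import torch.nn as nn

    class CharAttentionEncoder(nn.Module):
        """Encodes a word's character sequence into one fixed-size vector
        via self-attention, instead of a CharCNN or CharBiLSTM."""

        def __init__(self, num_chars: int, char_dim: int = 32, num_heads: int = 4):
            super().__init__()
            self.char_embed = nn.Embedding(num_chars, char_dim, padding_idx=0)
            # Self-attention lets every character attend to every other one,
            # capturing both long-range and local intra-word dependencies
            # with parallelizable matrix computation.
            self.attn = nn.MultiheadAttention(char_dim, num_heads, batch_first=True)

        def forward(self, char_ids: torch.Tensor) -> torch.Tensor:
            # char_ids: (batch, max_word_len) character indices, 0 = padding
            x = self.char_embed(char_ids)                       # (B, L, D)
            pad_mask = char_ids.eq(0)                           # (B, L)
            attended, _ = self.attn(x, x, x, key_padding_mask=pad_mask)
            # Mean-pool the attended character states into one word vector.
            attended = attended.masked_fill(pad_mask.unsqueeze(-1), 0.0)
            lengths = (~pad_mask).sum(dim=1, keepdim=True).clamp(min=1)
            return attended.sum(dim=1) / lengths                # (B, D)

In the full Finger-BiLSTM-CRF pipeline described above, such a character-derived word vector would be concatenated with the pretrained word embedding, fed through the BiLSTM for context modeling, and decoded jointly by the linear-chain CRF.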
Keywords/Search Tags:Sequence Labeling Task, End-to-end Model, Character-level Word Representation Model, Attention Mechanism