Event Text Feature Extracting System Based On Neural Machine Translation

Posted on: 2019-07-10
Degree: Master
Type: Thesis
Country: China
Candidate: T Q Huang
Full Text: PDF
GTID: 2348330545958535
Subject: Electronic Science and Technology
Abstract/Summary:
With the development of the Internet, the amount of information carried by text is growing at an unprecedented speed. However, information on the Internet is now generated far faster than an average person can absorb it. The need to provide better Internet services gave rise to the field of natural language processing (NLP), whose goal is to make machines process information from text and speech as human beings do. Information extraction is one of the main fields of NLP, and because events are one of the most common categories of information, event extraction has become an important task in the field. The main purpose of event feature extraction is to extract the features of an event, such as its type and location, from the text that describes it. Extracting event features can dramatically reduce the time needed to understand an event, and it helps search engines return results with better relevance and quality.

Recently, the continuous development of deep learning has brought major changes to many domains, including NLP. Since the word embedding vectors produced by word2vec were found to contain relevant semantic information, deep learning NLP algorithms have been regarded as able to use semantic information during text processing. Traditional NLP algorithms cannot, and they also rely heavily on manually provided information; when such information is inaccurate or insufficient, these algorithms often generate poor results. Deep neural networks are now capable of many NLP tasks, including machine translation, part-of-speech tagging, and question answering (QA) systems. The neural machine translation (NMT) model is a novel machine translation model based on deep learning. It has been applied to tasks such as machine translation, text generation, and semantic information extraction. This paper focuses on Chinese event feature extraction: it uses the NMT model to perform Chinese word segmentation, then applies an improved Bi-LSTM-CRF model to accomplish the event text feature extraction task through named entity recognition.

The main content of this paper includes:

(1) Applying the NMT model to Chinese segmentation based on semantic information
Because of the difference between a character-based language such as Chinese and an alphabetic language such as English, sentences must first be segmented, with one morpheme per vocabulary item, before neural NLP models are applied to Chinese. This paper designs a new segmentation model based on the neural machine translation model.

(2) Introducing an improved Bi-LSTM-CRF model for event text feature extraction
This paper accomplishes the event text feature extraction task by extracting the specific phrases that describe event features in a given text. Extracting specific parts of a text in this way is also known as named entity recognition (NER). Although traditional algorithms can provide good results, they rely largely on manually provided features, which leads to unsatisfactory results when those features do not carry enough information about entity boundaries. Neural networks, on the other hand, do not rely on hand-crafted features, but they cannot use historical information because the model generates only one output at a time. The Bi-LSTM-CRF model combines the advantages of both traditional algorithms and neural networks. This paper further improves the model and applies BIOES tagging to the task, which improves the F1-score on the CoNLL-2003 NER corpus.
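The abstract mentions BIOES tagging without describing how it is applied. As a rough illustration only, the sketch below converts a BIO tag sequence into the standard BIOES scheme, in which single-token entities are marked S- and the last token of a multi-token entity is marked E-. The function name `bio_to_bioes`, the labels PER and LOC, and the sample sentence are assumptions made for this example; they are not taken from the thesis or from the CoNLL-2003 data.

```python
from typing import List


def bio_to_bioes(tags: List[str]) -> List[str]:
    """Convert BIO tags to BIOES tags (hypothetical helper, not the thesis code)."""
    bioes = []
    for i, tag in enumerate(tags):
        if tag == "O":
            bioes.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        # Look ahead: the entity continues only if the next tag is I- with the same label.
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = next_tag == f"I-{label}"
        if prefix == "B":
            # A B- tag that is not continued is a single-token entity.
            bioes.append(f"B-{label}" if continues else f"S-{label}")
        elif prefix == "I":
            # An I- tag that is not continued closes the entity.
            bioes.append(f"I-{label}" if continues else f"E-{label}")
        else:
            raise ValueError(f"unexpected tag: {tag}")
    return bioes


# Illustrative sentence and tags (made up for this example).
tokens = ["John", "Smith", "visited", "New", "York", "City", "and", "Paris"]
bio = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "I-LOC", "O", "B-LOC"]
bioes = bio_to_bioes(bio)
for token, tag in zip(tokens, bioes):
    print(f"{token}\t{tag}")
```

Compared with BIO, the extra E- and S- tags mark entity ends and single-token entities explicitly, which gives a sequence model such as a CRF layer more direct evidence about entity boundaries.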
Keywords/Search Tags:Event feature extraction, neural machine translation, named entity recognition