
Research And Application Of Named Entity Recognition Method For Dialogue Domain

Posted on: 2020-05-27    Degree: Master    Type: Thesis
Country: China    Candidate: C C Yang    Full Text: PDF
GTID: 2428330602968357    Subject: Computer technology
Abstract/Summary:
With the rapid growth of online text, quickly and accurately identifying entity information in these texts is of great significance to all walks of life. Named entity recognition (NER) is one of the basic tasks of natural language processing, and it has a major impact on information extraction, text categorization, dialogue systems, and machine translation. Efficient NER methods have therefore become a research hotspot in both industry and academia. This work focuses on English and Chinese named entity recognition. The main research contents are as follows:

1) This thesis proposes a new contextual word state and sentence state representation model for English NER based on S-LSTM. Most previous NER models rely on BiLSTM to obtain word features, which has clear shortcomings: the computation of each hidden state depends on the previous state, global information exchange cannot be performed, and the model's parallel computing efficiency is limited. The S-LSTM instead models the hidden states of all words at each step and performs local and global information exchange between words. We therefore build a contextual word state and sentence state representation model for English NER on top of S-LSTM, which enhances the global information representation of each word. We evaluate our models on the CoNLL 2003, PTB, and CoNLL 2000 chunking datasets. Experimental results show that the proposed model outperforms the baseline models.

2) This thesis proposes an English and Chinese named entity recognition model based on sentence semantics and a self-attention mechanism. Whether for Chinese or English NER, existing models pay little attention to sentence-level semantic information and to global dependencies within a sentence. We therefore propose a neural network that integrates sentence representations into word representations for English and Chinese NER. In addition, we exploit the self-attention mechanism to directly capture long-distance dependencies between any two words in a sentence and thus better model the global dependencies of the whole sentence. Our model is suitable for both English and Chinese datasets. We evaluate it on the CoNLL 2003 English dataset and the Weibo Chinese dataset. Experimental results show that the proposed model outperforms the baseline models.

3) This thesis proposes a Chinese NER model that combines Pinyin embeddings and Wubi embeddings. Most previous Chinese NER models rely solely on character embeddings and neglect the fact that Chinese characters carry both phonetic and structural features. We therefore propose a Chinese NER model that combines Pinyin embeddings and Wubi embeddings. Pinyin is closely related to semantics and provides extra phonetic and semantic information. In addition, many Chinese characters are pictographic, and Wubi provides additional structural features, which helps uncover potential semantic relationships and word boundaries. We evaluate our model on the Resume and Weibo Chinese datasets. Experimental results show that the proposed model outperforms the baseline models.
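To make the first contribution concrete, the following is a heavily simplified sketch of the S-LSTM idea described above: at every step, each word state is updated from a local window of neighbors plus a global sentence state, and the sentence state is updated from all word states. The real S-LSTM uses LSTM-style gates; this ungated version, with hypothetical layer names and dimensions, only illustrates the local/global information exchange and is not the thesis's exact model.

```python
# Simplified S-LSTM-style message passing (ungated, illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimplifiedSentenceState(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.word_update = nn.Linear(4 * dim, dim)  # [left, self, right, sentence]
        self.sent_update = nn.Linear(2 * dim, dim)  # [sentence, mean of words]

    def forward(self, x: torch.Tensor, steps: int = 3):
        # x: (batch, seq_len, dim) word embeddings
        h = x
        g = x.mean(dim=1)                          # initial global sentence state
        for _ in range(steps):
            left = F.pad(h, (0, 0, 1, 0))[:, :-1]  # h_{i-1}, zeros at sentence start
            right = F.pad(h, (0, 0, 0, 1))[:, 1:]  # h_{i+1}, zeros at sentence end
            g_rep = g.unsqueeze(1).expand_as(h)
            h = torch.tanh(self.word_update(
                torch.cat([left, h, right, g_rep], dim=-1)))
            g = torch.tanh(self.sent_update(
                torch.cat([g, h.mean(dim=1)], dim=-1)))
        return h, g

h, g = SimplifiedSentenceState(dim=50)(torch.randn(2, 6, 50))
print(h.shape, g.shape)  # torch.Size([2, 6, 50]) torch.Size([2, 50])
```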
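For the second contribution, the sketch below shows one plausible way to integrate a sentence representation into word representations and then apply scaled dot-product self-attention so that any two words can interact directly. Mean pooling for the sentence vector, the layer names, and the dimensions are illustrative assumptions, not the thesis's exact architecture.

```python
# Sentence-aware self-attention over word representations (illustrative sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentenceAwareSelfAttention(nn.Module):
    def __init__(self, word_dim: int, attn_dim: int):
        super().__init__()
        # Fuse [word ; pooled sentence] into the attention width.
        self.fuse = nn.Linear(2 * word_dim, attn_dim)
        self.q = nn.Linear(attn_dim, attn_dim)
        self.k = nn.Linear(attn_dim, attn_dim)
        self.v = nn.Linear(attn_dim, attn_dim)

    def forward(self, words: torch.Tensor) -> torch.Tensor:
        # words: (batch, seq_len, word_dim)
        sent = words.mean(dim=1, keepdim=True)            # (batch, 1, word_dim)
        sent = sent.expand(-1, words.size(1), -1)          # broadcast to every word
        h = torch.tanh(self.fuse(torch.cat([words, sent], dim=-1)))
        q, k, v = self.q(h), self.k(h), self.v(h)
        scores = q @ k.transpose(1, 2) / (q.size(-1) ** 0.5)  # (batch, seq, seq)
        attn = F.softmax(scores, dim=-1)                    # pairwise dependencies
        return attn @ v                                     # context-enriched word states

# Usage: a batch of two 4-word sentences with 100-dim word representations.
out = SentenceAwareSelfAttention(word_dim=100, attn_dim=64)(torch.randn(2, 4, 100))
print(out.shape)  # torch.Size([2, 4, 64])
```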
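For the third contribution, a minimal sketch of fusing character, Pinyin, and Wubi features for Chinese NER follows. The vocabulary sizes, embedding widths, simple concatenation, and the BiLSTM tagging head are illustrative assumptions; the thesis's exact fusion and decoding layers may differ.

```python
# Character + Pinyin + Wubi embedding fusion for Chinese NER (illustrative sketch).
import torch
import torch.nn as nn

class CharPinyinWubiEncoder(nn.Module):
    def __init__(self, n_chars, n_pinyin, n_wubi, emb_dim=64, hidden=128, n_tags=9):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, emb_dim)
        self.pinyin_emb = nn.Embedding(n_pinyin, emb_dim)   # phonetic features
        self.wubi_emb = nn.Embedding(n_wubi, emb_dim)       # structural features
        self.bilstm = nn.LSTM(3 * emb_dim, hidden,
                              batch_first=True, bidirectional=True)
        self.tag = nn.Linear(2 * hidden, n_tags)            # per-character tag scores

    def forward(self, chars, pinyin, wubi):
        # Each input: (batch, seq_len) integer ids aligned per character.
        x = torch.cat([self.char_emb(chars),
                       self.pinyin_emb(pinyin),
                       self.wubi_emb(wubi)], dim=-1)
        h, _ = self.bilstm(x)
        return self.tag(h)                                  # (batch, seq_len, n_tags)

# Usage with toy ids for a batch of two 5-character sentences.
model = CharPinyinWubiEncoder(n_chars=5000, n_pinyin=500, n_wubi=3000)
ids = lambda high: torch.randint(0, high, (2, 5))
print(model(ids(5000), ids(500), ids(3000)).shape)  # torch.Size([2, 5, 9])
```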
Keywords/Search Tags:Name Entity Recognition, Contextual Word State, Sentence State Representation, Self-Attention Mechanism, Pinyin Embeddings, Wubi Embeddings