
Research on Micro-blog Named Entity Recognition Based on Deep Learning

Posted on: 2021-02-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Shi
GTID: 2428330611980634
Subject: Computer science and technology
Abstract/Summary:
In recent years, with the rapid development of the Internet and the spread of mobile smart terminals, more and more people have become accustomed to publishing and sharing information on Weibo platforms, and the volume of micro-blog text has grown explosively. Extracting useful information from massive amounts of Weibo text has attracted increasing attention from researchers, and studies such as relation extraction, event tracking, and public opinion analysis based on micro-blog text have emerged accordingly. Named entity recognition (NER) is particularly important because it underpins all of these studies. Fast and effective recognition of named entities in large volumes of micro-blog text has therefore become a new research hotspot in natural language processing. Micro-blog text is flexible and changeable, follows nonstandard linguistic logic, has few open corpora, and lacks word boundary features. To address these characteristics, this thesis improves existing models by introducing part-of-speech features and multi-task learning, and proposes the following two named entity recognition models.

(1) Micro-blog text contains grammatical mistakes and is often made up of phrases rather than full sentences, which greatly degrades the performance of existing NER algorithms designed for regular, long-sentence text. To address this problem, this thesis proposes a named entity recognition model for micro-blog text that uses the part of speech of each word as additional information. Because most named entities in Weibo text are nouns, the model uses the word segmentation tool PyNLPIR to extract part-of-speech information from the text, and feeds this information, together with the word embedding vectors, into a BiLSTM-CRF model as input features. The experimental results show that the NER model that fuses part-of-speech information significantly improves NER accuracy.

(2) Weibo text has limited labeled corpora, and word boundaries in it are hard to recognize. To address this, this thesis designs a named entity recognition model based on multi-task learning. The model trains the word segmentation task and the named entity recognition task jointly. A shared bottom BiLSTM layer and parameter sharing let the two tasks share feature representations, which alleviates the shortage of training corpora for the named entity recognition task. The eavesdropping mechanism of multi-task learning is used to improve the generalization ability of the model. Multiple sets of comparative experiments show that the multi-task model learns the part-of-speech and word boundary information from the word segmentation task well, and effectively improves named entity recognition performance.

Finally, prototypes of both proposed Weibo named entity recognition models are implemented, and the test results are displayed visually.
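
As a rough illustration of the data preparation behind the two models, the following sketch builds (a) per-token input features by concatenating a word embedding with a part-of-speech vector, and (b) character-level word boundary labels that an auxiliary segmentation task could be trained on. Everything specific here is an assumption made for illustration and is not taken from the thesis: the coarse tag set `POS_TAGS`, the one-hot encoding of tags, the BMES labeling scheme, the toy embeddings, and the function names `pos_one_hot`, `fuse_features`, and `segmentation_labels`. The tags and word boundaries are hard-coded rather than produced by PyNLPIR, and no BiLSTM or CRF layer is included.

```python
from typing import Dict, List, Sequence

# Hypothetical coarse POS tag set; the thesis does not list the tags it uses.
POS_TAGS = ["noun", "verb", "adjective", "other"]


def pos_one_hot(tag: str) -> List[float]:
    """Encode a POS tag as a one-hot vector; unknown tags map to 'other'."""
    index = POS_TAGS.index(tag) if tag in POS_TAGS else POS_TAGS.index("other")
    return [1.0 if i == index else 0.0 for i in range(len(POS_TAGS))]


def fuse_features(
    tokens: Sequence[str],
    tags: Sequence[str],
    embeddings: Dict[str, List[float]],
    dim: int,
) -> List[List[float]]:
    """Model (1) sketch: concatenate each word embedding with its POS vector."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    unknown = [0.0] * dim  # zero vector for out-of-vocabulary words
    return [
        embeddings.get(token, unknown) + pos_one_hot(tag)
        for token, tag in zip(tokens, tags)
    ]


def segmentation_labels(words: Sequence[str]) -> List[str]:
    """Model (2) sketch: character-level BMES labels for a segmented sentence."""
    labels: List[str] = []
    for word in words:
        if not word:
            continue  # skip empty strings produced by a segmenter
        if len(word) == 1:
            labels.append("S")  # single-character word
        else:
            # Begin, zero or more Middle, End
            labels.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return labels


# Toy 2-dimensional embeddings, invented for this example.
toy_embeddings = {"北京": [0.1, 0.2], "下雨": [0.3, 0.4]}
features = fuse_features(["北京", "下雨"], ["noun", "verb"], toy_embeddings, dim=2)
boundaries = segmentation_labels(["北京", "下雨", "了"])
```

In a multi-task setup of the kind described above, the boundary labels would supervise the segmentation output while entity labels supervise the NER output, with both outputs reading from one shared encoder. The fused feature vectors would correspond to the input of the first model.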
Keywords/Search Tags: NER, micro-blog text, POS, multi-task learning