
Research and Application of Named Entity Recognition for Chinese Social Texts

Posted on: 2023-05-13
Degree: Master
Type: Thesis
Country: China
Candidate: W T Zhu
Full Text: PDF
GTID: 2558306911484134
Subject: Engineering
Abstract/Summary:
Internet social networking platforms have become the most important channel through which people communicate and share information, and they generate and disseminate huge amounts of social data every moment. How to mine useful information from this social text data to support important social management work, such as online public opinion monitoring and electronic data forensics, is gradually becoming a research focus in natural language processing. Named entity recognition is a fundamental natural language processing task that directly affects downstream tasks such as information extraction. It is therefore important to perform named entity recognition on Chinese social texts. The task faces several main problems: a lack of corpora, poor text standardization, and poor applicability of generic models. To address these problems, the main work of this thesis is as follows.

(1) To address the lack of corpora and the poor recognition performance of generic-domain models on Chinese social texts, this thesis constructs a normalized named entity recognition dataset covering multiple entity types and selects the IDCNN-CRF model, which is currently among the more advanced models for generic named entity recognition tasks. To address the problem of missing local semantic information, this thesis proposes a named entity recognition model based on multiple convolutional neural networks. The model introduces a CNN to extract local semantic information, learns the correlation between features through an attention mechanism, and is applied to named entity recognition on Chinese social texts. Experiments show that the improved model performs well on the Chinese social text dataset and satisfies the recognition requirements for such texts, and that its accuracy on the public evaluation dataset also improves on that of current advanced models.

(2) To address the heavily colloquial and poorly standardized nature of Chinese social texts, this thesis proposes a word embedding method that fuses radical features and word features, based on the structural characteristics of Chinese characters. Several comparative experiments show that a named entity recognition model using this multi-feature fusion method for word embedding performs significantly better than one using single features, on both the Chinese social text dataset and the public evaluation dataset. The method is also applied to the named entity recognition model based on multiple convolutional neural networks. Experimental comparisons show that the multi-feature fusion version of that model is significantly more accurate than the single-feature version and effectively improves recognition of several entity types that are otherwise poorly recognized on the Chinese social text dataset.

(3) Finally, this thesis designs and develops a named entity recognition system based on the Flask framework, putting the improved named entity recognition model into a practical application scenario that takes unstructured Chinese social text data as its object. The thesis analyzes the requirements and decouples the functions of the system, gives specific details of the named entity recognition functions, shows the final implementation of the system, and verifies the system functions with test cases.
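
The multi-feature fusion idea in (2) can be illustrated with a minimal sketch. The abstract does not say how the radical and word features are combined, so the code below assumes simple concatenation at the character level. The radical table, the vector sizes, the random vectors standing in for trained embeddings, and the names `RADICALS`, `build_table`, and `fuse` are all hypothetical and are not taken from the thesis.

```python
import random

# Hypothetical character -> radical table; a real system would load a full dictionary.
RADICALS = {"河": "氵", "湖": "氵", "林": "木", "松": "木"}
UNK = "<unk>"  # fallback symbol for characters or radicals not in the tables

CHAR_DIM, RADICAL_DIM = 4, 2  # assumed sizes, chosen only for illustration


def build_table(symbols, dim, seed):
    """Give each symbol a fixed random vector (a stand-in for trained embeddings)."""
    rng = random.Random(seed)
    return {s: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for s in symbols}


char_table = build_table(list(RADICALS) + [UNK], CHAR_DIM, seed=0)
radical_table = build_table(sorted(set(RADICALS.values())) + [UNK], RADICAL_DIM, seed=1)


def fuse(text):
    """Return one fused vector per character: character vector followed by radical vector."""
    fused = []
    for ch in text:
        char_vec = char_table.get(ch, char_table[UNK])
        radical = RADICALS.get(ch, UNK)
        # Assumed fusion operator: plain concatenation of the two feature vectors.
        fused.append(char_vec + radical_table[radical])
    return fused


vectors = fuse("河湖林")
```

In this sketch 河 and 湖 share the radical 氵, so the last `RADICAL_DIM` entries of their fused vectors are identical even though their character vectors differ. That shared signal is the kind of information a radical feature can add for rare or non-standard characters in social text.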
Keywords/Search Tags: Named Entity Recognition, Chinese Social Texts, Convolutional Neural Network, Attention Mechanism, Radical Feature