
Research On Named Entity Recognition Based On Word Information Relevance And Multiple Semantic Features

Posted on: 2022-07-27
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Chen
GTID: 2518306539963109
Subject: Software engineering
Abstract/Summary:
Named entity recognition aims to extract entity information from text efficiently and is a basic task in natural language processing. Its performance sometimes suffers, for example because of unregistered (out-of-vocabulary) words in the data, which cause information to be lost or omitted and degrade relation extraction and other upper-level natural language processing tasks. Research on named entity recognition is therefore particularly important for ensuring that entity information is identified and extracted effectively.

Existing research mainly focuses on deep learning. Deep learning models for named entity recognition establish a mapping between input and output, obtain low-dimensional, information-rich features from text data, and produce the final output vector through a classifier, without complicated manual feature engineering. Although deep-learning-based named entity recognition has achieved many excellent results in recent years, it also has shortcomings: models pay too much attention to local features while ignoring global features, cannot handle nonlinear, complex data, and are affected by unregistered words in the corpus. To this end, this thesis proposes a named entity recognition method that integrates statistical learning and deep learning, and studies two aspects: unregistered words and multiple semantic information. The main innovations are as follows:

(1) Existing word segmentation tools recognize unregistered words poorly, take a long time, and have high complexity. To address this, a method for handling unregistered words in text data is proposed. This thesis studies the formation characteristics of unregistered words, combines word formation probability information with a double-array Trie, and constructs a dynamic unregistered-word recognition model based on a mixed-information double-array Trie. The model is used to recognize the unregistered words in the corpus, and quantitative experiments demonstrate the effectiveness of the method, which improves the accuracy and speed of unregistered word recognition and reduces the space consumed.

(2) Basic deep learning models have a relatively single feature dimension, and their input feature information is not comprehensive enough. To address this, this thesis proposes a deep neural network named entity recognition method that performs the task by acquiring nonlinear and complex semantic features. The BiLSTM model learns the context feature vector, and the character adjacency matrix and feature matrix are fed to the GCN to obtain the global semantic feature vector, yielding a named entity recognition model that contains multi-dimensional semantic information. Comparative experiments verify that the model improves named entity recognition and effectively improves the accuracy of the recognition results.
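
To make the GCN step in innovation (2) concrete, the following is a minimal sketch of a single graph-convolution layer over a character graph. It is not the thesis's implementation. The abstract does not say how the character adjacency matrix is built or normalized, so this sketch assumes that characters within a fixed window are connected (with self-loops) and that the commonly used symmetric normalization is applied. The names `build_adjacency`, `normalize`, `matmul`, and `gcn_layer` are hypothetical, and the feature matrix stands in for BiLSTM outputs.

```python
import math

def build_adjacency(n_chars, window=1):
    """Assumed graph: link each character to neighbours within `window`, plus a self-loop."""
    adj = [[0.0] * n_chars for _ in range(n_chars)]
    for i in range(n_chars):
        for j in range(max(0, i - window), min(n_chars, i + window + 1)):
            adj[i][j] = 1.0  # the range includes j == i, which gives the self-loop
    return adj

def normalize(adj):
    """Symmetric normalisation D^{-1/2} A D^{-1/2} (an assumption, not stated in the abstract)."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj_norm, features, weights):
    """One graph-convolution step: ReLU(A_norm @ H @ W)."""
    hidden = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]

# Toy usage: 4 characters, 2-dimensional stand-in features, identity weights.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
weights = [[1.0, 0.0], [0.0, 1.0]]
adj_norm = normalize(build_adjacency(len(features)))
global_features = gcn_layer(adj_norm, features, weights)
```

Each output row mixes a character's own features with those of its graph neighbours, which is one way a per-character vector can carry information beyond its immediate BiLSTM context. In a trained model the weight matrix would be learned rather than fixed.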
Keywords/Search Tags: Named entity recognition, Unregistered words, Double array Trie, GCN