
Research On Chinese Named Entity Recognition Method Based On Deep Neural Network

Posted on: 2024-07-10
Degree: Master
Type: Thesis
Country: China
Candidate: K F Long
Full Text: PDF
GTID: 2568307058982249
Subject: Master of Electronic Information (Professional Degree)
Abstract/Summary:
With the advent of the Internet era, more and more textual information surrounds every person. How to obtain useful knowledge from massive amounts of text has become a hot research topic both at home and abroad. Named Entity Recognition (NER) is an important task in Natural Language Processing (NLP). It aims to quickly extract specified types of entity information from complex natural language text, and it provides a foundation for Information Extraction to generate structured data. Compared with English Named Entity Recognition, Chinese Named Entity Recognition (CNER) faces more difficulties. First, Chinese text has no spaces separating words as English does, so accurately identifying entity boundaries is a major challenge for CNER. Second, named entities depend heavily on context, so capturing both the semantic information in the text and the contextual knowledge is also a major challenge. In addition, with the development of the Internet, more and more people care about privacy protection. For example, in Medical Named Entity Recognition tasks, the datasets mostly consist of electronic medical records containing patient identity information, disease information, and treatment plans, all of which are highly sensitive and private. Accurately identifying entities without compromising private information has therefore become a further challenge for CNER. To address these problems, this thesis conducts systematic research on Chinese Named Entity Recognition based on Deep Neural Networks. The main research contents are as follows:

(1) To address problems such as the insufficient extraction of semantic features and boundary information in existing methods, this thesis proposes a Chinese Named Entity Recognition method that incorporates several kinds of feature information in the embedding layer. First, to obtain the lexical weight information of each character, a self-attention mechanism is used to integrate the word knowledge of each character in the input sequence. The bigram information of each character in the input sequence is also obtained. Then, a Convolutional Neural Network (CNN) is used to obtain the contextual knowledge of each character. Finally, to capture longer-distance contextual features, a CNN is also applied to the bigram information to obtain its contextual knowledge. After word features, bigram features, contextual features, and bigram contextual features have been obtained for each character, four different concatenation strategies are used to fuse them in the embedding layer. The fused features are fed to the encoding layer for encoding, and the decoding layer then decodes the labels. Experiments on three datasets show that the proposed model effectively improves the performance of CNER.

(2) This thesis proposes a cryptographic learning framework to mitigate problems such as data leakage and the difficulty of disclosing sensitive data. It introduces a hashing algorithm and an attribute-based encryption algorithm into CNER for the first time to enhance data security. In addition, this thesis constructs a new dataset on Chinese history. Experimental results on six Chinese datasets show that the encryption method achieves satisfactory results. The performance of some models trained on encrypted data even exceeds that of their unencrypted counterparts, which both verifies the effectiveness of the introduced encryption method and alleviates the data leakage problem to a certain extent.

(3) An online recognition platform for named entities in the Chinese history domain has been implemented. Building on the research in this thesis, the platform takes Chinese historical texts or ciphertexts as input and outputs the corresponding entity types with sequence labels.
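The embedding-layer fusion described in point (1) can be illustrated with a small sketch. It is not the thesis's implementation: the abstract gives no dimensions, lexicon, bigram convention, or details of the four concatenation strategies, so everything named here (`toy_embed`, `char_bigrams`, `conv1d_same`, `attend`, `fuse`, the fixed kernel, the right-neighbour bigram, and the example lexicon) is a hypothetical stand-in. The attention step is a simplified scaled dot-product attention that uses the character vector as the query over its matched words, and only one plain concatenation of the four features is shown.

```python
import hashlib
import math

DIM = 4  # toy feature size (assumption; not given in the abstract)

def toy_embed(token, dim=DIM):
    """Deterministic stand-in for a learned embedding lookup."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [digest[i] / 255.0 - 0.5 for i in range(dim)]

def char_bigrams(chars, pad="</s>"):
    """Pair each character with its right neighbour; the last one gets a pad."""
    return [chars[i] + (chars[i + 1] if i + 1 < len(chars) else pad)
            for i in range(len(chars))]

def conv1d_same(seq, kernel=(0.25, 0.5, 0.25)):
    """Depthwise width-3 convolution with zero padding (output length == input length)."""
    dim, half = len(seq[0]), len(kernel) // 2
    out = []
    for i in range(len(seq)):
        vec = [0.0] * dim
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(seq):
                for d in range(dim):
                    vec[d] += w * seq[j][d]
        out.append(vec)
    return out

def attend(query, candidates):
    """Scaled dot-product attention of one character vector over matched word vectors."""
    if not candidates:
        return [0.0] * len(query)  # no lexicon word covers this character
    scale = math.sqrt(len(query))
    scores = [sum(q * c for q, c in zip(query, cand)) / scale for cand in candidates]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [sum((e / total) * cand[d] for e, cand in zip(exps, candidates))
            for d in range(len(query))]

def fuse(sentence, lexicon):
    """Return one fused vector per character: word + bigram + context + bigram context."""
    chars = list(sentence)
    char_vecs = [toy_embed(c) for c in chars]
    bigram_vecs = [toy_embed(b) for b in char_bigrams(chars)]

    # Collect, for every position, the lexicon words whose span covers it.
    matches = [[] for _ in chars]
    for word in sorted(lexicon):
        start = sentence.find(word)
        while start != -1:
            for i in range(start, start + len(word)):
                matches[i].append(word)
            start = sentence.find(word, start + 1)

    word_feats = [attend(char_vecs[i], [toy_embed(w) for w in matches[i]])
                  for i in range(len(chars))]
    context_feats = conv1d_same(char_vecs)
    bigram_context_feats = conv1d_same(bigram_vecs)

    # One possible strategy: concatenate all four feature vectors per character.
    return [word_feats[i] + bigram_vecs[i] + context_feats[i] + bigram_context_feats[i]
            for i in range(len(chars))]

if __name__ == "__main__":
    lexicon = {"南京", "南京市", "长江", "大桥", "长江大桥", "市长"}
    fused = fuse("南京市长江大桥", lexicon)
    print(len(fused), len(fused[0]))
```

In a real model the embeddings, attention projections, and convolution kernels would be learned, and the fused vectors would be passed on to the encoding and decoding layers mentioned in the abstract.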
Keywords/Search Tags: Chinese Named Entity Recognition, Deep Neural Network, Data Protection, Attention Mechanism, Recurrent Neural Network