
Research On Named Entity Recognition Based On Deep Learning Multi-feature Fusion

Posted on: 2022-08-12
Degree: Master
Type: Thesis
Country: China
Candidate: F Zhou
Full Text: PDF
GTID: 2518306482493654
Subject: Master of Engineering
Abstract/Summary:
Named entity recognition (NER) is a subtask of natural language processing and an important technique for mining useful information from large amounts of text. Deep learning has been widely studied and applied in natural language processing: its strong feature learning ability can capture the deep semantic information of text, and effective feature representations help address the insufficient representation of Chinese features in Chinese named entity extraction. Chinese NER is widely used across fields, mainly to identify domain-specific named entities. In the medical field, for example, it recognizes body parts, diseases, treatment methods, and symptoms, as well as common entities such as patient names and place names. The main difficulty in these tasks is that the Chinese vector representations used by existing models draw on too few types of features, which leads to poor recognition performance. To address this, this thesis adopts a deep learning approach, uses the BiLSTM-CRF model as the baseline, and introduces two internal features of Chinese characters, strokes and radicals, to improve NER performance. The specific work is as follows:

(1) To address the insufficient representation of the latent features of Chinese characters, this thesis uses a bidirectional long short-term memory network (BiLSTM) to extract basic stroke and radical features. Based on these two features, a Chinese clinical named entity recognition model is proposed. The method captures stroke dependencies within Chinese characters and enhances their semantic representation, thereby improving the model's recognition ability. Tested on the CCKS-2017 Task 2 benchmark data set, the model reached an accuracy of 93.66% and an F1 score of 94.70%. Compared with the basic BiLSTM-CRF model, accuracy increased by 3.38%, recall by 1.05%, and F1 by 1.91%.

(2) To address the same problem and to fuse the stroke and radical features in a more balanced way, this thesis proposes a multi-feature adaptive fusion model for Chinese named entity recognition, which uses weighted concatenation to adaptively fuse the two sets of features. The model was tested on the Microsoft Research Asia (MSRA) data set and the People's Daily data set covering January to June 1998, reaching F1 values of 97.01% and 96.78%, respectively.

These experimental results show that effective feature representation can improve the recognition ability of a named entity recognition model.
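The weighted concatenation mentioned in (2) can be illustrated with a minimal sketch. The abstract does not say how the fusion weights are obtained, so this example assumes two scalar gate values (hypothetical learnable parameters) normalized with a softmax; the function names and toy vectors are illustrative and are not taken from the thesis.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_concat(stroke_vec, radical_vec, gate_logits):
    """Scale each feature vector by its softmax weight, then concatenate.

    gate_logits holds two scalars, one per feature type. In a trained
    model these would be learned; here they are assumed inputs.
    """
    w_stroke, w_radical = softmax(gate_logits)
    scaled_stroke = [w_stroke * x for x in stroke_vec]
    scaled_radical = [w_radical * x for x in radical_vec]
    return scaled_stroke + scaled_radical

# Toy per-character feature vectors (made-up values, not from the thesis).
stroke_vec = [0.2, -0.5, 0.1]
radical_vec = [0.7, 0.3]

# Equal gate values give each feature type a weight of 0.5.
fused = weighted_concat(stroke_vec, radical_vec, gate_logits=[0.0, 0.0])
```

Under these assumptions the fused vector keeps the dimensions of both inputs side by side, and the two weights always sum to 1, so raising one gate value shifts emphasis toward that feature type without changing the output size.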
Keywords/Search Tags:Named entity recognition, Deep learning, Stroke features, Radical features, BiLSTM-CRF