
Research On Chinese Named Entity Recognition Based On XLNet And Word Segmentation Fusion Coding

Posted on: 2021-04-05
Degree: Master
Type: Thesis
Country: China
Candidate: J W Yang
GTID: 2428330626458914
Subject: Computer technology
Abstract/Summary:
Named entity recognition (NER) is a sub-task of natural language processing that has become an active research topic in artificial intelligence. It is a core problem in applications such as information retrieval, machine translation, and intelligent question answering. Chinese NER is harder than English NER because the smallest unit of written Chinese is the character, and there is no explicit boundary marker between words. To further improve Chinese NER performance, this thesis proposes a method based on the XLNet model and word segmentation fusion coding.

First, the method applies the XLNet model to Chinese NER, a new application scenario for the model. The XLNet-based method inherits the advantages of the Transformer model and overcomes the poor parallelism of traditional recurrent neural networks in natural language processing. It also relies on pre-training: the model captures a large amount of prior contextual knowledge from a large-scale corpus, and its parameters are then fine-tuned on the downstream task to obtain the final target model. The second innovation is the fusion coding of word segmentation information into the input sequence at the word embedding stage, which addresses the difficulty of segmenting Chinese text and also takes the internal relevance of the input text into account.

To combine theory with practice, the work also includes a Chinese NER demonstration system that visualizes the entire recognition process and illustrates the advantages of the proposed algorithm. The experiments use three data sets: the 1998 People's Daily data set, the Boson data set, and the MSRA data set. The results are compared with three other strong algorithms on these data sets. They show that the proposed method improves precision, recall, and F1 score over the comparison algorithms.
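The precision, recall, and F1 figures mentioned in the abstract can be illustrated with a minimal sketch. It assumes entity-level scoring, where a predicted entity counts as correct only if its span and type both match a gold entity exactly. The abstract does not state which scoring convention the thesis uses, so the function name `precision_recall_f1`, the `(start, end, type)` span format, and the sample entities below are all hypothetical.

```python
def precision_recall_f1(gold, predicted):
    """Entity-level scores from two collections of (start, end, type) spans.

    Assumption: exact-match scoring on span boundaries and entity type.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)

    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical gold and predicted entities for one sentence.
gold = [(0, 2, "PER"), (5, 7, "LOC"), (10, 14, "ORG")]
predicted = [(0, 2, "PER"), (5, 7, "ORG"), (10, 14, "ORG"), (20, 22, "LOC")]

p, r, f = precision_recall_f1(gold, predicted)
print(f"precision={p:.4f} recall={r:.4f} f1={f:.4f}")
```

Here two of the four predictions match gold entities exactly, so precision is 2/4 and recall is 2/3. The entity at (5, 7) has the right span but the wrong type, so it does not count as a match under this convention.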
Keywords/Search Tags:named entity recognition, pre-trained model, word fusion coding, deep learning