Research On Chinese Text Detection In Natural Scene Based On Deep Learning

Posted on: 2022-08-07  Degree: Master  Type: Thesis
Country: China  Candidate: Q Yang
GTID: 2518306557468174  Subject: Computer application technology
Abstract/Summary:
Named entity recognition is an important fundamental task in the field of natural language processing. With the rapid development of Internet and big data technologies, all kinds of information on the Internet are expanding at an alarming rate. Especially on self-media, social media and other online platforms, the amount of text data increases by hundreds of millions every day, which makes online text information unstructured, diversified and fragmented. To mine valuable information efficiently from such a large amount of text, named entity recognition, a sub-task of information extraction, has become an important research hotspot. At present, research on Chinese named entity recognition mainly adopts deep learning, and various deep learning-based Chinese named entity recognition algorithms have achieved good recognition results. However, the existing algorithms still suffer from inadequate feature extraction by the network, noisy input features and other problems. Therefore, this thesis focuses on deep learning-based Chinese named entity recognition algorithms and optimizes them to improve model performance on Chinese named entity recognition tasks. The main research work of this thesis is as follows.

(1) A Chinese named entity recognition algorithm based on adversarial transfer learning and convolutional neural networks (abbreviated as ACNN-BiLSTM-CRF) is proposed. A convolutional neural network (CNN) is incorporated into the Chinese named entity recognition model based on adversarial transfer learning. Taking advantage of the CNN's ability to extract features, the CNN and BiLSTM networks are combined to capture both local short-range and global long-range contextual information, which makes the features obtained by the network more accurate and sufficient.

(2) A Chinese named entity recognition algorithm based on the self-attention word association model (abbreviated as SA-Lattice) is proposed. The self-attention mechanism is introduced into the model based on joint character-word learning. It captures the important word information from multiple matching words and then condenses it into a fixed-size word vector containing the important features, which improves the training speed of the network and lets it extract more comprehensive word features.

(3) A Chinese named entity recognition algorithm based on a denoised word association model (abbreviated as Gated-Lattice) is proposed. A gated denoising neural network is incorporated into the model based on joint character-word learning. The denoising network fine-tunes the input word features so that the word features delivered to the word association model are more accurate, allowing the model to focus on learning the features related to named entities and thus improving the recognition results.

The proposed methods are evaluated on three publicly available datasets, using uniform evaluation criteria for validation and comparison with existing algorithms. The final experimental results on the test sets show that the Chinese named entity recognition methods proposed in this thesis achieve better results and can effectively improve the accuracy of Chinese named entity recognition.
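The idea in item (2), condensing a variable number of matched words into one fixed-size vector by self-attention, can be illustrated with a minimal sketch. This is not the thesis's SA-Lattice implementation: the function name `self_attention_pool`, the absence of learned projection matrices, and the final mean pooling are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_pool(word_vectors):
    """Hypothetical helper: pool matched-word vectors into one fixed-size vector.

    Each word vector attends to every word vector (scaled dot-product
    self-attention without learned projections, an assumption), and the
    attended vectors are then averaged.
    """
    d = len(word_vectors[0])
    attended = []
    for query in word_vectors:
        # Similarity of this word to every matched word, scaled by sqrt(d).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in word_vectors
        ]
        weights = softmax(scores)
        # Weighted sum of all word vectors for this query.
        attended.append(
            [sum(w * vec[i] for w, vec in zip(weights, word_vectors)) for i in range(d)]
        )
    n = len(attended)
    # Mean over the attended vectors gives a vector of size d,
    # independent of how many words were matched.
    return [sum(row[i] for row in attended) / n for i in range(d)]
```

Whether one word or several words match a character, the output has the same dimension, which is the property the abstract describes; a trained model would additionally learn query, key and value projections.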
Keywords/Search Tags: Bidirectional Long Short-Term Memory, Convolutional Neural Network, Self-attention mechanism, Denoising mechanism, Chinese named entity recognition