
Research On Attention Mechanism For Natural Language Processing

Posted on: 2020-09-19    Degree: Master    Type: Thesis
Country: China    Candidate: L X Li    Full Text: PDF
GTID: 2428330572476346    Subject: Information and Communication Engineering
Abstract/Summary:
As the world enters the information age, the spread of the Internet has driven explosive growth in data, and this data holds great potential value. However, it is no longer feasible to analyze such an enormous amount of information by human effort alone, so the need for machine-based natural language processing keeps growing. Different languages require different processing steps, and Chinese word segmentation is one of the essential steps in Chinese natural language processing. Research on Chinese word segmentation therefore has both practical value in engineering applications and reference value for other related natural language processing tasks.

Current Chinese word segmentation models are mainly based on recurrent neural networks, which struggle to learn long-distance information interactions and incur a high computational time cost. A computing layer based on the self-attention mechanism can alleviate these problems to some extent. This paper studies the Chinese word segmentation task based on the self-attention mechanism, and the main work is as follows.

First, the self-attention mechanism is introduced into Chinese word segmentation, and a segmentation network based on self-attention is proposed. The model combines the advantages of convolution and self-attention computations, capturing both short-distance and long-distance dependencies. Experimental results show that introducing the self-attention mechanism improves the performance of the Chinese word segmentation system and makes the model much faster than traditional recurrent neural network models.

Second, a Chinese word segmentation model based on BERT pre-training is proposed through an analysis of the network layer parameters. Built on sub-structure layers that use the self-attention mechanism, the model enhances its ability to represent text by pre-training on a large amount of unlabeled corpus data, thereby improving segmentation performance. The method follows the transfer-learning idea of pre-training on a large unlabeled corpus and fine-tuning on a small labeled corpus, which avoids the difficulty of collecting labeled data. Experimental results show that this method effectively learns representations of text in different contexts and thus improves the performance of the Chinese word segmentation model.

The main contributions and innovations of this paper are as follows. The proposed network, which combines the self-attention mechanism with convolution, improves the performance of the Chinese word segmentation task and exploits parallel computation to dramatically reduce the network's computation time. The network based on BERT pre-training gives different representations to the same character in different contexts, thus dramatically improving segmentation performance.
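To make the first contribution concrete, the architecture couples a convolution branch (short-distance context) with a self-attention branch (long-distance context) over the character sequence and treats segmentation as per-character tagging. The sketch below is a minimal illustration of that idea only, assuming PyTorch, a BMES tag set, and illustrative layer sizes; the class name, dimensions, and the way the two branches are combined are assumptions for exposition, not the configuration reported in the thesis.

# Minimal sketch of a convolution + self-attention segmenter (assumed details, not the thesis's exact model).
import torch
import torch.nn as nn

class ConvSelfAttentionSegmenter(nn.Module):
    """Character-level Chinese word segmentation as BMES tagging.

    A 1-D convolution branch captures short-distance context while a
    multi-head self-attention branch captures long-distance dependencies;
    both run in parallel over the whole sentence, unlike a recurrent net.
    """

    def __init__(self, vocab_size, d_model=128, n_heads=4, kernel_size=3, n_tags=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model, padding_idx=0)
        # Short-distance dependencies: convolution over the character sequence.
        self.conv = nn.Conv1d(d_model, d_model, kernel_size, padding=kernel_size // 2)
        # Long-distance dependencies: self-attention over the same sequence.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)
        # One B/M/E/S score per character.
        self.classifier = nn.Linear(d_model, n_tags)

    def forward(self, char_ids, pad_mask=None):
        x = self.embed(char_ids)                                  # (batch, seq, d_model)
        conv_out = self.conv(x.transpose(1, 2)).transpose(1, 2)   # local context
        attn_out, _ = self.attn(x, x, x, key_padding_mask=pad_mask)  # global context
        # Combine the two views of the context and classify each character.
        h = self.norm(x + conv_out + attn_out)
        return self.classifier(h)                                  # (batch, seq, n_tags)

# Usage: tag scores for a toy batch of 2 sentences, 10 characters each.
model = ConvSelfAttentionSegmenter(vocab_size=6000)
char_ids = torch.randint(1, 6000, (2, 10))
tag_scores = model(char_ids)                                       # shape: (2, 10, 4)

Because neither branch processes characters sequentially, the whole sentence can be computed in parallel, which is the source of the speed advantage over recurrent models claimed above.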
Keywords/Search Tags:Chinese word segmentation, sequence tagging, self-attention, pre-train