
Research On Domain Adaptation For Chinese Word Segmentation Based On Parameter Transfer Learning

Posted on: 2021-02-12
Degree: Master
Type: Thesis
Country: China
Candidate: X X Guo
Full Text: PDF
GTID: 2428330614971022
Subject: Computer technology
Abstract/Summary:
Natural language processing (NLP) is a key research direction in artificial intelligence and provides the technical means for machines to understand human language. Chinese word segmentation is a basic and important task in Chinese NLP, and its quality strongly affects downstream tasks such as sentiment analysis, machine translation, information extraction, and knowledge graphs. In recent years, as deep learning has spread across many fields, neural network models have been widely adopted, and NLP is no exception. Neural network methods have become the mainstream approach to word segmentation. Their introduction brought a qualitative leap in Chinese word segmentation performance, and they clearly outperform traditional segmentation methods.

However, these methods need large-scale annotated corpora, and annotation must be done by hand, which requires a great deal of human effort. Building an annotated corpus for every domain is not realistic. Most of the annotated corpora available today come from the news domain. When a segmentation system trained on news data is applied directly to other domains, its performance declines noticeably. This is the domain adaptation problem for word segmentation, and it is an urgent open problem in Chinese word segmentation. Research on domain adaptation for Chinese word segmentation therefore has important theoretical and practical significance.

Against this background, we propose a domain adaptation method for Chinese word segmentation based on parameter transfer, which improves the domain adaptability of the segmentation model by sharing model parameters. The main work of this thesis is as follows:

(1) Based on an analysis of the BiGRU-CRF neural network model, this thesis proposes integrating the BERT pretrained language model to obtain richer semantic information, giving the BERT-BiGRU-CRF neural word segmentation method. Experimental results show that this method segments better than BiGRU-CRF.

(2) To address the domain adaptation problem for Chinese word segmentation, this thesis proposes a domain adaptation method based on parameter transfer, which achieves the transfer by sharing the network parameters and features of the BiGRU layer.

(3) A discriminator is added on top of the parameter transfer method. The discriminator produces a loss that is propagated back to the BiGRU layer to update its network parameters, which further improves the domain adaptability of the model.

Four experiments are carried out to verify the validity of the domain adaptation method based on parameter transfer. They also show that the larger the unlabeled corpus in the target domain, the better the segmentation results. Comparison with other domain adaptation segmentation methods shows that the proposed method is feasible and effective.
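As a minimal illustration of how a CRF-based segmenter such as the ones named above treats segmentation as character-level sequence labeling, the sketch below converts a segmented sentence to per-character tags and back. The abstract does not state which tag set the thesis uses; the four-tag BMES scheme (Begin, Middle, End, Single), the function names, and the example sentence are all assumptions made for illustration.

```python
from typing import List


def words_to_tags(words: List[str]) -> List[str]:
    """Map a segmented sentence to one BMES tag per character (assumed scheme)."""
    tags: List[str] = []
    for word in words:
        if len(word) == 1:
            tags.append("S")  # single-character word
        elif len(word) > 1:
            # first character B, last character E, everything between M
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars: str, tags: List[str]) -> List[str]:
    """Rebuild words from characters and their predicted BMES tags."""
    if len(chars) != len(tags):
        raise ValueError("need exactly one tag per character")
    words: List[str] = []
    current = ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends at an E or S tag
            words.append(current)
            current = ""
    if current:  # tolerate a sequence that ends mid-word
        words.append(current)
    return words


# Hypothetical example sentence, already segmented by hand.
words = ["自然", "语言", "处理", "是", "人工智能", "的", "方向"]
tags = words_to_tags(words)
restored = tags_to_words("".join(words), tags)
```

Under this framing, the neural layers score each tag for each character and the CRF layer selects the best tag sequence; `tags_to_words` then turns that sequence back into words.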
Keywords/Search Tags:Parameter transfer learning, Chinese word segmentation, Domain adaptation, Neural network, Natural language processing