
Named Entity Recognition Model And Its Compression Method For Oral Chinese Text

Posted on: 2023-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: X Xu
Full Text: PDF
GTID: 2568307118972559
Subject: Computer Science and Technology
Abstract/Summary:
The smart city has become a new theme of urban development in China, and creating more convenience for urban residents through artificial intelligence is its central goal. Natural language processing, as the mainstream mode of future human-computer interaction, plays a very important role in artificial intelligence. In daily human-computer interaction, spoken (colloquial) text occurs with great frequency. Whether named entity recognition can accurately extract the key entity information in spoken text directly determines a computer's ability to understand human semantics, and further affects the interactive experience of downstream tasks such as search, recommendation, and question answering. Current mainstream natural language processing technology relies on large-scale unsupervised pre-trained language models, supplemented by fine-tuning in different subdomains. As language models grow, their response speed under limited computing power slows significantly, degrading the interactive experience; it is therefore necessary to study model compression techniques for language models.

This thesis constructs ULNER, a named entity recognition dataset for the urban-life domain containing a large amount of spoken text, and proposes PERT-CRF-Restorer, a named entity recognition model suited to spoken text. In addition, to reduce the loss of accuracy during model compression, an improved knowledge distillation framework, MSDS-KD, is proposed. The specific work of this thesis is as follows:

(1) A large amount of spoken text was crawled from a city-level online forum, the textual characteristics of this domain were analyzed, and the target entities of the recognition task were determined. The corpus was further cleaned; basic entities were labeled by distant supervision; a domain entity dictionary was defined and used to label domain entities; and the labels were finally corrected manually to construct the ULNER dataset. For entity recognition in spoken text, the model is first guided to detect the maximal boundaries of entities, and a Restorer module then learns from the labels to restore word-order information within the detected entities. The resulting PERT-CRF-Restorer method achieves better recognition accuracy than many mainstream recognition schemes.

(2) To address the Teacher-Student Gap in knowledge distillation, the standard deviation of the output probability distribution is used as the criterion of its sharpness. It is verified that, for a fixed target standard deviation of a sample's output probability distribution, the corresponding temperature can be approximated by gradient descent, and the Teacher-Student Gap is reduced by matching the standard deviations of the teacher model's and student model's output distributions. By controlling the standard deviation of the teacher model's soft labels with a linear function, increasing it during training while constraining its upper and lower bounds to match the student's learning trend, a knowledge distillation method based on soft-label standard deviation matching, MSDS-KD, is proposed; its distillation effect is superior to several mainstream distillation methods on the related classification tasks.
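The dictionary-based labeling step described in (1) can be sketched as a simple longest-match-first tagger. This is a minimal illustration of distant supervision with an entity dictionary, not the thesis's actual pipeline; the dictionary entries and entity types below are invented for the example.

```python
# Hedged sketch: dictionary-based distant supervision for entity labeling.
# At each character position, the longest dictionary entry that matches
# untagged characters wins, and BIO tags are emitted for its span.

def label_with_dictionary(text, entity_dict):
    """Assign character-level BIO tags to `text` using `entity_dict`
    ({entity_string: entity_type}); longest match takes priority."""
    tags = ["O"] * len(text)
    # Try longer entries first so longest matches win.
    entries = sorted(entity_dict, key=len, reverse=True)
    i = 0
    while i < len(text):
        matched = False
        for ent in entries:
            if text.startswith(ent, i) and all(t == "O" for t in tags[i:i + len(ent)]):
                etype = entity_dict[ent]
                tags[i] = f"B-{etype}"
                for j in range(i + 1, i + len(ent)):
                    tags[j] = f"I-{etype}"
                i += len(ent)
                matched = True
                break
        if not matched:
            i += 1
    return tags

# Toy usage with an invented dictionary (not from the thesis):
ddict = {"人民公园": "LOC", "地铁二号线": "FAC"}
print(label_with_dictionary("坐地铁二号线到人民公园", ddict))
# → ['O', 'B-FAC', 'I-FAC', 'I-FAC', 'I-FAC', 'I-FAC', 'O', 'B-LOC', 'I-LOC', 'I-LOC', 'I-LOC']
```

In practice such auto-labeled data is noisy, which is why the thesis follows this step with manual correction of the labels.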
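The temperature-fitting idea in (2), approximating the temperature that yields a fixed standard deviation of the output probability distribution by gradient descent, can be sketched as follows. The thesis's exact procedure is not given in the abstract; this sketch uses a central-difference gradient and an illustrative learning rate and target value.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def fit_temperature(logits, target_std, lr=10.0, steps=500):
    """Gradient descent on T minimizing (std(softmax(logits / T)) - target_std)^2.
    The std of the softened distribution serves as the sharpness measure."""
    T, eps = 1.0, 1e-4
    loss = lambda t: (softmax(logits / t).std() - target_std) ** 2
    for _ in range(steps):
        grad = (loss(T + eps) - loss(T - eps)) / (2 * eps)  # central difference
        T = max(T - lr * grad, 1e-2)                        # keep T positive
    return T

# Toy usage: soften a sharp 3-class distribution until its std reaches 0.30.
logits = np.array([4.0, 1.0, 0.5])
T = fit_temperature(logits, target_std=0.30)
```

Raising the temperature flattens the distribution and lowers its standard deviation, so matching a teacher's output std to a student's amounts to solving for such a temperature per sample.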
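The soft-label control in MSDS-KD, a target standard deviation that grows linearly during training between upper and lower bounds, can be sketched like this. The bounds and the distillation loss below are illustrative assumptions; the abstract does not specify the thesis's actual values or loss weighting.

```python
import numpy as np

def scheduled_target_std(step, total_steps, std_lo=0.10, std_hi=0.35):
    """Linearly increase the target soft-label std over training,
    clipped to [std_lo, std_hi]; the bounds here are invented for illustration."""
    s = std_lo + (std_hi - std_lo) * step / total_steps
    return float(np.clip(s, std_lo, std_hi))

def log_softmax(z):
    z = z - z.max()
    return z - np.log(np.exp(z).sum())

def kd_loss(student_logits, teacher_probs):
    """Soft cross-entropy between teacher soft labels and the student's prediction."""
    return -(teacher_probs * log_softmax(student_logits)).sum()

# Usage: the target std rises from the lower toward the upper bound as training runs.
print([round(scheduled_target_std(s, 1000), 3) for s in (0, 500, 1000)])
# → [0.1, 0.225, 0.35]
```

Early in training the teacher's soft labels are kept soft (low std), giving the weaker student an easier target; as the student improves, the schedule sharpens the labels toward the bound, tracking the student's learning trend as described in (2).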
Keywords/Search Tags: Spoken text, Named entity recognition, ULNER, PERT-CRF-Restorer, Knowledge distillation, Teacher-Student Gap, Gradient descent, MSDS-KD