
Research On Sentence Compression Algorithm Based On Deep Learning

Posted on: 2021-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Wang
Full Text: PDF
GTID: 2518306308469094
Subject: Electronics and Communications Engineering

Abstract/Summary:
Sentence compression is the task of condensing a sentence that contains redundant information into a short expression, simplifying the text structure while retaining the important meaning. With the growth of the Internet, sentence compression has become an important task: compression technology greatly reduces information overload, especially on mobile devices with limited screen space. However, the task still faces many challenges. Among current mainstream algorithms, neural-network-based compression models lack guidance from external information and have difficulty capturing long-distance dependencies within a sentence, while summaries obtained through unsupervised syntactic analysis inevitably suffer from parsing errors. Studying the sentence compression task is therefore of great value.

This thesis studies the basic models and techniques of sentence compression and analyzes the current state of research and its open problems. It focuses on extractive sentence compression, approached from two directions: the introduction of external syntactic information and auxiliary pre-training. The main contributions are as follows:

(1) To address the lack of external information guidance in neural networks, this thesis implements extractive compression by combining a syntactic graph convolutional network with a sequence-to-sequence model, using syntactic dependency information, and proposes a new parallelized structure. This model combines the advantages of both components to achieve a complementary effect. In addition, to reduce error propagation from the parse tree, the dependency arcs are adjusted dynamically, optimizing the construction of the syntactic graph convolutional network. Finally, in view of the class imbalance in the dataset, better results are achieved by adjusting the compression threshold. Experiments show that the model combined with the syntactic graph convolutional network outperforms the original model and performs well on the Google sentence compression dataset.

(2) To address the small size of the target dataset, this thesis proposes a transfer-learning method that uses abstractive summarization as the pre-training task, combining the advantages of abstractive and extractive methods. This two-stage training method shares feature representations and some model parameters between the pre-training task and the target task, improving the model's ability to extract latent features. A new fine-tuning scheme is also formulated: during the two-stage training process, different optimizers are used for the encoder and decoder to avoid a mismatch of training cycles between the two. Experiments show that the two-stage training method further improves the performance of the extractive task.
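The syntactic graph convolution described in (1) can be sketched in a few lines: each token aggregates the representations of its dependency-tree neighbours, and a scorer then labels tokens keep/drop, with the compression threshold as an explicit knob. The layer below is a minimal illustration only; the function name, dimensions, symmetric adjacency, and toy sigmoid scorer are assumptions for exposition, not the thesis's actual architecture.

```python
import numpy as np

def syntactic_gcn_layer(features, arcs, weight):
    """One syntactic GCN layer: every token mixes in the hidden states
    of its dependency neighbours (head and dependents).
    features: (n_tokens, d_in) token representations (e.g. BiLSTM outputs)
    arcs:     list of (head, dependent) index pairs from a parse tree
    weight:   (d_in, d_out) learned projection
    """
    n = features.shape[0]
    # Symmetric adjacency with self-loops, so information flows along
    # dependency arcs in both directions and each token keeps itself.
    adj = np.eye(n)
    for head, dep in arcs:
        adj[head, dep] = 1.0
        adj[dep, head] = 1.0
    # Degree-normalise so activations stay on a stable scale.
    deg = adj.sum(axis=1, keepdims=True)
    return np.maximum((adj / deg) @ features @ weight, 0.0)  # ReLU

# Toy 3-token sentence where token 1 governs tokens 0 and 2.
feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
arcs = [(1, 0), (1, 2)]
h = syntactic_gcn_layer(feats, arcs, np.eye(2))

# Keep/drop scoring: a (hypothetical) linear scorer plus sigmoid gives
# each token a keep probability; lowering the threshold below 0.5 keeps
# more tokens, which counters the keep/drop class imbalance.
scores = 1.0 / (1.0 + np.exp(-(h @ np.array([1.0, -0.5]))))
threshold = 0.4
kept = [i for i, s in enumerate(scores) if s > threshold]
```

In the parallelized structure of the thesis, these GCN features would be combined with the sequence-to-sequence model's outputs rather than scored alone; the sketch only shows the graph-convolution step in isolation.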
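The fine-tuning scheme in (2), where the encoder and decoder get different optimizers, can be illustrated with a toy model. Everything below is a stand-in under stated assumptions: plain SGD instead of whatever optimizers the thesis actually uses, a tanh "encoder" carried over from pre-training, and a freshly initialised keep/drop head trained with a larger learning rate.

```python
import numpy as np

class SGD:
    """Minimal optimiser so encoder and decoder can be stepped with
    different learning rates (an illustrative stand-in for the real
    optimizers used in the two-stage training)."""
    def __init__(self, params, lr):
        self.params, self.lr = params, lr

    def step(self, grads):
        for name, g in grads.items():
            self.params[name] -= self.lr * g

rng = np.random.default_rng(0)
# Hypothetical shared parameters: the encoder would be carried over
# from the abstractive pre-training stage, the extractive head is new.
encoder = {"W": rng.normal(scale=0.5, size=(4, 4))}
decoder = {"V": rng.normal(scale=0.5, size=(4, 1))}

# Stage-2 fine-tuning: small lr for the pre-trained encoder so its
# learned features are not disturbed, larger lr for the new head.
enc_opt = SGD(encoder, lr=1e-3)
dec_opt = SGD(decoder, lr=1e-2)

x = rng.normal(size=(16, 4))                            # toy token features
y = (x.sum(axis=1, keepdims=True) > 0).astype(float)    # toy keep labels

losses = []
for _ in range(200):
    # Forward pass: encode, then sigmoid keep probability.
    h = np.tanh(x @ encoder["W"])
    p = 1.0 / (1.0 + np.exp(-(h @ decoder["V"])))
    eps = 1e-9
    losses.append(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))
    # Backward pass: binary cross-entropy gradients, derived by hand.
    dlogits = (p - y) / len(x)
    grad_V = h.T @ dlogits
    dh = (dlogits @ decoder["V"].T) * (1 - h ** 2)
    grad_W = x.T @ dh
    # Separate optimiser steps for encoder and decoder.
    enc_opt.step({"W": grad_W})
    dec_opt.step({"V": grad_V})
```

The point of the two optimiser objects is structural: because the encoder is pre-trained and the decoder is not, updating them on the same schedule would mismatch their training cycles; giving each its own optimiser decouples the two.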
Keywords/Search Tags: sentence compression, neural network, syntactic dependency information