
Research On Extractive Sentence Compression Techniques For Cross-domain

Posted on: 2019-03-15  Degree: Doctor  Type: Dissertation
Country: China  Candidate: L G Wang  Full Text: PDF
GTID: 1488306470993579  Subject: Computer Science and Technology
Abstract/Summary:
Artificial intelligence (AI) has gained a lot of attention in recent years. Natural language processing (NLP) is one of the most important research directions of AI, and automatic summarization is one of its most important applications. As a component of automatic summarization, sentence compression (SC) can also be seen as sentence-level automatic text summarization. The goal of SC is to compress long, verbose sentences into short, concise ones that remain grammatical while retaining the most important pieces of information. The task can be divided into two categories, extractive and generative, according to whether new words are generated during compression. The extractive setting assumes that all words in the compressed sentence come from the original sentence and that no new words are generated, whereas the generative setting allows the words of the compressed sentence to differ from those of the original sentence. This thesis adopts the extractive setting.

Previous work can be classified according to whether labelled data are needed. Among the methods that require labelled data, the best so far is the sequence-to-sequence deep neural network model. The model first encodes the source sentence with a recurrent neural network; another recurrent network then decodes the sentence word by word, producing at each step a label that indicates whether the current word is to be retained or deleted. Among the methods that do not need labelled data, the current best is an integer linear programming (ILP) model, which defines an objective function with several manually specified constraints and solves it with an ILP solver; the optimal solution of the ILP problem is the required compressed sentence. The deep neural network model trained on labelled data has two disadvantages: 1) it needs large amounts of labelled data, which are time-consuming and labour-intensive to produce; 2) it has weak domain-adaptation ability. In addition, when no labelled data are used, the ILP method has high time complexity. To address these problems, this thesis makes the following contributions.

(1) Because the sequence-to-sequence SC model adapts poorly across domains, this thesis adopts a transfer-learning framework. Three auxiliary tasks related to sentence compression are proposed and used to build neural networks that improve the sequence-to-sequence model across domains. The experimental results show that all three auxiliary tasks are helpful for domain adaptation.

(2) Since the current sequence-to-sequence SC model needs large amounts of training data and does not generalize well to out-of-domain data, this thesis proposes to combine neural networks with integer linear programming and to embed syntactic features into the word vectors. The existing SC model is a lexicalized model, whereas syntactic features generalize better. We embed part-of-speech tags and dependency types from the dependency parse tree into the word vectors, and we borrow from the ILP method the idea of finding a globally optimal solution to the problem. The experimental results indicate that this method reduces the need for training data and generalizes better to out-of-domain data.
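As a rough illustration of the syntax-augmented input representation described in contribution (2), the sketch below concatenates word, part-of-speech, and dependency-relation embeddings into a single input vector for the compression model. All layer names, vocabulary sizes, and dimensions are illustrative assumptions and are not taken from the thesis.

import torch
import torch.nn as nn

class SyntaxAugmentedEmbedding(nn.Module):
    """Concatenate word, POS-tag, and dependency-relation embeddings.

    All sizes below are illustrative placeholders, not values from the thesis.
    """
    def __init__(self, vocab_size=20000, pos_size=45, dep_size=40,
                 word_dim=100, pos_dim=20, dep_dim=20):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.pos_emb = nn.Embedding(pos_size, pos_dim)
        self.dep_emb = nn.Embedding(dep_size, dep_dim)

    def forward(self, word_ids, pos_ids, dep_ids):
        # Each input is a LongTensor of indices with shape (batch, seq_len).
        # Output shape: (batch, seq_len, word_dim + pos_dim + dep_dim).
        return torch.cat([self.word_emb(word_ids),
                          self.pos_emb(pos_ids),
                          self.dep_emb(dep_ids)], dim=-1)

The concatenated vectors can then be fed to the sequence-to-sequence labeller in place of plain word embeddings, letting the model fall back on the less domain-specific syntactic signals when lexical cues are unfamiliar.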
(3) To address the high time complexity of the integer linear programming model, this thesis proposes to solve the SC problem with deep reinforcement learning. Extractive SC can be seen as a sequential decision-making problem: at each step, a decision to delete or keep a word is made based on the current words. With no labelled data needed, we are, to our knowledge, the first to model this task with deep reinforcement learning. The experimental results show that deep reinforcement learning achieves better time performance than integer linear programming while producing comparable compression results.
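The following is a minimal sketch of how extractive compression can be cast as keep/delete decisions trained with the REINFORCE policy-gradient algorithm, assuming a simple bidirectional-LSTM policy and an externally supplied reward function (for example, a fluency score combined with a compression-rate term). The network sizes and the reward_fn interface are hypothetical and do not reproduce the thesis's actual model.

import torch
import torch.nn as nn

class KeepDeletePolicy(nn.Module):
    """Illustrative policy: for each token, decide keep (1) or delete (0)."""
    def __init__(self, emb_dim=140, hidden=128):
        super().__init__()
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True,
                               bidirectional=True)
        self.scorer = nn.Linear(2 * hidden, 2)   # logits for {delete, keep}

    def forward(self, token_vectors):
        # token_vectors: (batch, seq_len, emb_dim)
        states, _ = self.encoder(token_vectors)
        return torch.distributions.Categorical(logits=self.scorer(states))

def reinforce_step(policy, optimizer, token_vectors, reward_fn):
    """One REINFORCE update; reward_fn scores a sampled compression (hypothetical)."""
    dist = policy(token_vectors)
    actions = dist.sample()                      # (batch, seq_len) of 0/1 decisions
    reward = reward_fn(actions)                  # (batch,) scalar reward per sentence
    loss = -(dist.log_prob(actions).sum(dim=1) * reward).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

A strictly sequential policy that conditions each decision on the words already kept would follow the description above more closely; this sketch samples all per-token decisions from a single encoder pass for brevity.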
Keywords/Search Tags:automatic summarization, sentence compression, deep learning, deep reinforcement learning, transfer learning