
Research On Sentence Compression And Its Application

Posted on: 2014-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Zhang
Full Text: PDF
GTID: 2248330398465369
Subject: Computer software and theory
Abstract/Summary:
With the rapid growth of available information, people's demand for information has driven the development of Natural Language Processing (NLP). As a consequence, sentence compression, an important component of automatic summarization, is drawing increasing attention. Sentence compression has been widely used in automatic title generation, search engines, topic detection, and summarization.

Currently, the mainstream methods for sentence compression are based on supervised learning. Following this line of work, this paper explores a constituent syntactic tree method based on a discriminative model. The main contributions are summarized as follows:

1. Research on sentence compression via structured learning. First, we build a Chinese parallel corpus using a matching extraction model, and then we propose a corpus expansion method. Finally, we build a sentence compression system based on structured learning. Experiments show that the structured learning method achieves a good compression ratio while retaining the main information of the source sentence. They also show that the proposed corpus expansion method is effective.
2. Research on the decoding algorithm. Within the framework of the discriminative model, this paper presents a decoding method based on Integer Linear Programming (ILP), which treats sentence compression as the selection of the optimal compressed target sentence. The ILP-based system maintains a good compression ratio while retaining the main information of the source sentence.
3. Research on evaluation metrics for sentence compression. Because unified evaluation metrics are lacking, this paper proposes two automatic evaluation metrics (BLEU and N-gram) within the framework of sentence compression by deleting words and constituents.
4. Research on the application of the sentence compression system. This paper applies the sentence compression system to the multi-document summarization task. Experimental results show that the compression system can delete some of the less valuable information while preserving grammaticality.
Keywords/Search Tags:Sentence Compression, Structured Learning, Integer Linear Programming, Multi-document Summarization, Natural Language Processing
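
The abstract mentions compression ratio and n-gram based automatic evaluation for deletion-based compression but does not give the formulas. The following is a minimal illustrative sketch in Python, assuming a compression ratio defined as kept tokens over source tokens and a clipped n-gram precision of the kind used inside BLEU. The function names and the example sentences are hypothetical and are not taken from the thesis, whose exact metric definitions may differ.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def compression_ratio(source, compressed):
    """Fraction of source tokens kept in the compressed sentence (assumed definition)."""
    return len(compressed) / len(source)


def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against a single reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


# Hypothetical example: source sentence, system output, and human compression.
source = "the committee finally approved the new budget on friday".split()
system = "the committee approved the budget".split()
gold = "the committee approved the new budget".split()

print(compression_ratio(source, system))
print(ngram_precision(system, gold, 1))
print(ngram_precision(system, gold, 2))
```

In this sketch a lower compression ratio means more aggressive deletion, while the n-gram precision rewards outputs whose word sequences also appear in the human compression, so the two numbers are meant to be read together.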