Research On Key Techniques Of Query-focused Multi-document Summarization

Posted on:2009-09-23

Degree:Doctor

Type:Dissertation

Country:China

Candidate:L Zhao

GTID:1118360272958840

Subject:Computer application technology

Abstract/Summary:

With the rapid development of the Internet and the growing volume of text, the need to search large text collections for useful information has made automatic summarization increasingly important. Automatic summarization condenses one or more documents into a short overview of their content, saving users considerable time when browsing. The task touches many areas of natural language processing and remains a major challenge for computers. This thesis describes our research on automatic summarization, focusing on query-focused multi-document summarization and the automatic evaluation of summary coherence. We have built several summarization systems in the course of participating in the DUC evaluations in recent years.

We use a CME model for the machine-learning-based automatic summarizer. To capture the semantic relatedness between sentences and queries, we propose a semantic extension method and apply it to the summarization system. In this method, sentence vectors are semantically extended using the synsets and word relations defined in WordNet. This brings semantic information into the sentence representation and clearly improves the performance of the summarization system.

We also propose a query expansion method based on a graph-based ranking algorithm, which is integrated into the query-focused summarization system to address the paucity of information in the original query. The method uses context information to expand the query, obtaining more relevant information with less noise. The system with query expansion performs significantly better than the same system without it, and achieves state-of-the-art performance on the DUC evaluation data.

Another important problem is summary evaluation. Currently, linguistic quality is assessed manually, which is time-consuming, so developing automatic methods is important. We have studied the entity-based coherence model and improved it in two respects: feature calculation and entity selection. Both improvements yield higher accuracy than the base model in our experiments.
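The abstract does not say which graph-based ranking algorithm is used for query expansion, so the following is only an illustrative sketch of the general idea, not the thesis's method. It assumes a word co-occurrence graph built from tokenized context sentences and a personalized PageRank that restarts at the query terms. The function names (`build_cooccurrence_graph`, `personalized_pagerank`, `expand_query`) and the parameters `damping`, `iterations`, and `k` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(sentences):
    """Build an undirected word graph (assumed design, not from the thesis).

    Nodes are words; an edge weight counts the sentences in which
    both words occur together.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for sentence in sentences:
        for a, b in combinations(sorted(set(sentence)), 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Rank words by a random walk that restarts at the seed (query) words."""
    nodes = list(graph)
    seeds_in_graph = [w for w in seeds if w in graph]
    if not seeds_in_graph:
        return {}
    restart = {
        n: (1.0 / len(seeds_in_graph) if n in seeds_in_graph else 0.0)
        for n in nodes
    }
    score = dict(restart)
    out_weight = {n: sum(graph[n].values()) for n in nodes}
    for _ in range(iterations):
        new_score = {}
        for n in nodes:
            # The graph is symmetric, so neighbours of n are also its in-links.
            incoming = sum(
                score[m] * w / out_weight[m] for m, w in graph[n].items()
            )
            new_score[n] = (1 - damping) * restart[n] + damping * incoming
        score = new_score
    return score


def expand_query(query, sentences, k=3):
    """Return up to k non-query words most strongly tied to the query."""
    scores = personalized_pagerank(build_cooccurrence_graph(sentences), query)
    candidates = [
        (w, s) for w, s in scores.items() if w not in query and s > 0
    ]
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [w for w, _ in candidates[:k]]
```

In this sketch, words that share context with the query terms receive ranking mass, while words in unrelated sentences receive none. This is one way to read "more relevant information with less noise", under the assumptions stated above.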

Keywords/Search Tags:

automatic summarization, natural language processing, machine learning, summary evaluation, text coherence

Related items

1	Research On Automatic Text Summarization Algorithm For Chinese And English Long Text
2	Automatic Summarization System Based On Natural Language Processing
3	Automatic Summarization Of Multimedia Information And Related Technology Research
4	Research On Automatic Text Summarization Generation Technology Based On Deep Learning
5	Research And Application On Automatic Summary Algorithm Based On Multiple Models
6	Research On Automatic Text Summarization Based On Deep Neural Networks
7	Research And Application Of Text Summarization Model Based On Deep Learning
8	A Transferable Approach To Generating Abstractive Text Summary Based On Pre-trained Language Model
9	Design And Implementation Of A Cross-lingual Text Summary System Based On Deep Learning
10	Research On Automatic Text Summarization Based On Self-Attention Mechanism