Research On Key Techniques Of Two Phase Automatic Summarization Algorithm For Long Text

Posted on:2018-04-19

Degree:Master

Type:Thesis

Country:China

Candidate:S Wang

Full Text:PDF

GTID:2428330623450979

Subject:Management Science and Engineering

Abstract/Summary:

With the explosive growth of information on the Internet, improving the efficiency of knowledge acquisition has become increasingly important. Automatic text summarization supports fast knowledge acquisition by compressing and refining information. Text similarity calculation is a key step that determines the final quality of automatic summarization: if it can be effectively improved, the accuracy of the summarization algorithm rises, and with it the overall performance of the whole summarization system. To address the shortcomings of existing literal and semantic similarity calculation methods, this paper proposes a hybrid text similarity (HTS) calculation method that measures the similarity between texts comprehensively.

Existing automatic summarization methods exhibit poor accuracy on long text and fail to meet users' performance needs. This paper therefore proposes a two-phase automatic summarization method for long text, named EA-LTS. First, it extracts key sentences using a graph model built on the hybrid text similarity calculation. Then, it constructs a recurrent neural network encoder-decoder model with attention and pointer mechanisms to generate the summary.

Evaluation methods and summarization technology develop together, and a high-quality evaluation method provides a long-term foundation for progress in automatic summarization. An in-depth analysis of existing evaluation methods finds that neither external nor internal evaluation methods consider semantic similarity. This paper therefore proposes a new evaluation method based on hybrid text similarity to fill this gap.

The biggest bottleneck in the development of abstractive summarization is the lack of high-quality datasets. The experimental dataset for this paper was collected from real-world Chinese data by a self-designed topic crawler and contains about 0.5M articles with their corresponding titles. Experiments on this large-scale, real-world long-text corpus verify the effectiveness of EA-LTS. Compared with several popular automatic summarization methods on the ROUGE and HTS metrics, EA-LTS shows a clear improvement. Compared with the benchmark RNN method, it improves the HTS index by 25.8% (word) and 20.1% (char)...
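
The first phase described in the abstract (key-sentence extraction with a graph model over a hybrid literal-plus-semantic similarity) can be illustrated with a minimal sketch. This is not the thesis's implementation, which the abstract does not specify. Everything here is an assumption made for illustration: Jaccard token overlap stands in for the literal similarity, cosine similarity over pre-computed sentence vectors stands in for the semantic similarity, the two are mixed with a hypothetical weight `alpha`, and sentences are ranked with a PageRank-style iteration. All function names are hypothetical.

```python
import math


def literal_similarity(tokens_a, tokens_b):
    """Jaccard overlap of the two token sets (assumed literal measure)."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def semantic_similarity(vec_a, vec_b):
    """Cosine similarity of two sentence vectors (assumed semantic measure)."""
    dot = sum(x * y for x, y in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(x * x for x in vec_a))
    norm_b = math.sqrt(sum(y * y for y in vec_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_similarity(tokens_a, tokens_b, vec_a, vec_b, alpha=0.5):
    """Weighted mix of both measures; alpha is a hypothetical parameter."""
    literal = literal_similarity(tokens_a, tokens_b)
    semantic = semantic_similarity(vec_a, vec_b)
    return alpha * literal + (1.0 - alpha) * semantic


def rank_sentences(tokens, vectors, alpha=0.5, damping=0.85, iterations=50):
    """PageRank-style scores on a sentence graph weighted by hybrid similarity."""
    n = len(tokens)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sim = hybrid_similarity(tokens[i], tokens[j],
                                        vectors[i], vectors[j], alpha)
                weights[i][j] = max(0.0, sim)  # negative cosine gives no edge
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            incoming = sum(weights[j][i] / out_sum[j] * scores[j]
                           for j in range(n) if out_sum[j] > 0.0)
            updated.append((1.0 - damping) / n + damping * incoming)
        scores = updated
    return scores


def extract_key_sentences(tokens, vectors, k=2, alpha=0.5):
    """Indices of the k highest-scoring sentences, in document order."""
    scores = rank_sentences(tokens, vectors, alpha=alpha)
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


# Toy input: two related sentences and one off-topic sentence.
toy_tokens = [
    ["text", "summarization", "graph"],
    ["text", "summarization", "model"],
    ["weather", "rain"],
]
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
key_indices = extract_key_sentences(toy_tokens, toy_vectors, k=2)
```

In a two-phase setup of the kind the abstract describes, the selected sentences would then be passed to the generation model in place of the full long text, which shortens the input the encoder has to handle.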

Keywords/Search Tags:

Deep Learning, Automatic Text Summarization, Text Similarity Calculation, Recurrent Neural Network, Graph Model, Sequence to Sequence Model, Attention Mechanism, Pointer Mechanism

Related items

1	Abstractive Document Summarization Based On Deep Sequence To Sequence Model
2	Research On Abstract Text Summarization Based On Sequence To Sequence Model
3	Research On Automatic Text Summarization Technology Based On Deep Learning
4	Research On Text Summarization Technology Based On Deep Learning
5	Research On Text Summarization Method Based On LSTM Sequence To Sequence Model
6	Research On Chinese Abstractive Text Summarization Based On Sequence To Sequence Model
7	Research On Automatic Summarization Based On Pointer Generator Networks Model
8	Research Of Model For Abstractive Summarization Based On Deep Learning
9	Improved Seq2Seq Text Summarization Generation Methods
10	Research On Automatic Text Summarization Generation Technology Based On Deep Learning