Research And Implementation Of Key Technology Of Chinese Automatic Summarizing

Posted on:2019-03-23

Degree:Master

Type:Thesis

Country:China

Candidate:H R Zhang

Full Text:PDF

GTID:2428330566497300

Subject:Software engineering

Abstract/Summary:

With the rapid development of the Internet, a large amount of text data is generated every day. A summary captures the main content of a text, and automatic summarization provides a quick way to understand the original. Automatic summarization also has a wide range of important applications, such as result snippets in Web search engines, knowledge fusion in question answering systems, and hot-spot and topic tracking in public opinion monitoring systems. Research on automatic summarization will therefore promote the development of Natural Language Processing as a whole.

This paper studies both extractive and abstractive automatic summarization for Chinese. For extractive summarization, five common families of methods are investigated and implemented: rule-based and statistical methods, graph-based models, integer linear programming, a bag-of-word-vectors method, and machine learning methods. The focus of this paper is the graph-based approach, where several ways of improving the sentence similarity calculation are implemented; they yield a clear improvement over the traditional graph-based method. In the machine learning approach, word features, dependency syntax features, named entities, word vectors, and statistical features are fused into a rich and representative 115-dimensional feature vector space. The summarization task is treated as a regression problem, which avoids the drawbacks of framing it as a binary classification problem over sentence samples, an approach that cannot produce long summaries. A new method for calculating the regression label values is also proposed.

For abstractive summarization, this paper uses the deep learning sequence-to-sequence (Seq2Seq) model. The decoder predicts the sequence of target words from the abstract representation of the source text produced by the encoder. It is this abstract representation that makes it possible to generate a summary. Although abstractive summarization based on the deep learning model is implemented, it still has many drawbacks, such as generating repeated words. Finally, for ease of demonstration, a Django system is implemented that calls the experimental interfaces and presents the summary produced by each method.
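The graph-based extractive approach mentioned in the abstract can be illustrated with a minimal TextRank-style sketch. It is not the thesis's implementation: the abstract does not describe its improved similarity measures, so the sketch assumes the plain word-overlap similarity of the traditional graph-based method. It also assumes each sentence has already been segmented into a list of words (for Chinese this would normally come from a word segmenter). The names `overlap_similarity`, `rank_sentences`, and `summarize` are hypothetical.

```python
import math


def overlap_similarity(a, b):
    """Word-overlap similarity between two tokenized sentences.

    Assumption: this is the baseline measure, not the improved
    similarity calculation the thesis proposes.
    """
    # Sentences with fewer than two tokens have log(length) == 0,
    # so treat them as unrelated to avoid dividing by zero.
    if len(a) < 2 or len(b) < 2:
        return 0.0
    shared = len(set(a) & set(b))
    return shared / (math.log(len(a)) + math.log(len(b)))


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with weighted PageRank over the similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    # Edge weights: pairwise similarity, with no self-loops.
    weights = [
        [0.0 if i == j else overlap_similarity(sentences[i], sentences[j])
         for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Each neighbor j passes on a share of its score
            # proportional to the weight of the edge j -> i.
            incoming = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores


def summarize(sentences, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Replacing `overlap_similarity` with another sentence similarity function, for example one based on word vectors, changes only the edge weights; the ranking step stays the same.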

Keywords/Search Tags:

automatic summarization, feature vector space, Seq2Seq, regression

Related items

1	Research And Implementation Of Automatic Text Summarization Based On Seq2Seq Model
2	Research Of Automatic Summarization Oriented To News Text
3	Research On Chinese Automatic Summarization System
4	Research Of Chinese Text Automatic Summarization Based On Conceptual Vector Space Model
5	Research On Automatic Summarization System Based On RSS
6	Research And Implementation Of Multilingual Automatic Summarization System Based On Deep Learning
7	The Research And The Applications Of Automatic Summarization Technology
8	Research Of Automatic Summarization Based On Named Entity
9	Research On Automatic Generation Method Of Chinese Text Summarization
10	Research On Chinese Automatic Summarization And Its Evaluation Method