| Extractive summarization, a key technology for coping with information overload, avoids the high cost of manual summarization and is widely used in applications such as search engines and intelligent writing. With the development of deep learning, extractive text summarization has achieved better results, and extractive models based on graph models have become particularly popular. However, current graph models still leave several problems open: how to use the graph model to update deep semantic representations between nodes; how to construct the graph so that less information is mistransmitted during graph updates; and how to integrate richer semantics into the sentence vector representation. To address these problems, this thesis proposes two improved models based on the HSG model. The main research work is as follows: (1) The graph attention network used in the graph update layer of current graph models has insufficient ability to classify nodes and to fuse node semantics. To address this, an extractive text summarization model based on HGT is proposed. In the graph update layer, HGT replaces the graph attention network for semantic updates between nodes, integrating deeper semantic information about the nodes. In addition, training revealed that the model included no position information, so a trainable position encoding is added to the graph update layer to speed up convergence and further improve performance. (2) To address information mistransmission during graph updates and to enrich the semantic representation of sentence vectors, this thesis proposes an extractive text summarization model based on part of speech. In the embedding stage, part-of-speech tagging information for words is integrated into the sentence to enrich the sentence representation; in the graph update layer, node representations are updated according to the part-of-speech tags and the edges established by common words. To alleviate the information mistransmission problem, BiLSTM is used to integrate semantics between sentences. Experiments show that both improved models achieve gains on the three evaluation metrics for extractive text summarization and that they also show better generality. |
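
The idea of establishing edges from common words while taking part-of-speech tags into account can be illustrated with a small sketch. This is not the thesis's implementation: it simplifies the heterogeneous word and sentence graph to direct sentence-to-sentence edges, and the function name `build_sentence_graph`, the tag set `CONTENT_TAGS`, and the rule that only content-word tags may create edges are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumption: only these POS tags are allowed to create edges.
# The abstract does not state which tags the thesis actually uses.
CONTENT_TAGS = {"NOUN", "VERB", "ADJ"}


def build_sentence_graph(tagged_sentences, keep_tags=CONTENT_TAGS):
    """Connect sentences that share a word whose POS tag is in keep_tags.

    tagged_sentences: list of sentences, each a list of (word, tag) pairs.
    Returns a dict mapping (i, j) with i < j to the set of shared words.
    """
    # Map each kept word to the indices of the sentences that contain it.
    word_to_sents = defaultdict(set)
    for idx, sentence in enumerate(tagged_sentences):
        for word, tag in sentence:
            if tag in keep_tags:
                word_to_sents[word.lower()].add(idx)

    # Every pair of sentences sharing a kept word gets an edge.
    edges = defaultdict(set)
    for word, sent_ids in word_to_sents.items():
        ordered = sorted(sent_ids)
        for a in range(len(ordered)):
            for b in range(a + 1, len(ordered)):
                edges[(ordered[a], ordered[b])].add(word)
    return dict(edges)


# Toy document with hand-written (hypothetical) POS tags.
doc = [
    [("The", "DET"), ("model", "NOUN"), ("updates", "VERB"), ("nodes", "NOUN")],
    [("The", "DET"), ("graph", "NOUN"), ("has", "VERB"), ("nodes", "NOUN")],
    [("The", "DET"), ("results", "NOUN"), ("improve", "VERB")],
]
graph = build_sentence_graph(doc)
print(graph)
```

In this toy document, all three sentences share the determiner "The", so a graph built from every common word would connect all of them. Restricting edges to content-word tags leaves a single edge between the first two sentences, through "nodes", which is one plausible way that part-of-speech information could reduce spurious message passing during graph updates.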