With the rapid development of the Internet, life has become information-based, but this also brings many problems. Information overload is one of the core problems to be solved, and it is necessary to reduce the "dimension" of the massive amount of information on the Internet. Text summarization is one of the key technologies for addressing this problem, and the development of deep learning has driven the progress of neural text summarization. However, existing research shows that traditional neural networks cannot effectively encode long text sequences because of the information loss caused by long-range dependencies. The current mainstream summarization methods are based on sequence-to-sequence neural models, which exploit the long-distance dependency modeling of recurrent neural networks to encode the contextual semantics of the input text. Such context encodings generally contain only the serialized information of the text: they do not reflect its structural information, and the encoder cannot fully learn the contextual relations. To make full use of the contextual structure information of text, this paper conducts a series of studies on it, aiming to generate accurate and fluent summaries by combining the model with textual structure information. Building on the sequence-to-sequence summarization model, this paper makes a series of improvements and, according to the different categories of structural information, proposes two improved sequence-to-sequence summarization models.

First, this paper proposes an attention-based summarization model that integrates the semantic structure of the text. By introducing syntactic structure information, the context vector obtained from the attention layer contains both the text's semantic information and its syntactic structure
information. The context vector fused with syntactic structure information is provided to the decoder to generate the summary. Experimental results on the ROUGE metric show that the performance of the improved sequence-to-sequence summarization model improves to a certain extent.

Second, this paper encodes text-level information: it uses the self-attention mechanism to capture long-range dependencies between words, integrates the global structure information of the text into the contextual semantic encoding of the input, and uses the inference ability of a variational autoencoder to learn the latent style structure of the text. The resulting context information includes both global structure information and context-dependent information, and the decoder generates summaries with a consistent structure according to the latent style. The experimental results show a significant improvement on the ROUGE metrics, indicating that the summarization model based on text structure and latent structure can effectively learn the global structure information and latent style of the text, and plays a clear role in improving the accuracy of text summarization.
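The first model's core idea can be illustrated with a minimal sketch: the same attention weights computed over the encoder states are also applied to per-token syntactic features, and the two weighted sums are concatenated into one context vector. This is an illustrative simplification, not the thesis's exact architecture; the input `syn_feats` (embedded syntactic features such as dependency relations) is a hypothetical name.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def structure_aware_context(enc_states, dec_state, syn_feats):
    """Fuse semantic and syntactic information into one context vector.

    enc_states: (T, d) encoder hidden states for T input tokens
    dec_state:  (d,)   current decoder hidden state
    syn_feats:  (T, k) per-token syntactic features (e.g. embedded
                dependency relations) -- a hypothetical input here
    """
    scores = enc_states @ dec_state        # dot-product attention scores
    weights = softmax(scores)              # attention distribution (sums to 1)
    semantic_ctx = weights @ enc_states    # (d,) usual semantic context
    syntactic_ctx = weights @ syn_feats    # (k,) structure context, same weights
    return np.concatenate([semantic_ctx, syntactic_ctx])

rng = np.random.default_rng(0)
ctx = structure_aware_context(rng.normal(size=(5, 8)),   # 5 tokens, dim-8 states
                              rng.normal(size=8),
                              rng.normal(size=(5, 3)))   # dim-3 syntactic features
print(ctx.shape)  # (11,)
```

Because the syntactic features are pooled with the same attention weights as the semantic states, the decoder sees the structure of exactly the tokens it is currently attending to.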
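The two ingredients of the second model can likewise be sketched in a few lines, assuming single-head scaled dot-product self-attention and the standard VAE reparameterization trick; the names and dimensions below are illustrative, not taken from the thesis.

```python
import numpy as np

def self_attention(X):
    """Single-head scaled dot-product self-attention with Q = K = V = X.
    Every token attends directly to every other token regardless of
    distance, which is how long-range dependencies are captured."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)   # row-wise softmax
    return w @ X                         # (T, d) structure-aware encodings

def sample_latent_style(mu, logvar, rng):
    """VAE reparameterization trick: draw a latent style vector
    z ~ N(mu, exp(logvar)) as a differentiable function of (mu, logvar)."""
    return mu + np.exp(0.5 * logvar) * rng.normal(size=mu.shape)

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 4))              # 6 tokens, dim-4 encodings
H = self_attention(X)                    # global structure information
z = sample_latent_style(np.zeros(3), np.zeros(3), rng)  # latent style vector
print(H.shape, z.shape)  # (6, 4) (3,)
```

In the full model the self-attended encodings `H` would be combined with the recurrent context, and the latent vector `z` would condition the decoder so that generated summaries share a consistent style.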