
Research On Automatic Generation Method Of English Text Headline

Posted on: 2021-01-08  Degree: Master  Type: Thesis
Country: China  Candidate: X W Ma  Full Text: PDF
GTID: 2428330620963499  Subject: Computer application technology
Abstract/Summary:
In recent years, the rapid development of the Internet has given people access to a large amount of text data every day, and the explosive growth of information has produced massive volumes of data. Faced with this much text, how to select the required content quickly and save reading time has become an urgent issue. Summaries and headlines reflect the main content of a text and let readers filter and read effectively, so generating them automatically is of great value in coping with information overload. With the development of deep learning, generative headline models have become widely used. Generative methods rely on understanding the semantics of the text and expressing that semantic information to generate a headline. However, generating a high-quality headline remains challenging in practice, since a computer lacks the human language ability to understand an entire text and then produce a headline that reflects its core content. The sequence-to-sequence model is widely used in many natural language processing tasks, and it also provides new ideas for headline generation. A sequence-to-sequence headline generation model must encode the semantic information of the text, capture its semantic relationships, and generate a headline that matches the central content of the original text. This thesis studies methods for generating English text headlines based on the sequence-to-sequence model. The main work is as follows:

(1) Text headline generation based on sentence-level LSTM encoding. In a sequence-to-sequence headline generation model, the encoding stage represents the context and semantic information of the text. This thesis proposes a text representation method based on sentence-level LSTM encoding, which encodes all words in the text in parallel and constructs a global sentence-level state together with a sub-state for each word. Recurrent steps are used to exchange information between the local word states and the global state of the overall text. After the text is encoded into a semantic representation, a decoder based on the mixed pointer network generates the headline. Experimental results on related data sets show the effectiveness of the model in understanding text.

(2) Text headline generation based on features and a multi-head attention mechanism. In the generative model, the linguistic feature vectors of the words are combined with the original word vectors to improve the semantic relevance between the generated headline and the text. At the same time, a multi-head attention mechanism is used in the attention component to obtain features at more levels from different representation subspaces, so that the model can make full use of the context information. Finally, the multi-head attention distribution is integrated into the pointer network, which serves as the decoder that generates the headline. Experimental results show that the model improves the quality of the generated headlines.
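
Both contributions decode with a pointer network. The abstract does not give the exact formulation, so the following is only a minimal sketch, assuming the common pointer-generator mixing rule in which the final word distribution is a weighted blend of the decoder's vocabulary distribution and the attention weights over the source tokens. The function name, the `p_gen` parameter, and the toy numbers are all hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def mix_pointer_distribution(vocab_probs, attention, source_tokens, p_gen):
    """Blend a generation distribution with a copy distribution.

    vocab_probs:   dict mapping word -> probability from the decoder softmax.
    attention:     list of attention weights, one per source token.
    source_tokens: list of source words, aligned with `attention`.
    p_gen:         probability in [0, 1] of generating rather than copying.

    Assumed rule (not confirmed by the thesis):
        P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum of attention on copies of w
    """
    if len(attention) != len(source_tokens):
        raise ValueError("attention and source_tokens must have the same length")
    if not 0.0 <= p_gen <= 1.0:
        raise ValueError("p_gen must lie in [0, 1]")

    final = defaultdict(float)
    # Generation part: scale the vocabulary distribution by p_gen.
    for word, prob in vocab_probs.items():
        final[word] += p_gen * prob
    # Copy part: add attention mass to each source word, including
    # words that are missing from the fixed vocabulary.
    for word, weight in zip(source_tokens, attention):
        final[word] += (1.0 - p_gen) * weight
    return dict(final)


# Toy example: "nasdaq" is out of vocabulary but appears twice in the source.
vocab_probs = {"stocks": 0.5, "rise": 0.3, "fall": 0.2}
source_tokens = ["nasdaq", "stocks", "rise", "nasdaq"]
attention = [0.4, 0.2, 0.1, 0.3]

final = mix_pointer_distribution(vocab_probs, attention, source_tokens, p_gen=0.6)
best_word = max(final, key=final.get)
```

In this toy example the out-of-vocabulary word "nasdaq" receives non-zero probability purely through the copy term, which is the usual motivation for pairing a pointer mechanism with a generative decoder. Under the same assumption, a multi-head variant would first have to reduce the per-head attention weights to a single distribution over source positions before mixing.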
Keywords/Search Tags: Headline generation, Sequence-to-sequence model, Sentence level, Attention mechanism, Pointer network