| With the development of Internet technology, a growing number of social media platforms have emerged, giving people new ways to obtain information and becoming the main source of public information in people’s daily life. Compared with traditional news media, social media networks such as Weibo and Toutiao offer better interactivity and timeliness and richer forms of news presentation, which helps people obtain relevant information as soon as news breaks. However, not all news on social media is essential, and much of it contains redundant content, so readers may need to spend more time finding the information they need. Against this background, this paper studies automatic text summarization for Chinese news, aiming to summarize a target news text so that the public can quickly grasp its key content and read news more efficiently. First, this paper proposes an extractive summarization model for Chinese news. The model uses BERT together with external features such as word frequency and position to obtain word embeddings that match the characteristics of news text, feeds them into a Bi-GRU model for training, and selects candidate summary sentences that contain information related to the news topic. The experimental results show that the trained extractive model effectively captures the key information in the news while keeping the summaries coherent and readable. Second, this paper proposes an abstractive summarization model for Chinese news. The model uses BERT to obtain word embedding representations and applies a multi-head attention mechanism in the decoding stage of a pointer-generator network, making full use of the information learned in different semantic subspaces. The model also incorporates a coverage mechanism so that fewer repetitive text fragments are generated during decoding. The experimental results show that the proposed abstractive model improves the quality of the generated summaries and effectively addresses the problem of redundant content. Finally, this paper constructs a hybrid summarization model for Chinese news. The extractive sub-module of the model combines RoBERTa with multi-dimensional external features to obtain a vectorized representation of the news text, which is passed to a Bi-GRU and a classification layer to select candidate summary sentences. The abstractive sub-module combines UniLM with the coverage mechanism to generate summaries from the candidate summary sentences. The two sub-modules are trained jointly using the AELoss loss function proposed in this paper. Experiments show that the proposed model achieves state-of-the-art results on two public Chinese news summarization datasets. Compared with extractive or abstractive summarization models used alone, and with other hybrid summarization models, the proposed model significantly improves the automatic evaluation metrics, and the generated summaries are closer to the reference summaries. |