Storyline Extraction Based On Deep Learning For News Articles

Posted on: 2020-03-06
Degree: Master
Type: Thesis
Country: China
Candidate: L S Guo
GTID: 2428330623459858
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet, a large amount of information has been generated online, and news media sites continuously produce and propagate reports on the events that happen each day. Faced with such a huge volume of information, the public finds it difficult to locate what it wants. Storyline extraction aims to automatically extract and track hot events from massive news texts and to reveal, in a structured representation, how those events evolve over time. It lets readers grasp the skeleton of how current hot events develop when facing a large number of news articles, so it has significant practical and application value.

Many methods for extracting storylines from news articles have been proposed. Most existing approaches are unsupervised Bayesian probabilistic graphical models. Unsupervised methods learn in a way that is closer to human learning, which makes them more stable and versatile, and they require no annotated data, so they have attracted wide attention from researchers. However, graphical models usually have complex structures and high time complexity, which makes them hard to apply in real-world applications. Compared with these traditional methods, deep learning methods can automatically learn implicit semantic information from massive text and can mine deep semantic features, and they have achieved remarkable performance in many natural language processing tasks. This thesis therefore studies unsupervised storyline extraction methods based on deep learning for news articles. We aim to combine the advantages of unsupervised learning and deep learning in a unified framework and to mine the deep semantic features of text without annotated data. The main contributions of this thesis are as follows.

(1) To address the complex derivation and long convergence time of existing storyline extraction methods, we propose a neural network based storyline extraction model (NSEM). The design of the model rests on two proposed similarity hypotheses about the main body and the title of a news article, and the model parameters are optimized with a pairwise ranking loss. Event extraction and storyline construction are incorporated into a unified framework, and the neural network based model can take advantage of the rich semantic information in text. We evaluated our method on three news datasets, and the experimental results show that it outperforms the baseline algorithms in precision, recall and F-measure on the three datasets.

(2) Because NSEM can only cluster news texts into storylines and cannot extract event representations, we propose a deep embedded storyline extraction model (DESEM). DESEM first uses a stacked denoising autoencoder to learn an initial event representation. A clustering loss is then used to optimize the model parameters on the daily data, which fine-tunes the event representation. In addition, a fusion layer performs event learning and storyline construction simultaneously. We evaluated our method on three news datasets, and the experimental results show that it outperforms the baseline algorithms, including NSEM, in precision, recall and F-measure on the three datasets. Our approach can also extract the implicit features of a news article, which can be used for visualization and downstream applications.
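The abstract names a pairwise ranking loss but does not give its form or state the two similarity hypotheses. As a rough illustration only, the sketch below assumes a common hinge-style ranking loss over cosine similarities, and assumes one plausible hypothesis: a title should be more similar to its own article body than to the body of an unrelated article. The function names, the margin value and the toy vectors are all hypothetical and are not taken from the thesis.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pairwise_ranking_loss(anchor, positive, negative, margin=0.5):
    """Hinge-style ranking loss (an assumed form, not the thesis's exact loss).

    The loss is zero once the anchor is more similar to the positive
    than to the negative by at least `margin`.
    """
    return max(0.0, margin - cosine(anchor, positive) + cosine(anchor, negative))


# Hypothetical toy embeddings: a title, its own body, and an unrelated body.
title = [1.0, 0.0]
own_body = [1.0, 0.0]
other_body = [0.0, 1.0]

# Correct ordering: the title matches its own body, so no penalty is incurred.
loss_good = pairwise_ranking_loss(title, own_body, other_body)

# Swapped ordering: the unrelated body is treated as the positive, so the loss is large.
loss_bad = pairwise_ranking_loss(title, other_body, own_body)

print(loss_good, loss_bad)
```

In a trained model the vectors would come from the network's encoders rather than being fixed by hand, and the loss would be summed over many sampled pairs and minimized by gradient descent.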
Keywords/Search Tags:Storyline Extraction, Deep Learning, Event Representation Learning, Clustering Loss