Automatic Micro-Blog Generation And System Implementation Towards News Document

Posted on: 2016-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: L J Zhang
Full Text: PDF
GTID: 2308330461972486
Subject: Software engineering
Abstract/Summary:
A huge amount of online news is generated every day, which makes it hard for ordinary users to browse news on a mobile phone: picking out the content they care about from so many articles is time-consuming. A compression tool is therefore needed to refine and condense this information. Text summarization not only compresses textual messages but also supports other tasks, such as text storage, information retrieval, and data mining. After studying news summarization technologies, and in order to make viewing and selecting content of interest more efficient, we propose a new concept: automatically generated news microblogs. The contributions of this work are as follows.

First, the concept of automatically generating microblogs from news is proposed. Once the theme of each article has been condensed into fewer than 140 words, users only need to read a small number of microblogs, which greatly improves the efficiency of consuming electronic text.

Second, we study a system for automatically generating microblogs from Chinese documents. We examine the features of several popular techniques: statistics-based automatic summarization, document summarization based on natural language understanding, and structure-based automatic summarization. Microblogs are then generated automatically from key phrases, with summary sentences chosen on that basis to serve as the microblog.

Third, the approach identifies important sentences to build semantic microblogs, selecting sentences that score highly and differ from one another, so that the microblog covers as much of the document's main content as possible with little redundancy. The microblogs produced by different weighting models are analyzed in depth.

Fourth, an automatic microblog generation system for news documents is built, in which two generation methods are studied. In the first, a statistical method generates key concepts that summarize the document, and sentences are then taken as the microblog according to the relationship between key concepts and sentences. In the second, latent semantic analysis is used to extract potentially important sentences as the microblog.
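The first of the two methods described in the abstract (statistical key concepts, then sentences chosen by their relation to those concepts, with high-scoring but mutually different sentences preferred) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The names generate_microblog, STOPWORDS, top_k and max_overlap are hypothetical, term frequency and Jaccard overlap are simple stand-ins for the weighting models the abstract mentions, the 140 limit is assumed here to count characters, and the tokenizer handles English only (Chinese text would need a word segmenter).

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "on", "in", "to", "of", "and", "was", "is"}


def split_sentences(text):
    """Split text into sentences at Western or Chinese end punctuation."""
    parts = re.findall(r"[^.!?。！？]+[.!?。！？]?", text)
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    """Lowercase English word tokens minus stopwords (assumption: English input)."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def jaccard(a, b):
    """Word-set overlap between two sentences, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def generate_microblog(text, top_k=5, max_overlap=0.5, max_chars=140):
    sentences = split_sentences(text)
    tokens = [tokenize(s) for s in sentences]

    # Key concepts: the top_k most frequent terms (a stand-in for real weighting).
    freq = Counter(w for t in tokens for w in t)
    key_terms = {w for w, _ in freq.most_common(top_k)}

    # Score each sentence by the frequencies of the key terms it contains.
    scored = [(sum(freq[w] for w in set(t) & key_terms), i)
              for i, t in enumerate(tokens)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))

    chosen, length = [], 0
    for score, i in scored:
        if score == 0:
            continue  # sentence mentions no key concept
        extra = len(sentences[i]) + (1 if chosen else 0)  # +1 for the joining space
        if length + extra > max_chars:
            continue  # would exceed the microblog length limit
        words = set(tokens[i])
        if any(jaccard(words, set(tokens[j])) > max_overlap for j in chosen):
            continue  # too similar to a sentence already selected
        chosen.append(i)
        length += extra

    # Emit the selected sentences in their original document order.
    return " ".join(sentences[i] for i in sorted(chosen))


sample = ("The city opened a new subway line on Monday. "
          "The new subway line opened in the city on Monday. "
          "Officials expect the subway line to carry many passengers daily. "
          "Tickets cost two yuan. The weather was sunny.")
print(generate_microblog(sample))
```

The greedy loop takes sentences in descending score order and skips any candidate that overlaps too much with one already chosen, which is one simple way to trade coverage against redundancy. The second method, based on latent semantic analysis, would instead rank sentences using a singular value decomposition of a term-by-sentence matrix.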
Keywords/Search Tags: Automatic micro-blog generation, Latent semantic analysis, Singular value decomposition, Key phrase extraction, Automatic micro-blog evaluation