Research On Simultaneous Text Summarization And Keyword Extraction Based On Hypergraph

Posted on: 2017-09-18
Degree: Master
Type: Thesis
Country: China
Candidate: P Mo
GTID: 2348330488485675
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, online information has grown explosively. How to obtain this information efficiently and effectively has become an important research problem. Two main technologies address this task: text summarization and keyword extraction. Supported by advances in information retrieval and natural language processing, both have become research hotspots in recent years.

Text summarization and keyword extraction are two important research topics in Natural Language Processing (NLP), and both generate concise information that describes the gist of a text. Although the two tasks have similar objectives, they are usually studied independently, and their association is seldom considered. Following the graph-based ranking methodology, some collaborative extraction methods have been proposed; these consider the relationships among sentences, among words, and between sentences and words, and generate both a text summary and keywords in an iterative reinforcement framework. However, most existing models can express only binary relations between sentences and words, and so ignore a number of potentially important high-order relationships among different text units. For this reason, we propose a new collaborative extraction method based on a hypergraph. In this method, sentences are modeled as hyperedges and words as vertices to build a hypergraph, and the summary and keywords are then generated by exploiting the higher-order information from sentences and words under the unified hypergraph. Experiments conducted on the Weibo-oriented Chinese news summarization task in NLPCC 2015 demonstrate that the proposed method is feasible and effective.

Based on the proposed hypergraph-based collaborative extraction method, we implemented an automatic system for collaborative summarization and keyword extraction oriented toward online news. First, the system captures hot news from the top charts of the Sina news center in real time. It then generates a summary and keywords for each news item and displays them to users in a concise form. By browsing the title, keywords, and brief summary, users can quickly obtain the main information of each news item.
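
The hypergraph construction described above (sentences as hyperedges, words as vertices) can be illustrated with a minimal sketch. The abstract does not give the thesis's actual ranking formula, so the scoring below is an assumption: a plain mutual-reinforcement iteration in which a sentence is scored by the words it contains and a word by the sentences that contain it. The function names `build_hypergraph` and `co_rank` are hypothetical, and whitespace tokenization stands in for the Chinese word segmentation a real system would need.

```python
def build_hypergraph(sentences):
    """Return one hyperedge per sentence: the set of words (vertices) it contains.

    Assumption: lowercase whitespace tokenization, for illustration only.
    """
    return [set(s.lower().split()) for s in sentences]


def co_rank(hyperedges, iterations=50):
    """Score sentences and words jointly by mutual reinforcement.

    Assumption: this simple update rule is illustrative and is not
    taken from the thesis.
    """
    vertices = sorted(set().union(*hyperedges))
    word_score = {w: 1.0 for w in vertices}
    sent_score = [1.0] * len(hyperedges)

    for _ in range(iterations):
        # A sentence (hyperedge) collects the scores of all words it joins.
        raw_sent = [sum(word_score[w] for w in edge) for edge in hyperedges]
        total = sum(raw_sent) or 1.0
        sent_score = [s / total for s in raw_sent]

        # A word (vertex) collects the scores of all sentences containing it.
        raw_word = {w: 0.0 for w in vertices}
        for edge, s in zip(hyperedges, sent_score):
            for w in edge:
                raw_word[w] += s
        total = sum(raw_word.values()) or 1.0
        word_score = {w: v / total for w, v in raw_word.items()}

    return sent_score, word_score


if __name__ == "__main__":
    docs = [
        "hypergraph ranks sentences and words",
        "hypergraph models words",
        "weather today",
    ]
    s_scores, w_scores = co_rank(build_hypergraph(docs))
    best = max(range(len(docs)), key=lambda i: s_scores[i])
    print("summary sentence:", docs[best])
    print("keywords:", sorted(w_scores, key=w_scores.get, reverse=True)[:2])
```

Because each hyperedge links all the words of a sentence at once, a word shared by several high-scoring sentences is reinforced through every one of them, which is the kind of higher-order information a pairwise word graph would have to approximate with many separate edges.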
Keywords/Search Tags: hypergraph model, document summarization, keyword extraction, collaborative extraction