
The Improvement of TextRank and Its Application in Full-Text Retrieval of Politics and Law Texts

Posted on: 2016-09-21 | Degree: Master | Type: Thesis
Country: China | Candidate: W Zhang | Full Text: PDF
GTID: 2308330464470716 | Subject: Computer technology
Abstract/Summary:
In recent years, computer science and technology have developed rapidly and come into wide use. People enjoy the speed and convenience of the Internet, but they also face the problem of finding the content they want quickly, accurately, and comprehensively in a mass of data. Keywords, as a short summary of a text, offer one solution for information management and search, and are therefore widely used in full-text retrieval.

Based on a survey of keyword extraction methods in China and abroad, this thesis analyzes two basic algorithms in depth, TF-IDF and TextRank, proposes an improved TextRank algorithm that builds on TF-IDF, and designs the procedure for extracting keywords with it. Across multiple experiments and evaluations, the improved TextRank extracted keywords more accurately and proved more widely applicable.

In the design and implementation of a Lucene-based full-text retrieval system model for politics and law texts, the improved TextRank keyword extraction method is applied in the system's keyword extraction module. On this basis, the thesis also studies how to build indexes for different data formats and how to search them. The result is a design for a full-text retrieval system model for public security texts, with the detailed workflow of each subsystem presented.
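The abstract does not say how TF-IDF is folded into TextRank, so the following is only a minimal sketch of one plausible combination, not the thesis's method. It assumes that normalized TF-IDF weights replace the uniform random-jump term of TextRank, and that the text has already been segmented into tokens. The function names (`tfidf_weights`, `cooccurrence_graph`, `tfidf_textrank`), the window size, and the damping factor are all illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def tfidf_weights(doc_tokens, corpus):
    """TF-IDF weight per word in doc_tokens, with smoothed IDF over corpus."""
    tf = Counter(doc_tokens)
    n_docs = len(corpus)
    weights = {}
    for word, count in tf.items():
        df = sum(1 for doc in corpus if word in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # smoothed, always > 0
        weights[word] = (count / len(doc_tokens)) * idf
    return weights


def cooccurrence_graph(tokens, window=3):
    """Undirected graph: edge weight counts co-occurrences within the window."""
    graph = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            other = tokens[j]
            if other != word:
                graph[word][other] += 1.0
                graph[other][word] += 1.0
    return graph


def tfidf_textrank(doc_tokens, corpus, damping=0.85, iterations=50, window=3):
    """Rank words by TextRank with a TF-IDF-biased jump term (assumed variant)."""
    if not doc_tokens:
        return []
    prior = tfidf_weights(doc_tokens, corpus)
    total = sum(prior.values())
    prior = {w: p / total for w, p in prior.items()}  # normalize to sum to 1
    graph = cooccurrence_graph(doc_tokens, window)
    scores = dict(prior)
    for _ in range(iterations):
        updated = {}
        for word in prior:
            rank = 0.0
            for neighbor, weight in graph[word].items():
                out_weight = sum(graph[neighbor].values())
                rank += weight / out_weight * scores[neighbor]
            # Standard TextRank uses a uniform jump; here it is the TF-IDF prior.
            updated[word] = (1 - damping) * prior[word] + damping * rank
        scores = updated
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    doc = "lucene index search lucene index query the the".split()
    corpus = [doc, "the cat sat".split(), "the dog ran".split()]
    for word, score in tfidf_textrank(doc, corpus):
        print(f"{word}\t{score:.4f}")
```

The top-ranked words from such a function could then be stored as a keyword field when documents are indexed, which is the role the abstract assigns to the keyword extraction module.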
Keywords/Search Tags: TextRank, Word segmentation, Keywords, Index, Full-text retrieval, Lucene