
Extract Topical Keyphrases From Chinese Text Corpora

Posted on: 2018-05-11   Degree: Master   Type: Thesis
Country: China   Candidate: Y Yang   Full Text: PDF
GTID: 2348330533466151   Subject: Mathematics
Abstract/Summary:
Information has increased dramatically in the era of big data, and text is the most common type of information that people upload and download every day. One of the most important ways to grasp this information is to identify its keywords. However, keyword extraction methods often disregard two important aspects: the length of a keyword and the topic of the text. This thesis addresses both aspects and studies an algorithm for extracting topical keyphrases from Chinese text. The work is summarized as follows.

1. A new algorithm for extracting topical keyphrases from Chinese text is proposed. It has three characteristics: (1) building on the KERT model, it combines the LDA topic model with frequent phrase mining, so that the keyphrases behind each topic can be extracted and the candidate phrase set is kept small; (2) the Rank algorithm is improved so that it not only removes incomplete candidate phrases but also avoids returning a keyphrase and its sub-phrases in the same extraction result, after which candidate phrases of different lengths can be selected and ranked; (3) the proposed method extracts Chinese keywords in the form of both single words and phrases, and it applies to both short and long texts.

2. The thesis studies the proposed topical keyphrase extraction algorithm for Chinese. The findings are twofold. First, the results are easier for people to understand than those of KERT, because a keyphrase and its sub-phrases do not appear in the same extraction result. Second, the precision and F1 score of the extraction are higher than those of KERT.
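The two ideas of frequent-phrase candidates and sub-phrase removal can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the LDA topic assignment is omitted, the input is assumed to be already word-segmented Chinese text, and the function names, the `min_support` and `max_len` parameters, and the toy score (frequency times phrase length) are all hypothetical choices made for illustration.

```python
from collections import Counter


def frequent_phrases(docs, min_support=2, max_len=2):
    """Count contiguous n-grams (n <= max_len) and keep the frequent ones.

    docs is a list of token lists; segmentation is assumed to be done already.
    """
    counts = Counter()
    for tokens in docs:
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return {p: c for p, c in counts.items() if c >= min_support}


def is_subphrase(short, long):
    """True if short is a strictly shorter contiguous part of long."""
    n = len(short)
    return n < len(long) and any(
        long[i:i + n] == short for i in range(len(long) - n + 1)
    )


def drop_subphrases(ranked):
    """Walk phrases in rank order; skip any phrase that is a sub-phrase or
    super-phrase of one already kept, so both never appear together."""
    kept = []
    for p in ranked:
        if not any(is_subphrase(p, k) or is_subphrase(k, p) for k in kept):
            kept.append(p)
    return kept


# Toy corpus (hypothetical, pre-segmented).
docs = [
    ["主题", "模型", "用于", "关键", "短语", "抽取"],
    ["主题", "模型", "和", "关键", "短语"],
    ["关键", "短语", "抽取"],
]

frequent = frequent_phrases(docs)
# Hypothetical score: frequency times length; ties broken by the phrase itself.
ranked = sorted(frequent, key=lambda p: (-frequent[p] * len(p), p))
kept = drop_subphrases(ranked)
print(kept)
```

In this sketch a single word such as 关键 is dropped once the phrase 关键 短语 has been kept, which mirrors the abstract's goal of not returning a keyphrase together with its sub-phrases. A topic-aware version would run the same steps separately on the tokens assigned to each topic.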
Keywords/Search Tags: extract keyphrases, KERT algorithm, LDA topic model, frequent phrases, Rank algorithm