Research And Design Of Text Summarization System In The Network

Posted on: 2015-10-23
Degree: Master
Type: Thesis
Country: China
Candidate: R Yang
GTID: 2308330479975966
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid growth of the Internet, people depend more and more on the network for information, but quickly finding useful information among such huge resources remains a major problem. Automatic text summarization, in which a computer generates a summary of web text, can address this problem, and it has attracted the attention of scholars at home and abroad in recent years. This thesis studies the technology using data mining, machine learning and artificial intelligence techniques. The main research and work are as follows:

1) Previous optimization-based automatic text summarization algorithms mainly optimize the weights of sentence features. Unlike previous research, this thesis optimizes two combinatorial problems: the weights of the summary features and the choice of sentences. The core idea is: first, identify the features of a text summary; then, optimize the weights of the summary features using a genetic algorithm; finally, optimize the choice of sentences using a particle swarm optimization algorithm. In the experiments, this algorithm achieved precision, recall and F-value of 0.4849, 0.4843 and 0.4894 at a compression ratio of 20%, and 0.5998, 0.8556 and 0.7052 at a compression ratio of 30%. The acceptable degree reached 0.75 and 0.8 respectively, which was better than the other algorithms.

2) The author studied the syntactic features of text and found that the complex relations between sentences can be regarded as properties of a complex network. Unlike previous research, this thesis brings the idea of communities into text topic partition and proposes five different methods for extracting the text summary. In the experiments, this algorithm achieved precision, recall and F-value of 0.5032, 0.5365 and 0.5193 at a compression ratio of 20%, and 0.6503, 0.8209 and 0.727 at a compression ratio of 30%. The acceptable degree reached 0.8 and 0.85 respectively. Compared with other algorithms, this algorithm had an obvious advantage in recall, which shows that it can extract a more comprehensive summary of the text.

3) Finally, this thesis developed an automatic reply system based on the research results. The core technology of this system is the automatic text summarization algorithm: the more accurate the summarization algorithm, the more intelligent and valuable the system.
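
The community idea in item 2) can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the abstract does not specify the similarity measure, the community division method, or any of the five extraction methods. The sketch assumes bag-of-words cosine similarity, a hypothetical threshold of 0.3, connected components as a simple stand-in for community division, and "pick the best-connected sentence in each community" as the extraction rule. All function names are illustrative.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_graph(sentences, threshold=0.3):
    """Link two sentences when their similarity reaches the threshold.

    The threshold value is an assumption made for this sketch.
    """
    vectors = [Counter(tokenize(s)) for s in sentences]
    adj = {i: set() for i in range(len(sentences))}
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def communities(adj):
    """Group sentences into connected components.

    Connected components are only a stand-in here; a real community
    division algorithm would split densely linked subgraphs further.
    """
    seen, groups = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            node = stack.pop()
            group.append(node)
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        groups.append(sorted(group))
    return groups


def summarize(sentences, threshold=0.3):
    """Pick one sentence per community, keeping the original order."""
    adj = build_graph(sentences, threshold)
    picked = []
    for group in communities(adj):
        # Highest degree wins; ties go to the earliest sentence.
        picked.append(max(group, key=lambda i: (len(adj[i]), -i)))
    return [sentences[i] for i in sorted(picked)]
```

Called on a list of sentences, `summarize` returns one representative sentence per topic group, so the length of the summary follows from the number of communities found rather than from a fixed compression ratio.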
Keywords/Search Tags: Automatic text summarization, Optimization algorithm, Complex network, Segment algorithm of community division