
Research on an Evaluation Method for Automatic Summarization Based on HowNet Text Similarity

Posted on: 2012-01-13
Degree: Master
Type: Thesis
Country: China
Candidate: J J Zhang
GTID: 2178330338993802
Subject: Computer Science and Technology
Abstract/Summary:
The diversification and growing complexity of web content make it hard for most users to find what they need. In response, a growing number of researchers have studied automatic summarization techniques and built many new summarization systems. Evaluating automatic summarization, however, is a complex problem that draws on linguistics, psychology, artificial intelligence, and other disciplines, and it remains difficult to carry out in practice. No unified evaluation standard exists so far, which makes summarization evaluation both a valuable and a challenging research problem. In particular, multi-document Chinese summarization evaluation lacks a unified large-scale test set and evaluation platform, which has severely restricted the development of Chinese summarization. The field therefore needs an accurate and effective method for evaluating the performance of summarization systems, one that can guide concrete research work.
To evaluate summaries automatically with greater accuracy and efficiency, this thesis analyzes existing automatic summarization evaluation methods in detail and points out their defects. It then presents a new evaluation method based on the vector space model. Within the vector space model, the method uses HowNet to analyze word meanings, and it improves the feature-item weighting formula by taking into account the part of speech of each word and the role it plays in the sentence.
To verify the effect of word segmentation, the thesis implements a statistical segmentation procedure. The experiment shows that good segmentation results depend on choosing a good dictionary.
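The vector space comparison described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's actual formula: the part-of-speech weights in `POS_WEIGHT`, the toy table `TOY_SIM`, and all function names are hypothetical, `word_sim` merely stands in for a HowNet-based word similarity (which the abstract does not specify), and a soft cosine measure is used here as a stand-in for however the thesis combines word meaning with the vector space model.

```python
import math
from collections import Counter

# Hypothetical part-of-speech weights; the abstract does not give actual values.
POS_WEIGHT = {"n": 1.0, "v": 0.8, "a": 0.5}
DEFAULT_POS_WEIGHT = 0.2

# Toy symmetric word-similarity table standing in for a HowNet-based measure.
TOY_SIM = {frozenset(["car", "automobile"]): 0.9}


def word_sim(a, b):
    """Return a similarity in [0, 1]; identical words score 1.0."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset([a, b]), 0.0)


def weighted_vector(tagged_words):
    """Build a term vector from (word, pos) pairs: frequency times POS weight."""
    vec = Counter()
    for word, pos in tagged_words:
        vec[word] += POS_WEIGHT.get(pos, DEFAULT_POS_WEIGHT)
    return dict(vec)


def soft_cosine(u, v):
    """Cosine similarity in which different words contribute via word_sim."""
    def dot(x, y):
        return sum(wx * wy * word_sim(a, b)
                   for a, wx in x.items() for b, wy in y.items())
    denom = math.sqrt(dot(u, u) * dot(v, v))
    return dot(u, v) / denom if denom else 0.0


# Usage: compare a candidate summary with a reference summary.
candidate = weighted_vector([("car", "n"), ("drive", "v")])
reference = weighted_vector([("automobile", "n"), ("drive", "v")])
score = soft_cosine(candidate, reference)
```

In this toy case the candidate and reference share only the word "drive", yet the score is higher than a plain cosine would give, because "car" and "automobile" are treated as near-synonyms by the stand-in similarity table.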
Following the proposed method, the thesis designs two automatic summarization systems, one based on word frequency and one that computes text similarity based on HowNet, and uses them to implement the new evaluation method. The experiment shows that the evaluation results of this method are better than those of P/R and of the method based on text similarity.
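For reference, the P/R baseline mentioned above is commonly computed over the sentences selected by a system and those in a human reference summary. The abstract does not say how the thesis computes it, so the following is only a minimal sketch; the function name `precision_recall` and the use of integer sentence IDs are assumptions made for illustration.

```python
def precision_recall(system_ids, reference_ids):
    """Return (precision, recall) for two collections of sentence IDs."""
    system, reference = set(system_ids), set(reference_ids)
    overlap = len(system & reference)
    # Guard against empty summaries to avoid division by zero.
    precision = overlap / len(system) if system else 0.0
    recall = overlap / len(reference) if reference else 0.0
    return precision, recall


# Usage: the system picked sentences 1-4, the reference contains 2, 4 and 5.
p, r = precision_recall([1, 2, 3, 4], [2, 4, 5])
```

One commonly cited limitation of this measure is that a system sentence counts as wrong whenever it is not literally in the reference, even if it expresses the same meaning, which is the kind of gap a similarity-based evaluation tries to address.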
Keywords/Search Tags: Automatic summarization, Evaluation measure, Similarity, Vector space model, Sentence weight calculation