Semantic-based Automatic Abstracting System

Posted on: 2012-08-19
Degree: Master
Type: Thesis
Country: China
Candidate: J Jiang
Full Text: PDF
GTID: 2208330332986685
Subject: Software engineering
Abstract/Summary:
The volume of electronic information has grown exponentially, making research on information filtering and condensation necessary. Automatic summarization can significantly reduce the cost of processing this information and make it more convenient to use.

This thesis presents an automatic summarization method based on semantic features. First, we manually summarize the training texts, label the features, and train a Naïve Bayes classifier. In the summarizing stage, the system extracts features from each sentence of the text and selects candidate sentences by classification. Finally, it reduces redundancy among the candidate sentences to produce the final summary.

For sentence feature extraction, we introduce sentence features based on lexical chains and a sentence relation network. A lexical chain is a collection of related words that represents the structure of lexical cohesion in a document. We identify the lexical chains in the text, calculate their weights, and assign features to the corresponding sentences. The sentence relation network reflects the relationships between sentences. We construct the sentence relation network of the text and take each node's complex-network parameters as features of the corresponding sentence.

For redundancy elimination, we introduce a clustering process into summary generation to filter out redundant sentences. To improve clustering efficiency, we present a clustering algorithm based on rough classification, which introduces a classification method into the clustering process.

The thesis discusses the semantic-based automatic summarization method and presents improved algorithms for sentence feature extraction and redundancy elimination. Experiments indicate improved practical results. Finally, the thesis describes the overall design of the system and the implementation of its main modules.
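As a rough illustration of the sentence relation network and redundancy steps described in the abstract, the following Python sketch links sentences whose bag-of-words cosine similarity passes a threshold, uses node degree as one simple network parameter, and then filters near-duplicate candidates. Everything here is an assumption made for illustration: the abstract does not state the similarity measure, the thresholds, or which network parameters are used, and the greedy filter below is a simplified stand-in for the clustering-based redundancy elimination the thesis proposes, not a reproduction of it. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase word tokens; a stand-in for real preprocessing (assumption)."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two Counter bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def relation_network(sentences, threshold=0.2):
    """Adjacency sets: sentences i and j are linked if similarity >= threshold.

    The threshold value is an illustrative assumption.
    """
    vecs = [Counter(tokenize(s)) for s in sentences]
    adj = {i: set() for i in range(len(sentences))}
    for i in range(len(vecs)):
        for j in range(i + 1, len(vecs)):
            if cosine(vecs[i], vecs[j]) >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def degree_features(adj):
    """Node degree: one simple complex-network parameter per sentence."""
    return {i: len(neighbours) for i, neighbours in adj.items()}


def drop_redundant(sentences, candidates, max_sim=0.6):
    """Greedy filter (not the thesis's clustering): keep a candidate only
    if its similarity to every already-kept sentence is below max_sim."""
    kept = []
    for i in candidates:
        vi = Counter(tokenize(sentences[i]))
        if all(cosine(vi, Counter(tokenize(sentences[k]))) < max_sim for k in kept):
            kept.append(i)
    return kept
```

In a fuller pipeline, features such as the degree values would be passed, together with lexical chain weights, to the trained classifier that selects candidate sentences; the filter would then run on those candidates only.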
Keywords/Search Tags: automatic summarization, lexical chain, sentence relation network, classification, clustering