
Chinese Science And Technology Literature Automatic Abstracting System

Posted on: 2007-01-31
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Li
Full Text: PDF
GTID: 2208360185456068
Subject: Computer application technology
Abstract/Summary:
With the growth of the Internet, people need information-compression tools, such as automatic summarization systems, to refine and condense information. Such a system lets a computer process a great deal of text: it first creates summaries that reflect the main topics of the texts, and readers can then read a few abstracts and decide whether to read the full text. This is a more efficient way of obtaining information.

This dissertation explores a Chinese automatic summarization system. It first introduces the definition and significance of summary extraction, the classification and writing rules of summaries, and the background and aims of the work. It then compares and analyzes the main formal models and methods used in such systems, including statistics-based, meaning-based, concept-based, and knowledge-based approaches, and proposes a summarization method that combines statistics with some semantic relations.

The method combines statistics-based and meaning-based techniques and targets scientific and technical texts. It first segments the text into words and tags their parts of speech, then computes word frequencies. Using the semantic relations in HowNet, it computes word similarity in order to merge synonyms. It then computes word weights with the help of a stop list and a thesaurus of science and technology domain terms. Ranking the words by weight, it extracts the characteristic words that express the main content of the text. It computes the weight of each sentence from the sentence's physical information in the text and from the characteristic words the sentence contains. It selects candidate summary sentences by sentence weight and removes repetition among the candidates using the vector space model (VSM). It then improves the coherence of the summary and resolves pronouns in the sentences using some measures. Finally, it outputs the summary sentences in their original order in the text.

The dissertation closes with a discussion of the evaluation methods for the system and the experimental results. The results of the experiments show that the method we proposed is...
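The extraction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation and rests on several assumptions: the input is already segmented into words, a plain synonym dictionary stands in for HowNet similarity, the "physical information" of a sentence is reduced to a boost for the first sentence, and every name and parameter (`domain_boost`, `first_boost`, `sim_threshold`, and so on) is hypothetical. Coherence repair and pronoun resolution are left out.

```python
import math
from collections import Counter


def word_weights(sentences, stop_words, domain_terms, synonyms, domain_boost=2.0):
    """Frequency-based word weights over pre-segmented sentences."""
    counts = Counter()
    for tokens in sentences:
        for tok in tokens:
            tok = synonyms.get(tok, tok)  # merge synonyms (stand-in for HowNet)
            if tok not in stop_words:
                counts[tok] += 1
    # Assumption: domain terms simply get a constant multiplicative boost.
    return {w: c * (domain_boost if w in domain_terms else 1.0)
            for w, c in counts.items()}


def cosine(a, b):
    """Cosine similarity of two sparse term-frequency vectors (VSM)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def summarize(sentences, stop_words=frozenset(), domain_terms=frozenset(),
              synonyms=None, top_words=5, max_sentences=2,
              first_boost=1.5, sim_threshold=0.8):
    """Return indices of the selected sentences, in original text order."""
    synonyms = synonyms or {}
    weights = word_weights(sentences, stop_words, domain_terms, synonyms)
    # Characteristic words: the top_words highest-weighted words.
    ranked = sorted(weights, key=lambda w: (-weights[w], w))
    keywords = set(ranked[:top_words])

    scored = []
    for idx, tokens in enumerate(sentences):
        canon = [synonyms.get(t, t) for t in tokens if t not in stop_words]
        score = sum(weights[t] for t in canon if t in keywords)
        if idx == 0:
            score *= first_boost  # assumption: position is the only "physical" cue
        scored.append((score, idx, Counter(canon)))
    scored.sort(key=lambda s: (-s[0], s[1]))

    chosen = []
    for score, idx, vec in scored:
        if len(chosen) == max_sentences:
            break
        if score <= 0:
            continue
        # Drop a candidate that is too similar to an already chosen sentence.
        if all(cosine(vec, kept) < sim_threshold for _, kept in chosen):
            chosen.append((idx, vec))
    return sorted(idx for idx, _ in chosen)


if __name__ == "__main__":
    doc = [
        ["summary", "system", "extracts", "key", "sentences"],
        ["summary", "system", "extracts", "key", "sentences"],
        ["the", "weather", "is", "nice"],
        ["abstract", "method", "uses", "word", "weights"],
    ]
    print(summarize(doc, stop_words={"the", "is"},
                    synonyms={"abstract": "summary"}))  # prints [0, 3]
```

In the example, the second sentence duplicates the first and is rejected by the cosine check, so the sketch returns the first and fourth sentences in their original order.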
Keywords/Search Tags: summarization extraction, characteristic word, semantic relation, HowNet, VSM