
Pronoun Resolution Based Research On Automatic Summarization

Posted on: 2013-07-10
Degree: Master
Type: Thesis
Country: China
Candidate: F J Liu
GTID: 2248330371999439
Subject: Computer software and theory
Abstract/Summary:
With continuing advances in computer technology and the rapid spread of the Internet, the volume of information online has grown explosively, especially in recent years. The Internet offers a rich mass of information, but it also makes finding the most useful information quickly and efficiently a serious challenge.

Automatic summarization is an active research topic in Natural Language Processing (NLP). It studies how a computer can automatically extract, from natural language text, a smaller amount of information that represents the central idea. Its essence is to mine the central information and condense the central idea into a few sentences. The dissertation describes four main models and methods: automatic text summarization, structure-based automatic summarization, understanding-based automatic summarization, and information extraction.

Anaphora resolution, another branch of NLP, plays an important role in many applications, such as machine translation, automatic abstracting, automatic question answering, and multi-language processing, and it is one of the most important and difficult problems in NLP research. In recent years, researchers in anaphora resolution have focused on syntactic analysis and corpus-based methods. Among corpus-based methods, this paper describes four main approaches: rule-based methods, statistical methods, classification-based methods, and research approaches specific to Chinese.

This paper introduces the definitions and classifications of automatic summarization and anaphora resolution, then briefly reviews the state of domestic and international research in both areas. The work reported in this paper is as follows:

1. A new algorithm is proposed that integrates multiple features for anaphora resolution in Chinese. In addition, a further idea, applying S-V and V-O co-occurrence statistics within a limited window, is proposed to improve the algorithm.
2. Paragraph segmentation by topic and discourse structure analysis are integrated to improve automatic abstracting results.
3. A simple automatic abstracting system for the finance domain is developed, and the experiments show that the new algorithm improves the accuracy and comprehensibility of the abstracts to some extent.

The experiments suggest four directions in which the research can be taken further: noun phrase identification, expansion of the relationships between words and concepts, division of concept granularity, and the sliding and variable-size properties of the limited window.
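The idea of using co-occurrence statistics within a limited window to support anaphora resolution can be illustrated with a minimal sketch. This is not the thesis's algorithm: the function names `cooccurrence_counts` and `rank_antecedents`, the toy English corpus, and the use of plain token adjacency in place of parsed S-V and V-O relations are all assumptions made for illustration only.

```python
from collections import Counter


def cooccurrence_counts(sentences, window=3):
    """Count ordered (left, right) token pairs where `right` appears
    at most `window` tokens after `left` in the same sentence.

    Hypothetical helper: a stand-in for S-V / V-O statistics, which
    in a real system would come from syntactic analysis.
    """
    counts = Counter()
    for tokens in sentences:
        for i, left in enumerate(tokens):
            for right in tokens[i + 1 : i + 1 + window]:
                counts[(left, right)] += 1
    return counts


def rank_antecedents(candidates, verb, counts):
    """Order candidate antecedents by how often each one precedes
    `verb` within the window (most frequent first)."""
    return sorted(candidates, key=lambda c: counts[(c, verb)], reverse=True)


# Toy corpus (assumed data, already tokenized).
corpus = [
    ["bank", "raised", "rates"],
    ["bank", "raised", "fees"],
    ["investor", "sold", "shares"],
]

counts = cooccurrence_counts(corpus, window=2)

# For a pronoun that is the subject of "raised", prefer the candidate
# that most often co-occurs with that verb in the corpus.
ranked = rank_antecedents(["investor", "bank"], "raised", counts)
print(ranked)
```

In this sketch the window size bounds how far apart two tokens may be and still count as co-occurring. A practical system would likely combine such a score with other features (gender, number, distance) rather than rely on it alone.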
Keywords/Search Tags: anaphora resolution, automatic abstract, discourse structure, multi-factors, syntactic analysis