
Research On Query-directed Multi-document Summarization

Posted on: 2009-11-29
Degree: Master
Type: Thesis
Country: China
Candidate: W Shao
Full Text: PDF
GTID: 2178360245957394
Subject: Computer application technology
Abstract/Summary:
The rapid development of the Internet provides people with a huge amount of resources and also promotes information processing technology, which helps people organize, summarize, and analyze the various resources on the network more effectively. Information Retrieval and Automatic Summarization are the most crucial of these technologies. Information Retrieval is an effective way to acquire the required information, while Automatic Summarization reduces the reading burden and helps people extract the main relevant information. Summarization also supports Information Retrieval and further processing, and its concise, clear output makes it an effective means of information mining.

This thesis focuses on Query-directed Multi-document Summarization. It is a hot research topic whose goal is to produce a brief, well-organized, fluent summary of the relevant documents according to a given query, to help people judge and browse the information of interest, and to improve the efficiency of information acquisition. Building on current research, a Query-directed Multi-document Summarization System is designed and implemented that considers both the query information and the themes of the relevant document set. The main work can be summarized as follows:

1. A sentence extraction method based on feature fusion ("feature inosculated") is proposed. Because a query-directed multi-document summary should both be a "compressed version" of the document cluster and satisfy the user's need, the importance of each candidate sentence is evaluated using both its correlation with the query and its global connectivity. This keeps the summary highly relevant to the query and representative of the documents at the same time. Finally, an improved MMR (Maximal Marginal Relevance) is adopted to reduce redundancy. Random experiments show that the proposed approach performs better than a method that depends only on the importance of candidate sentences.

2. Two optimization strategies are adopted in the selection of summary sentences. To obtain the query-correlation feature, candidate sentences are represented based on concepts. The global correlation feature of each candidate sentence is then mined using a semantic graph, which has the advantage of judging this feature for each node more precisely and intuitively. Evaluation results on the DUC2005 task show the influence of tuning the weights used to fuse the two features.

3. An English Query-directed Multi-document Summarization System is implemented. To improve the validity of sentence representation, it integrates techniques such as keyword stemming, reference chain identification, and synonym merging. In the search stage, it uses a ranking approach based on density analysis. Finally, corresponding attempts are made on everything from the construction of test data to the evaluation methods, which not only verifies the feasibility of the method but also provides a good foundation for analysis.
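The scoring and selection idea described in item 1 can be illustrated with a small sketch. This is not the thesis's implementation, which the abstract does not give. It assumes plain bag-of-words cosine similarity in place of the concept-based representation, and average pairwise similarity in place of the semantic graph. The function names, the fusion weight `alpha`, and the MMR trade-off `lam` are hypothetical, and standard greedy MMR is used rather than the improved variant the thesis mentions.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return [w for w in words if w]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def fused_scores(sentences, query, alpha=0.5):
    """Blend query relevance with a simple global-connectivity score.

    alpha is a hypothetical fusion weight: 1.0 uses only query relevance,
    0.0 uses only connectivity.
    """
    vecs = [Counter(tokenize(s)) for s in sentences]
    qvec = Counter(tokenize(query))
    relevance = [cosine(v, qvec) for v in vecs]
    n = len(vecs)
    # Assumption: average similarity to all other sentences approximates
    # how well connected a sentence is within the document set.
    connectivity = [
        sum(cosine(vecs[i], vecs[j]) for j in range(n) if j != i) / (n - 1)
        if n > 1 else 0.0
        for i in range(n)
    ]
    fused = [alpha * r + (1 - alpha) * c for r, c in zip(relevance, connectivity)]
    return fused, vecs


def mmr_select(sentences, query, k=2, alpha=0.5, lam=0.5):
    """Greedy MMR: prefer high fused scores, penalize overlap with chosen sentences."""
    scores, vecs = fused_scores(sentences, query, alpha)
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def mmr(i):
            # Redundancy is the highest similarity to any already selected sentence.
            redundancy = max((cosine(vecs[i], vecs[j]) for j in selected), default=0.0)
            return lam * scores[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in selected]
```

With `lam=1.0` the redundancy penalty is switched off and selection follows the fused score alone, which is the kind of baseline the abstract compares against. Lower values of `lam` push the selection toward sentences that differ from those already chosen.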
Keywords/Search Tags:Query-directed Multi-document Summarization, Multi-document Summarization, Sentence Selection Method Based on Feature Inosculated, MMR