A Method Of Patent Retrieval Based On Automatic Query Expansion

Posted on: 2014-01-04
Degree: Master
Type: Thesis
Country: China
Candidate: S Yang
Full Text: PDF
GTID: 2248330395489266
Subject: Computer application technology
Abstract/Summary:
Information retrieval has become increasingly important in recent years. As a record of technological achievement, patent application documents cover innovations in almost every field. The first step in making use of this massive body of patent information is to search the patent application documents.

Based on a survey of existing methods for patent retrieval and query expansion, this thesis analyzes the features of patent literature and patent retrieval, and proposes a method of patent retrieval based on automatic query expansion (AQE).

The method first pre-processes the patent documents and builds domain vocabularies using an improved version of the original TF-IDF formula. Shallow parsing is then applied to the original query to extract its keywords. From the keywords and the domain vocabularies, the method determines the domains of the query and the difficulty of the subsequent query expansion. Next, using pseudo relevance feedback (PRF), it studies the difference in term distribution between the pseudo-relevant documents and the whole collection to generate a set of candidate expansion terms, computes the similarity between the original query and each candidate to rank the expansion terms, and reformulates the query from the original query and the selected expansion terms. Finally, the reformulated query is submitted to the retrieval system to obtain the expanded results.

Experiments were designed and carried out on the NTCIR-6 test collection to compare the effectiveness of this method with that of several existing methods. The experiments demonstrate the feasibility of searching patent documents with this method, and the comparative results show that it achieves better recall and average precision. The AQE-based approach is therefore an effective method for patent retrieval.
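The PRF step described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not give the scoring formulas, so the example assumes a KL-divergence-style score for the term distribution difference and a simple TF-IDF overlap as a stand-in for the retrieval system. The function names (`tokenize`, `rank_documents`, `expand_query`), the parameters `top_k` and `n_terms`, and the toy documents are all hypothetical. The domain vocabulary, domain matching, and query-to-term similarity steps are omitted.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenizer; real patent text would need proper pre-processing.
    return text.lower().split()


def rank_documents(query_terms, docs):
    # Stand-in retrieval system (assumption): score each tokenized document by
    # summed TF * smoothed IDF over the query terms, best first.
    n = len(docs)
    df = Counter()
    for d in docs:
        df.update(set(d))
    scored = []
    for i, d in enumerate(docs):
        tf = Counter(d)
        score = sum(tf[t] * math.log((n + 1) / (df[t] + 1))
                    for t in query_terms if t in tf)
        scored.append((score, i))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored]


def expand_query(query, raw_docs, top_k=2, n_terms=2):
    # Pseudo relevance feedback: treat the top_k ranked documents as relevant.
    docs = [tokenize(d) for d in raw_docs]
    q = tokenize(query)
    feedback = [docs[i] for i in rank_documents(q, docs)[:top_k]]

    fb_counts = Counter(t for d in feedback for t in d)
    coll_counts = Counter(t for d in docs for t in d)
    fb_total = sum(fb_counts.values())
    coll_total = sum(coll_counts.values())

    # Assumed scoring: per-term KL contribution p_fb * log(p_fb / p_coll),
    # which favours terms over-represented in the feedback documents.
    scores = {}
    for term, count in fb_counts.items():
        if term in q:
            continue
        p_fb = count / fb_total
        p_coll = coll_counts[term] / coll_total
        scores[term] = p_fb * math.log(p_fb / p_coll)

    best = sorted(scores, key=lambda t: (-scores[t], t))[:n_terms]
    return q + best  # reformulated query: original terms plus expansion terms


if __name__ == "__main__":
    toy_docs = [
        "battery electrode lithium cell electrode",
        "lithium battery anode electrode coating",
        "engine piston cylinder valve",
        "valve engine fuel injection",
    ]
    print(expand_query("lithium battery", toy_docs))
```

In this sketch the expansion terms are simply appended to the original query. A fuller version would also weight them, and would filter the candidates by their similarity to the original query as the abstract describes.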
Keywords/Search Tags: Patent retrieval, Domain vocabulary, Domain matching, Query expansion, PRF, Term distribution difference