Research On Query Expansion For Patent Retrieval

Posted on: 2018-02-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: K Xu
GTID: 1318330518971777
Subject: Computer application technology
Abstract/Summary:
In recent years, intellectual property, and the patent system in particular, has received extensive attention in research and discussion. As the number of patents grows rapidly, it becomes increasingly difficult for researchers to find all the patents they need, which has made patent retrieval a prominent topic in information retrieval. Unlike retrieval in a general domain, patent retrieval seeks not only to retrieve the most relevant patents but also to incorporate patent-specific characteristics into the ranking model so that it better satisfies the information need. One effective way to incorporate these characteristics is query expansion, a classic information retrieval technique that enriches the user's query with useful expansion terms and thereby improves retrieval performance. This study analyzes query expansion methods for patent retrieval in three parts:

1. Patent query expansion based on multiple text fields. Query expansion techniques are widely used in many information retrieval tasks. Unlike a general article, a patent document is divided into fields that describe the patent from different aspects, and the same word may differ in importance for retrieval depending on the field in which it appears. These fields can therefore be used to weight expansion terms more accurately. In this work, we explore the potential of text fields for extracting more effective expansion terms. In particular, we propose a two-stage ranking approach for query expansion based on document fields, and we explore how to weight the fields by importance to improve the ranking of candidate expansion terms. We also apply word embeddings to select expansion terms and propose four methods for computing the similarity between original queries and expansion terms, in order to improve the performance of query expansion for patent retrieval.

2. Patent query expansion based on multiple information resources. Most query expansion methods select expansion terms from a single source, the relevance feedback documents. Patent text is too specialized for expansion terms to be selected effectively from relevance feedback documents alone, so we propose a method that exploits external resources to improve patent retrieval. We first use semantic dictionaries to compute the similarity between query terms and expansion terms, and use this similarity to modify the expansion term selection method. We then use the Google search engine and the Derwent World Patents Index as external resources for extracting expansion terms, improving the precision and recall of query expansion in patent retrieval.

3. Patent query expansion based on learning to rank. A learning to rank approach can accommodate many basic query expansion methods as features, which can improve ranking performance. We propose a learning to rank based approach that improves query expansion for patent retrieval by optimizing the combination of a set of query expansion algorithms, combining different methods with different text field weighting strategies. Unlike general learning to rank methods, which use different ranking methods as ranking features, we take the ranking methods into account and also use many query expansion methods to generate ranking features. Experimental results show that patent retrieval performance improves when learning to rank is used for query expansion.

Across these three parts, we improve patent retrieval performance with query expansion techniques from different aspects. The results can be used to build more effective patent retrieval systems and to make relevant patents easier to access for researchers from different domains, helping them follow the research progress of related fields. Based on this research, we implement a prototype patent search system that can use the experimental data sets and the Derwent World Patents Index. The proposed retrieval methods are embedded in the system, which provides a patent retrieval service whose performance is examined in practical application.
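
As a rough illustration of the field-based idea in part 1, the following Python sketch scores candidate expansion terms from feedback documents by a field-weighted, length-normalized term frequency. It is not the dissertation's method: the field names, the weight values in FIELD_WEIGHTS, and the functions score_expansion_terms and expand_query are all assumptions made for this example, and the abstract does not specify the actual two-stage ranking or weighting scheme.

```python
from collections import Counter

# Hypothetical field weights (assumption): the abstract does not state
# which fields are used or how they are weighted.
FIELD_WEIGHTS = {"title": 3.0, "abstract": 2.0, "claims": 1.5, "description": 1.0}


def score_expansion_terms(feedback_docs, query_terms, field_weights=FIELD_WEIGHTS):
    """Score candidate terms by field-weighted, length-normalized frequency.

    feedback_docs is a list of dicts mapping a field name to its text.
    Terms already in the query are skipped.
    """
    scores = Counter()
    query = {t.lower() for t in query_terms}
    for doc in feedback_docs:
        for field, text in doc.items():
            weight = field_weights.get(field, 0.0)
            tokens = text.lower().split()
            if not tokens or weight == 0.0:
                continue
            for term, count in Counter(tokens).items():
                if term not in query:
                    # A term occurring in a highly weighted field contributes more.
                    scores[term] += weight * count / len(tokens)
    return scores


def expand_query(query_terms, feedback_docs, k=2):
    """Append the k best-scoring candidate terms to the original query."""
    scores = score_expansion_terms(feedback_docs, query_terms)
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return list(query_terms) + [term for term, _ in ranked[:k]]
```

Calling expand_query(["battery"], docs, k=2) on a small list of field dictionaries returns the original query followed by the two highest-scoring candidates. The sketch uses whitespace tokenization and no stopword filtering, so a real system would need proper text processing before scoring.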
Keywords/Search Tags: Patent Retrieval, Query Expansion, Learning to Rank