
Research On Patent Retrieval And Core Patent Identification Methods

Posted on: 2022-05-15
Degree: Master
Type: Thesis
Country: China
Candidate: G R Chen
Full Text: PDF
GTID: 2518306752497504
Subject: Software engineering
Abstract/Summary:
Patents are an important store of knowledge. Exploring them reveals technical details and relationships that provide valuable input for formulating research and development strategies, so research on patent retrieval and core patent discovery is necessary. However, as the number of patents grows, retrieval becomes more costly and core patents become harder to recognize, because patent descriptions are lengthy and dense with varied technical and legal terms. Retrieval quality and accuracy therefore need to improve, and core patents can be used to infer future development trends in key fields. This thesis takes patent texts as its research object and studies three aspects: patent query expansion, patent text similarity, and core patent identification. The main contributions are as follows:

1. To capture the user's full query intent and improve the accuracy and recall of retrieval, this thesis proposes a patent query expansion method based on community discovery. The method builds a patent keyword graph with patent keywords as nodes and the relationships between keywords as edges, transforms patent query expansion into a dense subgraph search problem, and solves it with a community discovery algorithm. Experiments on the CLEF-IP 2010 patent data set show that the method is feasible and effective: compared with the baseline query expansion method, recall improves by 7.1% and PRES by nearly 3.2%.

2. To improve the accuracy of patent retrieval, this thesis proposes a context-aware patent retrieval model based on BERT. The model integrates multiple types of content from patent texts and uses BERT to obtain patent text features. When extracting matching information, it takes the relationships between patent contexts into account: an RNN encodes the contexts, convolutional networks extract the dependencies between terms, and the model finally outputs a similarity matching score for the patent texts. Experiments on the CLEF-IP 2010 patent data set show that the method is feasible and effective: it outperforms the baseline information retrieval models on both P@20 and NDCG@20, with improvements of 5.3% and 9.1% respectively. This further indicates that the context-aware patent retrieval model builds more accurate semantic representations of patent texts.

3. To identify, from a large number of relevant patent documents, the patents that represent the dominant technologies in their fields, this thesis proposes a core patent discovery method based on the minimum-cost connected dominating set. First, a mixed patent information graph is constructed to represent the relationships among patents, and core patent discovery is modeled as a minimum-cost dominating set problem. Then IBPSO, a binary particle swarm optimization algorithm improved with an immune mechanism, is used to find the minimum-cost set and thereby obtain the core patent set. Experiments on real patent data sets show that the method is feasible and effective, reaching a recall of 83% on the top@50 search task.
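
To make the dominating-set formulation of core patent discovery concrete, here is a minimal sketch. It is not the thesis's IBPSO algorithm: it swaps in a simple greedy heuristic that repeatedly picks the patent with the lowest cost per newly covered patent. The graph, the patent IDs P1 to P6, the cost values, and the function names are all hypothetical and chosen only for illustration. The sketch also does not enforce the connectivity constraint of a connected dominating set.

```python
from typing import Dict, Set


def greedy_min_cost_dominating_set(adj: Dict[str, Set[str]],
                                   cost: Dict[str, float]) -> Set[str]:
    """Greedy heuristic (an assumption, not IBPSO): repeatedly choose the
    node with the lowest cost per newly dominated node."""
    undominated = set(adj)   # nodes not yet chosen or adjacent to a chosen node
    chosen: Set[str] = set()
    while undominated:
        best, best_ratio = None, float("inf")
        for node in adj:
            if node in chosen:
                continue
            # Nodes this candidate would newly dominate (itself + neighbours).
            gain = len(({node} | adj[node]) & undominated)
            if gain == 0:
                continue
            ratio = cost[node] / gain
            if ratio < best_ratio:
                best, best_ratio = node, ratio
        chosen.add(best)
        undominated -= {best} | adj[best]
    return chosen


def is_dominating(adj: Dict[str, Set[str]], selected: Set[str]) -> bool:
    """True if every node is selected or adjacent to a selected node."""
    return all(n in selected or adj[n] & selected for n in adj)


# Hypothetical undirected patent graph (edges could stand for citation
# or similarity links) and hypothetical selection costs.
adj = {
    "P1": {"P2", "P3", "P4"},
    "P2": {"P1"},
    "P3": {"P1"},
    "P4": {"P1", "P5"},
    "P5": {"P4", "P6"},
    "P6": {"P5"},
}
cost = {"P1": 1.0, "P2": 2.0, "P3": 2.0, "P4": 1.5, "P5": 1.0, "P6": 2.0}

core = greedy_min_cost_dominating_set(adj, cost)
print(sorted(core))  # ['P1', 'P5']
```

In this toy graph P1 and P5 together cover every patent at a total cost of 2.0. A metaheuristic such as binary particle swarm optimization searches the same space of node subsets but can escape the locally optimal choices that a greedy pass commits to.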
Keywords/Search Tags: patent search, query expansion, text similarity, BERT, minimum-cost connected dominating set