
Research On Patent Citation Recommendation Problem Based On Semantic Heterogeneous Information Network

Posted on: 2022-04-26
Degree: Master
Type: Thesis
Country: China
Candidate: S Li
GTID: 2518306542463314
Subject: Computer technology
Abstract/Summary:
In the era of technological innovation, the protection of intellectual property has become increasingly important. Patents, the main form of intellectual property, have seen a remarkable annual increase in applications and grants. During the patent filing process, both applicants and examiners need to identify patent documents relevant to the novelty of the application. However, as the number of patents grows, it becomes more and more difficult for applicants and examiners to find appropriate patent citations efficiently and accurately. Patent citation recommendation addresses this difficulty: it automatically retrieves from the patent database a list of citations related to the target patent. The core of patent data is its textual content, which describes the technical ideas and the scope to be protected. Therefore, most existing patent citation recommendation methods are content-based, relying for example on keyword search or on semantic representations of the text. In recent years, heterogeneous information networks have been widely used in recommender systems, and some scholars have introduced patent structure information to build such networks. However, existing methods build the heterogeneous information network only from existing structures and relationships. This ignores the potential similarity between patent objects, so that similar patents can end up far apart in the network. This thesis relates the potential semantic information to the existing patent structure information in order to better learn both. Specifically, it uses text similarity and topic similarity to obtain semantic relations between patents and integrates these relations with the patent structure information. A network representation learning method then learns representations that are used to recommend patent citations. In addition, the thesis considers the fixed hierarchy and heterogeneous structure of patent data to further improve patent citation recommendation. The main contributions of this dissertation are as follows:

1) To capture the potential semantic relationships in heterogeneous information networks, this thesis introduces Semantic Based Heterogeneous Information Network Embedding for Patent Citation Recommendation (SHINE). First, based on the joint similarity of topics and textual content between patents, we derive a new type of relation, semantic links, to capture semantic information. Second, we construct a novel heterogeneous information network from bibliographic information and semantic links to integrate semantic and structural information, and we use Skip-Gram-based network representation learning to map the two kinds of information into a common vector space. Finally, we produce a list of relevant citations by a linear combination of multi-modal similarities. We conduct experiments on two real patent datasets, USPTO-A and USPTO-B. SHINE clearly outperforms the comparison methods in AP, AUC, and recall.

2) Based on the characteristics of patent text information and structure information, we propose Hierarchical Semantic Based Heterogeneous Information Network Embedding for Patent Citation Recommendation (HSHINE-PCR), which extends SHINE. First, we obtain fixed-hierarchy vectors that reflect the fixed hierarchical structure of patent textual content, and derive semantic links from them. Second, the semantic relations and the patent structure information are fused to build a heterogeneous information network, and a heterogeneous Skip-Gram network representation learning method maps the two kinds of information into the same low-dimensional space. Finally, a linear combination of multi-modal similarities is again used for patent citation recommendation. In addition to USPTO-A and USPTO-B, a larger-scale dataset, USPTO-C, is added, and the method achieves further improvement.
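
The two steps that the abstract describes in enough detail to illustrate, deriving semantic links from a joint text/topic similarity and ranking candidates by a linear combination of similarities, can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not give the joint-similarity formula, the link threshold, or the combination weights, so `alpha`, `threshold`, `beta`, the use of cosine similarity, and all function names and toy vectors are hypothetical. The Skip-Gram embedding step is not implemented here; `embed_vecs` simply stands in for its output.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def semantic_links(text_vecs, topic_vecs, alpha=0.5, threshold=0.8):
    """Return (i, j, score) for patent pairs whose joint similarity passes the threshold.

    Assumption: joint similarity is a weighted sum of text and topic cosine
    similarity; the thesis may define it differently.
    """
    ids = sorted(text_vecs)
    links = []
    for pos, i in enumerate(ids):
        for j in ids[pos + 1:]:
            joint = (alpha * cosine(text_vecs[i], text_vecs[j])
                     + (1 - alpha) * cosine(topic_vecs[i], topic_vecs[j]))
            if joint >= threshold:
                links.append((i, j, joint))
    return links


def rank_candidates(query, candidates, embed_vecs, text_vecs, beta=0.5):
    """Rank candidates by a linear combination of two similarity modalities.

    Assumption: the modalities are network-embedding similarity and text
    similarity, mixed with a hypothetical weight beta.
    """
    scores = {
        c: beta * cosine(embed_vecs[query], embed_vecs[c])
        + (1 - beta) * cosine(text_vecs[query], text_vecs[c])
        for c in candidates
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy vectors (made up for illustration only).
text_vecs = {"P1": [1.0, 0.0], "P2": [1.0, 0.0], "P3": [0.0, 1.0]}
topic_vecs = {"P1": [1.0, 0.0], "P2": [1.0, 0.0], "P3": [0.0, 1.0]}
embed_vecs = {"P1": [1.0, 0.0], "P2": [1.0, 0.0], "P3": [0.0, 1.0]}

links = semantic_links(text_vecs, topic_vecs)
ranking = rank_candidates("P1", ["P3", "P2"], embed_vecs, text_vecs)
```

In a fuller pipeline the resulting links would be added as an extra edge type to the bibliographic network before embedding, which is how semantically similar patents would be pulled closer together in the learned space.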
Keywords/Search Tags: Patent citation recommendation, Heterogeneous information network, Network embedding, Semantic links