Research And Implementation Of Chinese Patent Infringement Retrieval

Posted on: 2020-06-12
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Zou
GTID: 2428330575966265
Subject: Software engineering
Abstract/Summary:
Patent literature covers more than 90% of the world's technical knowledge, making it the largest store of technical information in the world. Driven by the market, the number of patents has grown in recent years and patent infringement lawsuits have become more frequent. To avoid infringing others' patents and to protect their own, patent owners need a reliable and practical patent infringement retrieval system. Most existing patent retrieval systems are based on the Boolean retrieval model, which provides only keyword-matching retrieval. It is therefore important to study detection algorithms that can discover infringement between patents.

Building on an analysis of patent infringement detection and related work in natural language processing, this dissertation focuses on claim-based infringement detection algorithms and designs and implements a patent infringement retrieval system. To this end, the dissertation first proposes a semantically extended vector space model based on Word2Vec for patent infringement detection. The algorithm makes use of the semantic information contained in the Word2Vec model and addresses the weak semantic representation ability of the traditional vector space model. However, it cannot fully handle infringement in which one claim is included in another, so the dissertation also proposes an infringement detection algorithm based on sentence vectors. This algorithm divides the patent claims into sentences, obtains a vector for each sentence with an unsupervised sentence vector generation algorithm, constructs the sentence similarity matrix between the two claims, and finally calculates the degree of infringement between the claims.

The dissertation then integrates the two detection algorithms into a final patent infringement detection algorithm and builds a web-based patent infringement retrieval system. The system speeds up retrieval by building an inverted index over the feature words of the patent text. Finally, the dissertation constructs experimental patent data and verifies the effectiveness of the proposed infringement detection algorithm. The experimental results show that the proposed algorithm improves on the traditional vector space model in accuracy, recall, and F1 score; the F1 score rises from 66.95% to 78.11%, which demonstrates the advantage of the proposed algorithm.
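The sentence-vector step described above can be illustrated with a minimal sketch. The abstract does not specify the sentence vector algorithm or the scoring formula, so everything here is an assumption: `TOY_VECTORS` is a hypothetical stand-in for a trained Word2Vec model, sentence vectors are plain averages of word vectors, claims are split on commas and semicolons, and the degree of infringement is taken as the mean of the row-wise maxima of the similarity matrix (how well each sentence of the protected claim is covered by some sentence of the candidate claim).

```python
import math

# Toy word vectors standing in for a trained Word2Vec model (hypothetical values).
TOY_VECTORS = {
    "battery": [1.0, 0.0, 0.0],
    "cell":    [0.9, 0.1, 0.0],
    "housing": [0.0, 1.0, 0.0],
    "casing":  [0.1, 0.9, 0.0],
    "sensor":  [0.0, 0.0, 1.0],
}


def sentence_vector(sentence):
    """Average the vectors of known words (an assumed stand-in for the
    unsupervised sentence vector algorithm mentioned in the abstract)."""
    vecs = [TOY_VECTORS[w] for w in sentence.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(column) / len(vecs) for column in zip(*vecs)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def split_sentences(claim):
    """Split a claim into clauses on commas and semicolons (assumed rule)."""
    return [s.strip() for s in claim.replace(";", ",").split(",") if s.strip()]


def similarity_matrix(protected, candidate):
    """Rows: sentences of the protected claim; columns: candidate sentences."""
    rows = [sentence_vector(s) for s in split_sentences(protected)]
    cols = [sentence_vector(s) for s in split_sentences(candidate)]
    return [[cosine(r, c) for c in cols] for r in rows]


def infringement_score(protected, candidate):
    """Mean of row-wise maxima: the share of the protected claim that is
    matched by some sentence of the candidate claim (assumed formula)."""
    matrix = similarity_matrix(protected, candidate)
    if not matrix or not matrix[0]:
        return 0.0
    return sum(max(row) for row in matrix) / len(matrix)
```

Under these assumptions the score is asymmetric: a candidate claim that contains every element of the protected claim plus extra elements scores high against it, while the reverse comparison scores lower. That asymmetry is one plausible way to capture the inclusion relationship the abstract mentions, not necessarily the one used in the thesis.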
Keywords/Search Tags: Patent Infringement, Vector Space Model, Word2Vec, Sentence Vector, Similarity Calculation