
A Patent Clustering Method Based On The Characteristics Of Multiple Problems

Posted on: 2021-11-03
Degree: Master
Type: Thesis
Country: China
Candidate: Q Y Yang
GTID: 2518306560953519
Subject: Computer Science and Technology
Abstract/Summary:
Patent information consists of the invention name, abstract, background technology, and other fields. The background technology text describes which problems the patent solves and which technology it improves, which matches the essence of invention: solving problems. Mining the background technology can help technicians locate the core issues of current technology more quickly and accurately, so analyzing it is of great significance. At present, most analysis of patent content relies on the title and abstract; research on background technology is relatively scarce, and what exists lacks focus and cannot present diversified patent information. This thesis takes patent background technology as its research object. It locates problem sentences (sentences containing problem information), extracts user-defined problem triples, and on this basis proposes a clustering method that integrates multiple patent problem features. This research refines the granularity of patent content analysis and further expands the field of patent research. The main contributions of this thesis are as follows:

(1) Based on the sentences of the patent background technology, this thesis defines problem sentences and non-problem sentences. Because conventional classification models represent features weakly and classify complex sentences inaccurately, it proposes ATT-C-L, a problem sentence location model that integrates an attention mechanism. Text features are divided into three categories: convolution features, future context features, and past context features. Since different features do not contribute equally to text classification, the attention mechanism is introduced to capture the information most useful for locating problem sentences.

(2) Given the characteristics of patent problems, this thesis proposes expressing each problem as a <problem source, problem object, problem word> triple. Three custom rules, which fuse relative-position semantic features, are defined to extract the "problem source". Because the "problem object" and "problem word" are strongly related, five custom grammar rules, which fuse relative semantic and grammatical features, are defined to extract them jointly. Finally, the improved ATT rule is used to correct the boundary of the problem object.

(3) To address the problem that clustering relies on a single source of information, this thesis proposes a clustering method that integrates multiple problem features. The background technology of a patent is represented as a pair consisting of the original background text and the problem tuple, and the patent title is further integrated, forming a "patent title-patent background technology text-patent problem tuple" representation. Spectral clustering is then used to integrate and cluster these features, and good experimental results are obtained. Because the weights assigned to the title, the background technology text, and the problem tuple are expected to influence the clustering differently, four experiments are designed to explore the weight relationship among the three. Introducing the background technology improves the clustering effect by 1.02%, and introducing the patent problem tuple improves it by a further 2.91%. These results support the validity of the patent clustering method proposed in this thesis.
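The abstract does not say how the three information sources are weighted or combined before spectral clustering. As a rough illustration only, the sketch below builds one cosine-similarity matrix per view (title, background technology text, problem tuple) and fuses them with a weighted sum. The bag-of-words representation, the function names, the toy patents, and the weights are all assumptions made for this example and are not taken from the thesis.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words term counts for a lower-cased, whitespace-tokenised text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def similarity_matrix(texts):
    """Pairwise cosine similarities for one view of the patents."""
    vecs = [bow(t) for t in texts]
    return [[cosine(u, v) for v in vecs] for u in vecs]

def fuse(matrices, weights):
    """Weighted sum of same-shaped similarity matrices; weights are normalised to sum to 1."""
    total = sum(weights)
    n = len(matrices[0])
    return [
        [sum(w / total * m[i][j] for m, w in zip(matrices, weights)) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical toy patents; the problem triple is flattened to a string for simplicity.
patents = [
    {"title": "battery cooling plate",
     "background": "existing battery packs overheat during fast charging",
     "problem": "battery pack overheat"},
    {"title": "battery thermal management device",
     "background": "battery packs overheat and lose capacity",
     "problem": "battery pack overheat"},
    {"title": "image denoising method",
     "background": "existing filters blur image edges",
     "problem": "image edge blur"},
]

# One similarity matrix per view, then a weighted fusion (weights are illustrative).
views = [similarity_matrix([p[key] for p in patents])
         for key in ("title", "background", "problem")]
fused = fuse(views, weights=(1.0, 1.0, 2.0))

for row in fused:
    print([round(x, 3) for x in row])
```

A fused matrix of this kind could then serve as the affinity input to a spectral clustering implementation, for example scikit-learn's `SpectralClustering` with `affinity="precomputed"`. Varying the weights is one plausible way to run the kind of weight-comparison experiments the abstract mentions, though the thesis may fuse the features differently.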
Keywords/Search Tags: Problem unit extraction, Attention mechanism, Multi-feature fusion, Neural network, Patent clustering