
Research And Application Of Patent Mining Algorithm Based On Topic Model

Posted on: 2021-02-06 | Degree: Master | Type: Thesis
Country: China | Candidate: Y T Sun | Full Text: PDF
GTID: 2428330647455399 | Subject: Computer technology
Abstract/Summary:
Patents are an important manifestation of intellectual property rights. In-depth mining of massive patent data to obtain technical information helps promote the protection of intellectual property and the re-innovation of patented technology. Text topic models are an important part of data mining and are applied to extract topic information in many fields. At present, patent analysis typically obtains technology topics by applying a topic model directly to the patent text. Topics extracted this way tend to be messy, express their meaning poorly, and do not yield fine-grained technology topics. The International Patent Classification (IPC) number of a patent represents its technical field and carries technical information about the patent. This thesis therefore combines IPC numbers with traditional topic models to extract topics from patent text, which yields more accurate and clearer technology topic information, and applies the mined topics to the analysis of patent technology evolution so that researchers can better understand the development status of patents in a given field. The research covers three aspects: topic mining from the perspective of global patent technology evolution, topic mining from the perspective of fine-grained patent technology evolution, and a visualization system for analysing the evolution of patent technology topics. The main work of this thesis is as follows:

(1) From the perspective of global patent technology evolution, and to address the problem that topics extracted by traditional methods lack a clear meaning, a patent topic mining algorithm based on Latent Dirichlet Allocation (LDA) is proposed. The algorithm first partitions the text set by IPC number, then uses the LDA topic model to extract topics from each partition, and finally merges similar topics to obtain the topic information. Experiments verify that, compared with other traditional topic mining methods, this method performs better in expressing topic meaning and in accuracy.

(2) From the perspective of fine-grained patent technology evolution, and to address the problem that traditional methods cannot obtain fine-grained technology topic information, a patent topic mining algorithm based on Partially Labeled Dirichlet Allocation (PLDA) is proposed. This method sets different IPC number levels as patent labels, uses the PLDA model to mine topics from patent texts, and obtains topic information at different levels of the IPC hierarchy. Experiments verify that, compared with other traditional topic mining methods, this method performs better in the meaning and accuracy of fine-grained topics.

(3) To facilitate visual analysis of patents by researchers, a visualization system for analysing the evolution of patent technology topics based on topic models is proposed, comprising a data collection module and a topic mining and evolution module. This thesis mines topic information and analyses its evolution from both the global perspective and the IPC number perspective, and displays the evolution of patent technology topics through visualization technology. This helps researchers fully understand the development trend of patent technology topics in a given field and provides a further basis for patent research.
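The abstract does not include code, so the following is only a minimal Python sketch of two steps it describes: grouping patents by an IPC level before topic extraction, and merging similar topics afterwards. All function names are hypothetical. The LDA/PLDA fitting step itself is omitted; topics are assumed to arrive as term-to-weight dictionaries. Cosine similarity and the 0.8 threshold are illustrative assumptions, not the merging criterion used in the thesis.

```python
import math
import re
from collections import defaultdict

# Assumed input format: full IPC symbols such as "G06F 17/30"
# (section, class, subclass, main group, subgroup).
IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def ipc_levels(symbol):
    """Split an IPC symbol into its hierarchy levels (hypothetical helper)."""
    match = IPC_PATTERN.match(symbol.strip().upper())
    if match is None:
        raise ValueError(f"not a recognised IPC symbol: {symbol!r}")
    section, cls, subclass, main_group, subgroup = match.groups()
    return {
        "section": section,
        "class": section + cls,
        "subclass": section + cls + subclass,
        "main_group": f"{section}{cls}{subclass} {main_group}",
        "subgroup": f"{section}{cls}{subclass} {main_group}/{subgroup}",
    }


def partition_by_ipc(patents, level="subclass"):
    """Group (ipc_symbol, text) pairs by the chosen IPC level."""
    groups = defaultdict(list)
    for symbol, text in patents:
        groups[ipc_levels(symbol)[level]].append(text)
    return dict(groups)


def cosine(a, b):
    """Cosine similarity between two term -> weight dictionaries."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def merge_similar_topics(topics, threshold=0.8):
    """Greedily fold each topic into the first kept topic it resembles.

    The threshold is an assumed value chosen for illustration only.
    """
    merged = []
    for topic in topics:
        for kept in merged:
            if cosine(topic, kept) >= threshold:
                for term, weight in topic.items():
                    kept[term] = kept.get(term, 0.0) + weight
                break
        else:
            merged.append(dict(topic))
    return merged
```

Under these assumptions, `partition_by_ipc(patents, level="subclass")` would produce the per-IPC text sets that a topic model is then fitted on, and calling it with a different `level` gives the coarser or finer label sets that the PLDA variant relies on. `merge_similar_topics` would be run on the topics collected from all partitions.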
Keywords/Search Tags:Patent, Topic model, Text analysis, Topic mining, Technology evolution