Application Of Text Mining In Patent Literature Analysis

Posted on: 2020-09-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Wang
GTID: 2428330590979011
Subject: Computer technology
Abstract/Summary:
Protecting intellectual property rights is crucial to the healthy development of the economy and to the growth of enterprises. Filing patent applications is an effective way to protect those rights, and the quantity and quality of patents are important indicators of an enterprise's innovation ability. Although every patent document carries a detailed IPC classification number, the patent literature is large, fast-growing, and made up of unstructured text. Traditional statistical analysis methods struggle to uncover the technical information and knowledge implicit in patent documents, so the value of patents is not fully realized. Text mining makes deeper analysis of patent documents possible: analyzing patent documents with text mining theory and tools can guide a company's research and development and help it improve its innovation capability and core competitiveness.

This thesis discusses the application of text mining to patent document analysis by processing and analyzing patent texts. Text clustering is chosen as the entry point. Because traditional text similarity measures are not accurate enough, a text distance formula based on Word2Vec, W2v_dist, is proposed. Because traditional clustering algorithms lack stability and accuracy, the thesis combines the firefly algorithm, K-Medoids, and W2v_dist into a new algorithm, K-OFA. Finally, a patent text mining system is designed and implemented.

The main research results of this thesis are as follows:
(1) Drawing on the theory and methods of text mining, the thesis discusses the application scenarios of text mining in patent literature analysis.
(2) To address the difficulty of measuring semantic similarity between texts, the thesis proposes an improved text distance formula, W2v_dist, based on the Word2Vec tool and combined with the LDA topic model and the EMD distance.
(3) The traditional firefly algorithm suffers from slow convergence and premature convergence. The thesis therefore optimizes the firefly algorithm and, by analogy, applies the improved algorithm to text clustering.
(4) Experiments are designed to test the clustering performance of the proposed improved algorithm.
(5) A patent text mining system is designed and implemented on the .NET Framework platform. Using specific functional modules as examples, the system development process based on the 3-tier architecture is briefly described.
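
The abstract does not give the W2v_dist formula or the details of K-OFA, so the following is only a minimal illustrative sketch of the general idea behind results (2) and (3): an embedding-based document distance fed into K-Medoids clustering. Everything in it is an assumption made for illustration: the 2-D `VECTORS` table stands in for a trained Word2Vec model, `relaxed_emd` is a relaxed (lower-bound) variant of the Earth Mover's Distance with uniform word weights rather than the thesis's formula, the LDA component is omitted, and the initial medoids are fixed by hand where the thesis would use its improved firefly search.

```python
import math

# Hypothetical 2-D word vectors standing in for a trained Word2Vec model.
VECTORS = {
    "battery": (1.0, 0.1), "cell": (0.9, 0.2), "charger": (0.8, 0.0),
    "antenna": (0.0, 1.0), "signal": (0.1, 0.9), "receiver": (0.2, 0.8),
}

def word_dist(a, b):
    """Euclidean distance between two word vectors."""
    return math.dist(VECTORS[a], VECTORS[b])

def relaxed_emd(doc_a, doc_b):
    """Relaxed Earth Mover's Distance between two token lists.

    Each word (uniform weight) is moved to its nearest word in the other
    document; the larger of the two one-way costs is returned.
    """
    def one_way(src, dst):
        return sum(min(word_dist(w, v) for v in dst) for w in src) / len(src)
    return max(one_way(doc_a, doc_b), one_way(doc_b, doc_a))

def assign(dist, medoids):
    """Label every document with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]

def k_medoids(dist, init, max_iter=50):
    """Alternating K-Medoids on a precomputed distance matrix."""
    medoids = list(init)
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c in range(len(medoids)):
            # Keep the old medoid if a cluster happens to be empty.
            members = [i for i, l in enumerate(labels) if l == c] or [medoids[c]]
            # New medoid: the member with the smallest total distance to the rest.
            new_medoids.append(
                min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)

# Toy "patent" documents, already tokenized.
DOCS = [
    ["battery", "cell"], ["battery", "charger"], ["cell", "charger"],
    ["antenna", "signal"], ["signal", "receiver"], ["antenna", "receiver"],
]
DIST = [[relaxed_emd(a, b) for b in DOCS] for a in DOCS]

# Hand-picked initial medoids; a firefly-style search would choose these instead.
medoids, labels = k_medoids(DIST, init=[0, 3])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Working from a precomputed distance matrix keeps the clustering step independent of how the distance is defined, so a different document distance could be swapped in without touching `k_medoids`.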
Keywords/Search Tags: patent, text mining, text clustering, .NET Framework, Word2Vec, firefly algorithm