Research On The Identification Method Of Industrial Technology Evolution Path Based On Data Mining

Posted on: 2022-12-05
Degree: Master
Type: Thesis
Country: China
Candidate: M X Wang
GTID: 2481306743478044
Subject: Computer application technology
Abstract/Summary:
At present, China's industrial economy as a whole is developing positively, but uneven industrial development among regions persists. Industrial development depends on technological progress, and the evolution of technology during industrial development follows certain trajectories. These trajectories provide valuable intelligence for subsequent research planning and industrial policy formulation. Most existing methods for identifying technology evolution analyze a single technology, and few studies examine the internal technology evolution relationships of an industry as a whole. Mining and analyzing the internal technology evolution paths of an entire industry is therefore of high research value.

This thesis proposes a data mining-based method for identifying industrial technology evolution paths, aimed at large-scale patent literature data. It analyzes and processes a large volume of patent documents to explore the internal technology evolution paths of an industry, in order to guide regional industrial technology positioning and the planning of future development directions. The work focuses on two aspects, technology clustering and technology theme association, and is accordingly divided into flexible clustering based on data enhancement and technology evolution path identification based on theme extraction. The main research contents are as follows.

The first part addresses three problems in clustering patent documents: low utilization of low-frequency words, ambiguity of polysemous words, and an unknown number of clusters. In the text representation stage, a pre-trained model is used to enhance the original data, and incremental training improves the representation of low-frequency words. In the text vector generation stage, coarse classification label features are introduced to optimize the TF-IDF algorithm, yielding dynamic encoding that handles polysemous words. In the text clustering stage, a data co-occurrence frequency matrix is constructed from the results of multiple clustering runs, which transforms the problem of determining the number of clusters into a co-occurrence frequency problem that carries text semantic information and so enables flexible, on-demand clustering. In comparison experiments with existing methods, the text representation method using data enhancement encoding achieves better semantic encoding results. Compared with clustering methods that fix the number of clusters in advance, the flexible clustering method differentiates the data better and produces clusters with high intra-class similarity and high inter-class differentiation, which verifies the feasibility and accuracy of deriving the number of clusters from co-occurrence frequency.

The second part addresses technical topic extraction and association degree calculation for patent documents, as required for technical topic identification and association relationship construction. First, the LDA model is used to analyze the latent technical topics in the patent clustering results, representative technology nodes are extracted based on the topic probability distributions, and the similarity between nodes is calculated to identify the association relationships between them. Second, to enrich the information attached to technology nodes and improve their readability, IDF is used to optimize the TextRank algorithm so that keywords are screened automatically. Finally, the evolution paths of the plastic packaging industry in Dongguang County, Hebei are identified and aggregated into an evolution network, which further demonstrates empirically the feasibility and effectiveness of the proposed method.
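
The co-occurrence frequency step in the flexible clustering stage can be illustrated with a minimal sketch. The abstract does not specify how final clusters are read off the co-occurrence matrix, so the thresholded connected-components rule below is an assumption chosen for illustration, and the function names `cooccurrence_matrix` and `flexible_clusters`, the `threshold` parameter, and the toy label lists are all hypothetical rather than taken from the thesis.

```python
from itertools import combinations


def cooccurrence_matrix(runs):
    """Return the fraction of clustering runs in which each pair of
    documents was assigned to the same cluster.

    `runs` is a list of label lists, one per clustering run; label ids
    need not agree across runs, only within a run.
    """
    n = len(runs[0])
    counts = [[0] * n for _ in range(n)]
    for labels in runs:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(runs) for c in row] for row in counts]


def flexible_clusters(freq, threshold=0.5):
    """Group documents whose co-occurrence frequency exceeds `threshold`.

    Assumption (not from the thesis): clusters are the connected
    components of the thresholded graph, so the number of clusters
    follows from the data instead of being fixed in advance.
    """
    n = len(freq)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if freq[i][j] > threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy input: three clustering runs over six documents, with different
# numbers of clusters and unrelated label ids in each run.
runs = [
    [0, 0, 0, 1, 1, 2],
    [0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 2],
]
clusters = flexible_clusters(cooccurrence_matrix(runs), threshold=0.5)
```

Raising `threshold` splits the data into more, tighter groups, and lowering it merges them, which is one way to read "clustering on demand" under the assumptions above.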
Keywords/Search Tags: Technology evolution, Data mining, Data enhancement, Text clustering, Topic association recognition