Design And Implementation Of A Cluster Analysis And Display System For Patent Literature

Posted on: 2020-05-02
Degree: Master
Type: Thesis
Country: China
Candidate: Q H Wang
GTID: 2428330575957090
Subject: Computer technology
Abstract/Summary:
Scientific and technological innovation is becoming increasingly important worldwide. As a new mode of output for technological innovation, regional innovation has attracted the attention of the state and related institutions. In a regional innovation system, it is extremely important to bring in experts from the relevant scientific and technological fields. Patent literature is an effective carrier of both these fields and detailed information about their experts, so analyzing patent documents can surface useful expert information. As a patent document analysis tool, a patent map can extract, classify and analyze patent document information with a complex structure and present the results to the user as visual charts.

This paper implements a system consisting of a patent word cloud, an inventor-institution relationship diagram and a patent clustering scatter plot. The system supports the analysis of talent information and the assessment of the current state and direction of technology development. The main research contents and results of this paper are:

(1) The information structure of patent literature is studied to determine which fields to extract. The sources of patent information and the method of collecting it with a web crawler are studied, and the collected data is stored in a database.

(2) The patent data is preprocessed, including word segmentation, stop word filtering and stemming, so that words without relevant scientific and technological significance do not interfere with the clustering results.

(3) An N-gram-based feature selection method is used to extract the keywords of each document. tf-idf and Word2Vec are combined to compute the text vectors, yielding the vector matrix of the patent documents, which serves as input to the subsequent clustering algorithm.

(4) After exploring various clustering algorithms, the AP (affinity propagation) clustering algorithm combined with a merge process is used to cluster the patent documents. The clustering results are visualized in the system and, together with the patent word cloud and the inventor-institution relationship diagram, form the core module of the system.

This paper designs and implements a patent map system that can explore patent inventors and the related scientific and technological fields in depth. It is divided into a user registration and login module, a patent data display module, a patent word cloud module, a patent map module and a user management module. Testing shows that the system can effectively analyze inventors and information about the fields of science and technology, and that it has some practical application value.
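As an illustration of step (3), the following minimal sketch shows one common way to combine tf-idf with word vectors: each document vector is the tf-idf-weighted average of the vectors of its terms. The abstract does not state the exact combination used in the thesis, so the weighting scheme, the smoothed idf formula, the function names `tfidf_weights` and `document_vector`, and the toy two-dimensional embeddings (standing in for trained Word2Vec vectors) are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {term: tf-idf weight} dict per tokenized document.

    Assumption: smoothed idf, log((1 + N) / (1 + df)) + 1, so every
    weight is strictly positive.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return weights


def document_vector(doc_weights, embeddings, dim):
    """Tf-idf-weighted average of the word vectors of one document."""
    total = [0.0] * dim
    weight_sum = 0.0
    for term, weight in doc_weights.items():
        vec = embeddings.get(term)
        if vec is None:
            # Skip terms that have no embedding (out of vocabulary).
            continue
        for i in range(dim):
            total[i] += weight * vec[i]
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # all-zero vector for empty or fully unknown documents
    return [x / weight_sum for x in total]


# Hypothetical toy data: tokenized patent keywords and 2-d "embeddings".
docs = [
    ["battery", "electrode", "battery"],
    ["electrode", "coating"],
    ["image", "sensor"],
]
embeddings = {
    "battery": [1.0, 0.0],
    "electrode": [0.8, 0.2],
    "coating": [0.6, 0.4],
    "image": [0.0, 1.0],
    "sensor": [0.1, 0.9],
}

# One row per document; this matrix is what a clustering step would consume.
matrix = [document_vector(w, embeddings, 2) for w in tfidf_weights(docs)]
```

In this sketch the rows of `matrix` play the role of the vector matrix described in step (3). A real pipeline would replace the toy dictionary with vectors from a trained Word2Vec model and pass the matrix to the clustering step in (4).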
Keywords/Search Tags: patent map, text clustering, Word2Vec, data processing, system design and implementation