
Research And Application Of Text Classification Algorithm In Patent Field Based On Knowledge Graph

Posted on: 2021-02-14
Degree: Master
Type: Thesis
Country: China
Candidate: J L Huang
GTID: 2428330626958929
Subject: Software engineering
Abstract/Summary:
Against the background of the innovation-driven fourth industrial revolution, China is accelerating its development as an innovation-oriented country and actively encouraging the public and enterprises to innovate in their respective fields. Innovators need to draw on a large body of knowledge and technology related to their research, and understanding current technology development and innovation in related fields is a prerequisite for identifying current hot issues and for advancing one's own field. Patents are not only a bellwether of advanced scientific and technological achievements but also a carrier of cutting-edge knowledge, prompting enterprises, universities, and other technological innovators to continuously improve their capabilities and technical systems. With the country actively promoting mass innovation and innovation-driven development, the number of Chinese patents has increased rapidly year by year. Faced with such a large and diverse source of information, how enterprises and innovators can effectively obtain relevant knowledge deserves careful study. Effectively classifying patent information enables innovators in enterprises and universities to accurately find what they need in a large amount of text data. Efficient and accurate classification can therefore greatly reduce the time spent searching for scientific and technical information and improve retrieval efficiency. How to efficiently find useful patent information among many kinds of patents is the main problem addressed here.

This thesis builds a patent-oriented knowledge graph and finds connections between patents in the same category through the structured relationships in the network of associations between different patents. The patent classification task is transformed into a short-text classification task over patent abstracts, so that the structured features of the patent knowledge graph can be used to improve the classification effect. The work consists of four parts:

1. Construction of the knowledge graph framework for the patent field. The thesis first constructs the domain ontology and defines the data schema of the knowledge graph. Next, by analyzing the characteristics of patent data, it defines the entities of the patent knowledge graph and extracts their attributes. Finally, the relationships between the entities are defined manually.
2. Extension of the short-text features of patent abstracts. An improved TextRank algorithm is used to extract keywords, and the OwnThink knowledge graph is used to expand the keywords with their synonyms and subordinate words.
3. An improved TextCNN algorithm based on the patent knowledge graph. In addition to extending the text features, the TransE model is used to learn representations of the entities and relations in the constructed patent knowledge graph, so that patent semantic information is represented as dense, low-dimensional, real-valued vectors. In the TextCNN experiments, the feature vectors that TextCNN produces from its input vector matrix through the pooling layer are concatenated with the patent vectors, which expands the text features and improves the accuracy of the TextCNN algorithm.
4. A patent semantic retrieval platform based on the patent knowledge graph. The platform is a practical application of the patent knowledge graph constructed in this thesis. Patents are classified automatically by the improved TextCNN text classification algorithm. The platform also provides patent search, patent classification search, knowledge graph management, patent management, and other functional modules.

Experiments show that extracting patent keywords, finding related words with the OwnThink knowledge graph, and constructing a patent knowledge graph with patent, author, applicant, and other entities increases the relevance of sentence context. With the patent knowledge graph in place, the accuracy and recall of the text classification algorithm improve markedly. The patent semantic retrieval system based on the knowledge graph is also implemented. Its accurate classification provides guidance for the public and enterprises on the path of scientific and technological innovation, improves the retrieval efficiency of innovators, and greatly reduces the time needed to build up the relevant scientific and technological knowledge.
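
The TransE representation and the feature concatenation described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the toy two-dimensional vectors, and the margin value are assumptions made for illustration, and plain Python lists stand in for learned embeddings. TransE treats a relation as a translation, so a plausible triple (head, relation, tail) should satisfy head + relation ≈ tail.

```python
import math


def transe_score(head, relation, tail):
    """L2 distance ||head + relation - tail||; lower means more plausible."""
    return math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )


def margin_loss(pos_score, neg_score, margin=1.0):
    """Margin ranking loss: zero once the corrupted triple scores at least
    `margin` worse (higher distance) than the true triple."""
    return max(0.0, margin + pos_score - neg_score)


def concat_features(pooled_text_features, patent_vector):
    """Concatenate pooled text features with a knowledge-graph patent vector
    (an assumed reading of the 'stitching' step in the abstract)."""
    return list(pooled_text_features) + list(patent_vector)


if __name__ == "__main__":
    # Hypothetical toy embeddings, not taken from the thesis.
    patent = [0.5, 0.25]
    applied_by = [0.25, 0.5]
    true_applicant = [0.75, 0.75]
    wrong_applicant = [0.75, -0.25]

    pos = transe_score(patent, applied_by, true_applicant)
    neg = transe_score(patent, applied_by, wrong_applicant)
    print("true triple distance:", pos)
    print("corrupted triple distance:", neg)
    print("margin loss:", margin_loss(pos, neg, margin=2.0))
    print("classifier input:", concat_features([0.9, 0.1, 0.4], patent))
```

In training, the loss would be minimized over many true and corrupted triples so that related entities end up close under translation; the concatenated vector would then feed the classifier's final layer.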
Keywords: Patent, Knowledge Graph, Representation Learning, Text Classification