Research And Design Of Chinese Patent Text Classification Based On Deep Learning

Posted on: 2021-03-17  Degree: Master  Type: Thesis
Country: China  Candidate: H X Du  Full Text: PDF
GTID: 2428330611488445  Subject: Computer technology
Abstract/Summary:
As society develops, the number of patent applications keeps growing, and the patent literature contains a large amount of technical information about inventions. Making use of the science and technology recorded in patents can greatly reduce development cost and shorten development cycles, so how to extract this information from patents has become a focus of attention. At present, semi-automatic classification is usually used to assist patent classification staff. Although this reduces their workload to some extent, the approach still has shortcomings. Advances in deep learning for natural language processing now provide technical support for automatic classification of patent text. This thesis uses deep learning to build a relatively efficient text classification method through model design. The main work is as follows. First, a web crawler strategy is designed and implemented in Python to collect Chinese patent text data, from which the training and test sets of the classification model are constructed to provide data support for Chinese patent text classification. Second, for text preprocessing, the Jieba ("stutter") word segmentation tool is used together with a self-built domain user dictionary; after segmentation, a custom stop-word list removes words that are unimportant to the classification task. Third, the theory of convolutional neural networks (CNN) and long short-term memory (LSTM) networks is explained. In constructing the Chinese patent text classification algorithm, the strength of CNN in extracting local features is combined with the strength of BiLSTM in extracting global sequential features, and an attention mechanism is introduced on the BiLSTM hidden layer. The resulting BiLSTM_ATT_CNN combined model completes the patent text classification task better, achieving a better classification effect than the other four models. Fourth, to realize automatic classification of Chinese patent text, a Chinese patent text classification system is designed: the requirements and purpose of the system are analyzed, its functional and technical architectures are designed, and the data collection, text preprocessing, text representation, and classification model modules are designed in detail. Testing shows that the system realizes the basic functions of automatic Chinese patent text classification.
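The preprocessing step described in the abstract (word segmentation with a domain user dictionary, followed by stop-word removal) can be illustrated with a minimal sketch. This is not the thesis's code: a simple forward maximum matching segmenter stands in for the actual segmentation tool so that the example needs no third-party library, and the dictionary entries, stop words, and function names (`USER_DICT`, `STOP_WORDS`, `segment`, `preprocess`) are all hypothetical.

```python
# Hypothetical domain dictionary and stop-word list; the thesis's real lists are not given.
USER_DICT = {"卷积神经网络", "神经网络", "专利", "文本", "分类", "方法"}
STOP_WORDS = {"一种", "的", "基于"}


def segment(text, vocab, max_len=6):
    """Forward maximum matching: at each position, take the longest word found in vocab.

    This is a simplified stand-in for a real segmenter; unknown characters
    fall back to single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def preprocess(text, vocab=USER_DICT, stop_words=STOP_WORDS):
    """Segment the text, then drop tokens that appear in the stop-word list."""
    # Stop words are added to the vocabulary so they are segmented as whole words
    # before being filtered out.
    tokens = segment(text, vocab | stop_words)
    return [t for t in tokens if t not in stop_words]


if __name__ == "__main__":
    print(preprocess("一种基于卷积神经网络的专利文本分类方法"))
```

In this sketch, a domain term such as 卷积神经网络 stays a single token only because it is in the user dictionary; without that entry the segmenter would split it into smaller pieces, which is the motivation the abstract gives for adding a self-built dictionary.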
Keywords/Search Tags: Patent text, Convolutional neural network, Bidirectional long short-term memory network, Attention mechanism