Research On Patent Text Classification In Aerospace Field Based On Deep Learning

Posted on: 2024-04-16
Degree: Master
Type: Thesis
Country: China
Candidate: H X Zheng
GTID: 2542307076473994
Subject: Information Science
Abstract/Summary:
As one of China's high-tech industries, the aerospace sector produces patents of great research value. How to classify the massive volume of aerospace patent texts effectively is an important problem in current patent work. In recent years the number of Chinese aerospace patents has grown explosively, yet classification still relies mainly on manual or semi-automatic methods. Because patent text has its own particularities, such classification is inaccurate and consumes considerable manpower, material resources and time. With the continuing development of artificial intelligence, deep learning has been widely applied to text classification. Deep learning models have strong capabilities in feature extraction and text representation, and can address the weaknesses of traditional classification methods: shallow understanding of text semantics, syntactic structure and word order, weak representations, and over-reliance on extracted features.

To address these problems, this thesis applies deep learning to the classification of aerospace patent texts. The main work is as follows. First, aerospace patent text data are collected with a web crawler and by downloading patents from the official patent website, and a data set is constructed. The data set is then preprocessed through word segmentation, stop-word removal and text vectorization. Second, the state of aerospace patents is analyzed, including the characteristics of aerospace patent text and the problems in the aerospace patent classification system. Finally, classification experiments are carried out on the aerospace patent texts.

In the first experiment, a classification method based on a Convolutional Neural Network (CNN) model is proposed, with the CNN used to extract the features of the patent texts. It is compared with Bidirectional Long Short-Term Memory (BiLSTM), BiLSTM-CNN, Bidirectional Gated Recurrent Unit (BiGRU) and other neural network models. The results show that the CNN classification model chosen in this thesis performs better, with greatly improved accuracy, recall and F1 value. In the second experiment, the BERT (Bidirectional Encoder Representations from Transformers) model is introduced on top of the CNN, and a BERT-CNN method for aerospace patent text classification is proposed. The patent text pretraining model, based on Chinese word segmentation, learns the internal relationships within sentences by fully associating the context, which addresses problems of traditional text representations such as weak contextual connection and polysemy. Its accuracy, recall and F1 value reach 87.79%, 87.65% and 87.65%, respectively.

The deep-learning-based classification of aerospace patent texts presented in this thesis is conducive to promoting scientific and technological innovation in the aerospace field, improving the retrieval efficiency of patent texts, making better use of high-tech achievements in the field, and supporting the sustainable development of scientific and technological innovation.
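The abstract reports accuracy, recall and F1 value but does not say how the per-class scores are averaged. The following is a minimal sketch of how such metrics are commonly computed for a multi-class classifier, assuming macro averaging (an unweighted mean over classes). The function name `classification_scores` and the labels `class_a` and `class_b` are hypothetical and do not come from the thesis.

```python
from collections import Counter


def classification_scores(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1.

    Assumption: macro averaging is used; the thesis abstract does not
    specify the averaging scheme behind its reported numbers.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count true positives, false positives and false negatives per class.
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
    }


# Hypothetical toy labels, for illustration only.
scores = classification_scores(
    ["class_a", "class_a", "class_b", "class_b"],
    ["class_a", "class_b", "class_b", "class_b"],
)
print(scores)
```

With weighted averaging, each class score would instead be weighted by its number of true samples, which can give different values on an imbalanced patent data set.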
Keywords/Search Tags: Aerospace, patent text classification, deep learning, CNN model, BERT-CNN model