Research On Text Classification Based On Deep Learning

Posted on:2021-06-02

Degree:Master

Type:Thesis

Country:China

Candidate:X F Li

Full Text:PDF

GTID:2518306575965489

Subject:Computer technology

Abstract/Summary:

We live in an era of information technology. With the rapid development of computing and information storage, applications have gradually reached every aspect of life, and text data is growing at an exponential rate. The effective collection, sorting, mining and analysis of pharmaceutical patent data are increasingly important for the development of the pharmaceutical industry. Text classification is mainly divided into three modules: preprocessing, feature extraction and classification, among which text representation is the key point and the foundation. At present, traditional text classification is mostly based on statistical learning and similar methods, which ignore the associations between words and the information hidden in the text context and are therefore poorly suited to complex, structured text data. The network structures used in deep learning are efficient at solving the current text classification problem. Based on an analysis and summary of text vectorization techniques and deep neural network models, this thesis studies in depth the application of deep learning models to text classification. The main research work is as follows:

1. A label classification model based on a convolutional neural network (CNN) is designed. Convolution and pooling extract features of local information effectively. The network uses two channels: one set of word vectors is fine-tuned to capture more information, while the other set remains unchanged. This improves both the performance of the original model and the network structure. Convolution kernels of different sizes and numbers are designed to extract features from different angles. Max pooling is used to further extract features, and a softmax function is used for classification.

2. Building on the strength of LSTM in extracting global features, the accuracy of medical patent label classification is enhanced by combining LSTM with an attention mechanism. To represent patent text at a deeper level, using the structure and hierarchical information of sentences together with the correlations between tags, a Bi-LSTM network model based on an attention mechanism is designed. The LSTM model addresses the vanishing gradient problem of traditional RNNs, and the hidden state sequences output by the forward and backward LSTMs in this architecture are concatenated into two channels, which avoids the loss caused by direct addition. At the same time, the attention mechanism yields a semantic encoding that incorporates the attention probability distribution over the input sequence nodes, which highlights key information and reduces information loss and redundancy during feature vector extraction.
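The dual-channel CNN in item 1 can be illustrated with a small sketch. This is not the thesis implementation: it is a minimal pure-Python illustration, and it assumes that each kernel is shared by the static and fine-tuned channels with their responses summed before a ReLU (one common way to build such a model). The function names `dual_channel_feature` and `classify` and all numeric values are hypothetical.

```python
import math


def dual_channel_feature(static_emb, tuned_emb, kernel, bias=0.0):
    """Apply one kernel to both embedding channels, then max-over-time pool.

    static_emb, tuned_emb: token vectors (seq_len x dim), one list per channel.
    kernel: weight rows (width x dim); sharing it across channels is an assumption.
    """
    width = len(kernel)
    best = 0.0  # starting at zero makes the running max equal to max(ReLU(act))
    for start in range(len(static_emb) - width + 1):
        act = bias
        for channel in (static_emb, tuned_emb):
            window = channel[start:start + width]
            act += sum(
                w * x
                for k_row, t_row in zip(kernel, window)
                for w, x in zip(k_row, t_row)
            )
        best = max(best, act)
    return best


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(static_emb, tuned_emb, kernels, class_weights, class_biases):
    """Pool one feature per kernel, then apply a linear layer and softmax."""
    features = [dual_channel_feature(static_emb, tuned_emb, k) for k in kernels]
    logits = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(class_weights, class_biases)
    ]
    return softmax(logits)


# Toy input: three tokens, two-dimensional embeddings (hypothetical values).
static_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
tuned_emb = [[0.5, 0.0], [0.0, 0.5], [1.0, 0.0]]
kernels = [
    [[1.0, 1.0]],              # width-1 kernel
    [[1.0, 0.0], [0.0, 1.0]],  # width-2 kernel
]
probs = classify(static_emb, tuned_emb, kernels,
                 class_weights=[[1.0, 0.0], [0.0, -1.0]],
                 class_biases=[0.0, 0.0])
```

Kernels of different widths look at windows of different lengths, which is what lets the model extract features "from different angles"; max pooling then keeps only the strongest response of each kernel regardless of where it occurs in the text.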
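The attention step in item 2 can be sketched in the same spirit. The abstract does not specify the scoring function, so this sketch assumes a simple form, a learned query vector dotted with the tanh of each concatenated hidden state; the hidden states are taken as given instead of being computed by an actual LSTM. The name `attention_pool` and the example values are hypothetical.

```python
import math


def attention_pool(forward_states, backward_states, query):
    """Concatenate forward/backward hidden states and pool them with attention.

    forward_states, backward_states: one hidden vector per time step.
    query: scoring vector over the concatenated state (assumed scoring form).
    Returns (attention weights, context vector).
    """
    # List concatenation keeps both directions side by side instead of adding them.
    states = [f + b for f, b in zip(forward_states, backward_states)]

    # Score each time step, then normalise the scores with a stable softmax.
    scores = [
        sum(q * math.tanh(h) for q, h in zip(query, state))
        for state in states
    ]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # The context vector is the attention-weighted sum of the hidden states.
    dim = len(states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, states))
        for i in range(dim)
    ]
    return weights, context


# Two time steps; the first has larger activations and should get more weight.
weights, context = attention_pool(
    forward_states=[[2.0], [0.0]],
    backward_states=[[2.0], [0.0]],
    query=[1.0, 1.0],
)
```

Concatenation doubles the width of each state but preserves which direction produced which value, whereas element-wise addition would merge the two directions irreversibly; the attention weights form a probability distribution over time steps, so informative positions contribute more to the final encoding.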

Keywords/Search Tags:

deep learning, patent classification, long short-term memory, attention mechanism, convolutional neural networks

Related items

1	Short Text Sentiment Classification Based On Deep Learning
2	Research On Deep Learning Algorithm For Sequence Data
3	Text Sentiment Classification Based On Attention Mechanism
4	Research On Relation Classification Via Bidirectional Long Short-Term Memory Networks With Attention Mechanism
5	Research On Network Intrusion Detection Method Based On Bi-LSTM
6	Research On Image Captioning Method Based On Deep Neural Networks And Adaptive Attention Mechanism
7	Research On Text Emotion Classification Algorithm Based On Deep Learning Technology
8	Research On Text Sentiment Classification Method Based On Deep Learning
9	Group Activity Recognition Algorithm Research Based On Attention Mechanism And Deep Learning Network
10	Research On Aspect Level Sentiment Classification Based On Deep Learning