
Research And Application On Deep Transfer Learning Algorithm In Text Classification

Posted on: 2021-05-24
Degree: Master
Type: Thesis
Country: China
Candidate: Q L Mu
GTID: 2428330629488925
Subject: Engineering
Abstract/Summary:
In the information age, massive amounts of data are generated by information systems such as social networks, video surveillance, and intelligent transportation. How to find valuable information in this data is a topic of continuing research, especially in text mining. Text classification is a central task in text data mining. In recent years, deep learning algorithms have been widely used in text classification because of their powerful feature representation capabilities. Word vectors are the basis for applying deep learning to text classification. The existing Word2vec word vector model does not consider part-of-speech information, so it cannot distinguish the meanings of words well. In addition, deep learning requires a large amount of labeled data to learn latent data features. However, building a large-scale, high-quality labeled data set is very difficult, and deep learning models tend to overfit when data is insufficient. To address these problems, this thesis focuses on the following.

First, a text classification model based on a convolutional neural network with part-of-speech features (TextCNN based on POS Features, POS-TextCNN) is constructed. To address the lack of part-of-speech information in the Word2vec word vector model, the model adds a text representation input layer with part-of-speech features to the input layer of the classic text convolutional neural network and combines it with the word vector representation to form a dual-channel input, which lets the model distinguish polysemous words that Word2vec alone cannot. Sentiment classification experiments on Amazon product review data show that the model's precision, recall, and F1 values are higher than those of the TextCNN model, indicating that the part-of-speech text representation contributes to the POS-TextCNN model.

Second, a transfer learning algorithm based on the POS-TextCNN model (Transfer POS TextCNN, Tr-POS-TextCNN) is proposed. Because the POS-TextCNN model is prone to overfitting when data is insufficient, this thesis introduces transfer learning and transfers relevant knowledge from the source domain to maintain classification accuracy. Cross-domain sentiment classification experiments on Amazon product review data show that, compared with other transfer learning algorithms, the Tr-POS-TextCNN algorithm improves on the accuracy of the best competing algorithm by 1.92%-3.28%. In addition, the thesis experimentally investigates the advantage of the transfer learning version of POS-TextCNN over the non-transfer version, the sensitivity and accuracy of the model with respect to its parameters, and the impact of the target-domain training sample size on transfer learning performance, and draws relevant conclusions.

Third, a product review classification system based on the Tr-POS-TextCNN algorithm is designed and implemented. Applying the algorithm to cross-domain sentiment classification in this system demonstrates its effectiveness and feasibility.
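The dual-channel input described for POS-TextCNN can be illustrated with a minimal sketch. The abstract does not say how part-of-speech features are encoded, so this example assumes that each POS tag is mapped to a vector of the same dimension as the word vector and that the two matrices are stacked as channels. The function names, the tag set, the dimensions, and the random vectors (stand-ins for trained Word2vec embeddings) are all hypothetical.

```python
import random


def build_embedding_table(vocab, dim, seed):
    """Map each symbol to a fixed random vector (stand-in for trained embeddings)."""
    rng = random.Random(seed)
    # Sort so the symbol-to-vector assignment is reproducible across runs.
    return {sym: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for sym in sorted(vocab)}


def dual_channel_input(tagged_tokens, word_table, pos_table, max_len, dim):
    """Return a [2][max_len][dim] nested list: channel 0 = words, channel 1 = POS tags."""
    pad = [0.0] * dim  # zero vector for padding and unknown symbols
    word_channel, pos_channel = [], []
    for word, tag in tagged_tokens[:max_len]:  # truncate long inputs
        word_channel.append(word_table.get(word, pad))
        pos_channel.append(pos_table.get(tag, pad))
    while len(word_channel) < max_len:  # pad short inputs
        word_channel.append(pad)
        pos_channel.append(pad)
    return [word_channel, pos_channel]


# Hypothetical hand-tagged snippet in which "light" appears as ADJ and as VERB.
sentence = [("light", "ADJ"), ("laptop", "NOUN"), ("screens", "NOUN"),
            ("light", "VERB"), ("up", "ADP")]
DIM, MAX_LEN = 4, 6  # assumed sizes, chosen only for illustration

word_table = build_embedding_table({w for w, _ in sentence}, DIM, seed=0)
pos_table = build_embedding_table({t for _, t in sentence}, DIM, seed=1)
x = dual_channel_input(sentence, word_table, pos_table, MAX_LEN, DIM)
```

In this sketch both occurrences of "light" share the same row in the word channel but get different rows in the POS channel, which is the kind of extra signal the dual-channel input is meant to give the convolutional layers.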
Keywords/Search Tags:Transfer learning, Deep learning, Convolutional neural networks, Text classification