Research On Text Classification Based On Word Sense Disambiguation And Convolutional Neural Network

Posted on: 2019-10-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Wang
GTID: 2428330572950779
Subject: Computer application technology
Abstract/Summary:
Text classification is a classic and extensively studied problem in Natural Language Processing (NLP). However, traditional text classification represents text with word-frequency statistics and ignores its semantic information. With the rise of deep learning, its powerful ability to learn features automatically has offered a new reference point for NLP tasks. This thesis studies text classification based on deep learning. The specific work is as follows:

(1) Research on word embedding technology. Traditional text representations can suffer from the curse of dimensionality, and their vector representations have the "vocabulary gap" problem. To address this, and following previous research on the model, this thesis trains word vectors with the Skip-gram model, mapping the text data into a low-dimensional, dense, real-valued vector space in which semantic relationships can be computed.

(2) A WSDPooling text classification model is proposed. When a traditional BLSTM performs feature extraction, it ignores the fact that the meaning of a word may differ, or even be opposite, across documents. The WSDPooling model targets this case: it uses the text context extracted by the BLSTM to perform word sense disambiguation on the current word vector. After disambiguation, the document feature map is max-pooled and fed directly into a softmax classifier to complete the text classification task.

(3) A WSDCNN text classification model is proposed. The WSDPooling model ignores the local features of the document, whereas a convolutional neural network (CNN) is able to capture local features. After obtaining the word-sense-disambiguated document feature map, the WSDCNN model therefore introduces a CNN, combining the strength of LSTM in extracting global features with that of CNN in extracting local features to complete the text classification task.

(4) In-depth study of the TensorFlow framework. The proposed WSDPooling and WSDCNN models are evaluated with TensorFlow on four datasets. Both show better results than traditional machine learning algorithms, LSTM models, CNN models and related variants, which verifies that the strengths of recurrent neural networks and convolutional neural networks are complementary in text classification tasks.
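
The final stage described in (2), max-pooling the document feature map and passing the result to a softmax classifier, can be illustrated with a minimal sketch in plain Python. This is not the thesis implementation: the abstract does not specify how the BLSTM context is combined with the word vector, so the sketch assumes the disambiguated feature map is already given as a list of per-word feature rows. The function names, the toy feature values, and the weights and biases are all hypothetical.

```python
import math


def max_over_time(feature_map):
    """Max-pool a (time steps x features) map down to one value per feature."""
    # zip(*feature_map) iterates over feature columns; keep the largest
    # activation seen for each feature across all word positions.
    return [max(column) for column in zip(*feature_map)]


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(feature_map, weights, bias):
    """Pool the feature map, apply one linear layer, then softmax."""
    pooled = max_over_time(feature_map)
    logits = [
        sum(w * x for w, x in zip(row, pooled)) + b
        for row, b in zip(weights, bias)
    ]
    return pooled, softmax(logits)


# Hypothetical disambiguated feature map: 3 words, 3 features per word.
features = [
    [0.1, -0.4, 0.7],
    [0.9, 0.2, -0.3],
    [0.0, 0.5, 0.6],
]
# Hypothetical classifier parameters for 2 classes.
weights = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0],
]
bias = [0.0, 0.0]

pooled, probs = classify(features, weights, bias)
print(pooled)  # [0.9, 0.5, 0.7]
print(probs)
```

Max-over-time pooling yields a fixed-length vector regardless of document length, which is why the pooled features can feed a classifier with a fixed number of weights.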
Keywords/Search Tags: Text Classification, Long Short-Term Memory, Convolutional Neural Network, TensorFlow, Word Embedding