
Research On Text Classification Method Based On Graph Convolutional Neural Network

Posted on: 2022-03-03
Degree: Master
Type: Thesis
Country: China
Candidate: B Peng
Full Text: PDF
GTID: 2518306485485974
Subject: Computer Science and Technology
Abstract/Summary:
With the rise of social networking platforms, text has become their main information carrier, and the volume of text data grows rapidly every day. How to handle this mass of text correctly, that is, how to categorize it and put it to use, has become an important problem. In recent years, deep learning methods for text classification have developed rapidly; they can process large-scale text data quickly and accurately and have broad application prospects. This thesis therefore studies deep learning methods for text classification and makes progress in the following two aspects.

(1) A text classification method based on an improved Cluster GCN is proposed. First, to address the over-fitting that insufficient training data may cause, the thesis applies DropEdge to the adjacency matrix A during training. This yields different random deformations of the model's input data without affecting the original features, and the deformations serve as data augmentation. Second, in the classic GCN model for text classification, keeping the embedding of every node in memory leads to excessive memory consumption and high hardware requirements. Cluster GCN, originally developed for graph data, is therefore applied to text classification, and the diagonal weights of the adjacency matrix are increased to strengthen each node's own features. Further, the word and document nodes of the text graph, that is, of the adjacency matrix A, are partitioned so that document and word nodes within the same partition share more connections than nodes in different partitions. Finally, during neighborhood search, only nodes in the same partition are sampled, which reduces memory consumption and improves computational efficiency.

(2) A text classification method based on an improved Fast GCN is proposed. First, one-hot encoding only records whether a word appears; it reflects neither the importance of the word nor the relationships between different words. Using the GloVe model to construct text features lets the features capture both global statistical information and local context information, which improves classification. Second, the classic GCN model for text classification is transductive and cannot classify newly added text. The thesis applies Fast GCN to text classification and treats the original graph convolution in GCN as an integral transform of the embedding function under a probability measure, which removes the dependence on the test data. Finally, Focal Loss is used to balance the contributions of easy-to-classify and difficult-to-classify samples to the total loss, increasing the weight of the loss from difficult samples and improving the final classification results.
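As a rough illustration of three of the building blocks named in the abstract (DropEdge, a weighted diagonal in the adjacency matrix, and Focal Loss), the following Python sketch uses only the standard library. It is not the thesis's implementation: the function names, the `drop_rate`, `self_weight`, and `gamma` parameters, the toy edge list, and the simple constant diagonal weight are all assumptions chosen for illustration. Focal Loss is written in its common form -(1 - p_t)^gamma * log(p_t).

```python
import math
import random


def drop_edge(edges, drop_rate, rng):
    """Keep each edge independently with probability 1 - drop_rate.

    Calling this once per training epoch gives a different random
    deformation of the graph while node features stay untouched.
    """
    return [edge for edge in edges if rng.random() >= drop_rate]


def build_adjacency(num_nodes, edges, self_weight=1.0):
    """Build a dense symmetric adjacency matrix with a weighted diagonal.

    A self_weight above 1.0 is a simplified stand-in (an assumption) for
    increasing the diagonal weight so each node keeps more of its own
    features.
    """
    adj = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        adj[i][j] = 1.0
        adj[j][i] = 1.0
    for i in range(num_nodes):
        adj[i][i] = self_weight
    return adj


def focal_loss(probs, label, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    probs is the predicted class distribution and label the true class.
    With gamma = 0 this reduces to ordinary cross-entropy.
    """
    p_t = probs[label]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Toy graph with four nodes (hypothetical data, for illustration only).
rng = random.Random(0)
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
kept = drop_edge(edges, drop_rate=0.5, rng=rng)
adj = build_adjacency(4, kept, self_weight=2.0)

# A confidently correct sample versus a badly misclassified one.
easy = focal_loss([0.9, 0.1], label=0)
hard = focal_loss([0.1, 0.9], label=0)
```

In this sketch the misclassified sample (`hard`) contributes far more to the loss than the confidently correct one (`easy`), which is the re-weighting effect the abstract describes.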
Keywords/Search Tags: Text classification, Deep learning, DropEdge, GloVe, ClusterGCN, FastGCN