Research On Text Classification Method Based On Graph Convolutional Neural Network

Posted on:2022-03-03

Degree:Master

Type:Thesis

Country:China

Candidate:B Peng

Full Text:PDF

GTID:2518306485485974

Subject:Computer Science and Technology

Abstract/Summary:

With the rise of social networking platforms, text has become their main information carrier, and the volume of text data grows rapidly every day. How to handle this mass of text correctly, that is, how to categorize it and put it to use, has become an important problem. In recent years, deep learning methods for text classification have developed rapidly; they can process large-scale text data quickly and accurately and have broad application prospects. This thesis therefore studies deep learning methods for text classification and makes progress in the following two aspects:

(1) A text classification method based on an improved Cluster-GCN is proposed. First, to address the over-fitting that insufficient training data may cause in text classification, this thesis applies DropEdge to the adjacency matrix A during training. This yields different random deformations of the model's input data without affecting the original features, and these deformations serve as data augmentation. Second, the classic GCN model for text classification keeps the embedding of every node in memory, which leads to excessive memory consumption and high hardware requirements. Cluster-GCN, originally aimed at graph classification, is therefore applied to text classification, and the diagonal weights of the adjacency matrix are increased to strengthen each node's own features. Further, the word and document nodes of the text graph, that is, of the adjacency matrix A, are partitioned so that document and word nodes in the same partition have more connections with each other than with nodes in different partitions. Finally, during neighborhood search, only nodes in the same partition are sampled, which reduces memory consumption and improves computational efficiency.

(2) A text classification method based on an improved FastGCN is proposed. First, one-hot encoding only reflects whether a word appears; it can reflect neither the importance of the word nor the relationships between different words. Constructing text features with the GloVe model instead lets the features capture both global statistical information and local context information, which improves classification performance. Second, the classic GCN model for text classification is transductive and cannot classify newly added text data. This thesis applies FastGCN to text classification and treats the original graph convolution in GCN as an integral transform of the embedding function under a probability measure, which removes the dependence on the test data. Finally, Focal Loss is used to weigh the contributions of easy-to-classify and difficult-to-classify samples to the total loss: it increases the weight of the loss from difficult-to-classify samples and improves the final classification performance.
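
Three of the building blocks named in the abstract (randomly dropping edges of the adjacency matrix A, increasing its diagonal weights, and Focal Loss) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the plain list-of-lists matrix, the additive diagonal term `lam`, and the default `gamma=2.0` are all assumptions made for illustration, and the focal loss is shown in its basic form without a class-balancing factor.

```python
import math
import random


def drop_edge(adj, drop_rate, rng):
    """Return a copy of a symmetric adjacency matrix in which each
    off-diagonal edge is removed with probability drop_rate.
    (Hypothetical helper; a DropEdge-style perturbation.)"""
    n = len(adj)
    out = [row[:] for row in adj]
    for i in range(n):
        for j in range(i + 1, n):
            if adj[i][j] != 0 and rng.random() < drop_rate:
                # Remove both directions so the matrix stays symmetric.
                out[i][j] = 0
                out[j][i] = 0
    return out


def enhance_diagonal(adj, lam):
    """Add lam to every diagonal entry so that each node weights its
    own features more strongly. (Assumed simplification of the
    diagonal weighting mentioned in the abstract.)"""
    out = [row[:] for row in adj]
    for i in range(len(adj)):
        out[i][i] += lam
    return out


def focal_loss(probs, label, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t),
    where p_t is the predicted probability of the true class."""
    p_t = probs[label]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


if __name__ == "__main__":
    # Toy graph with three fully connected nodes (illustrative data only).
    adj = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
    rng = random.Random(0)
    perturbed = enhance_diagonal(drop_edge(adj, 0.5, rng), lam=1.0)
    print(perturbed)
    # A confidently correct sample contributes far less than a hard one.
    print(focal_loss([0.9, 0.1], 0), focal_loss([0.1, 0.9], 0))
```

With `gamma=0` the focal loss reduces to ordinary cross-entropy; larger values of `gamma` shrink the loss of well-classified samples, so difficult samples dominate the total.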

Keywords/Search Tags:

Text classification, Deep learning, DropEdge, GloVe, ClusterGCN, FastGCN

Related items

1	Research On Key Technologies Of Chinese Text Classification Based On Deep Learning
2	Research On Application Of News Text Classification Based On Deep Learning
3	Research On Text Classification Of Deep Learning Mixing Model Based On Map Reduce
4	Research And Application Of Text Classification Technology Based On Deep Learning
5	Multitask Text Classification Based On Deep Learning
6	Design And Implementation Of Long Text Classification Algorithm Based On Deep Neural Network
7	Research On Text Classification Based On Deep Neural Network
8	Research On Chinese Text Classification Algorithm Based On Deep Learning
9	Research And Implementation Of News Text Classification System Based On Deep Learning
10	Research On Text Classification Based On Deep Learning And Topic-driven