Research On Short Text Classification Based On Graph Attention Networks

Posted on:2021-10-03

Degree:Master

Type:Thesis

Country:China

Candidate:T Nie

GTID:2518306104988449

Subject:Computer application technology

Abstract/Summary:

With the spread of smart devices, daily life generates a large volume of fragmented short texts, such as mobile phone text messages, social media posts, search queries, and product reviews. To mine more of the potential business value in this data, short text classification has received growing attention. The particular properties of short text make it harder to classify than long text. On the one hand, short texts are brief and their grammar is often nonstandard, which leads to sparse features and a lack of information; on the other hand, short text data is produced in large volumes and updated quickly, yet large amounts of labeled training data are lacking. This thesis takes short text classification as its main research goal and, after analyzing the advantages and disadvantages of different classification algorithms, proposes a short text classification algorithm based on graph attention networks. The main contents include:

(1) The Co-occurrence Information Model (CIM) is proposed to construct a graph structure for short text data sets, so that the supplementary information in the graph structure can effectively alleviate the sparsity of short text data. Specifically, the short texts in the corpus are segmented into words, the words and short texts are treated as nodes in the graph, and co-occurrence information is used to construct word-word, word-text, and text-text edges. The co-occurrence information is obtained from PMI, TF-IDF, and cosine similarity.

(2) A graph neural network classification model is applied to the constructed graph to classify the short text nodes. Specifically, a graph convolutional network (Graph Convolutional Networks, GCN) is used as the base model to build the CIM-GCN model, and its advantages and disadvantages are analyzed in principle; an attention mechanism on the graph is then introduced as an improvement, using a graph attention network (Graph Attention Networks, GAT) to obtain the CIM-GAT model; further, to extract and fuse attention features from different feature subspaces, the CIM-MGATs model is proposed, drawing mainly on the idea of multi-head attention.

(3) To overcome the lack of training data, a graph-based semi-supervised learning method is constructed. Labeled and unlabeled data are used together to build the graph, which enriches its structural information; the entire graph is then modeled so that label information and data features propagate effectively through the graph structure, yielding the final representations and predictions for all nodes in the graph.

Finally, experiments on short text classification data sets such as HR and MR show that the CIM-GAT and CIM-MGATs models, which are based on graph attention networks, not only achieve higher classification accuracy than other models but are also more robust to the size of the training data.
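The graph construction described in item (1) can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract only names PMI, TF-IDF, and cosine similarity, so the function name `build_cim_graph`, the `sim_threshold` parameter, the whitespace segmentation, the use of each short text as one co-occurrence window, and the positive-PMI filter are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def build_cim_graph(docs, sim_threshold=0.2):
    """Return a dict mapping (node_a, node_b) -> edge weight.

    Nodes are ("word", w) or ("text", i). The threshold and weighting
    details are assumptions; the abstract only names PMI, TF-IDF, and
    cosine similarity as the sources of co-occurrence information.
    """
    # Naive whitespace segmentation (assumption; real corpora need a tokenizer).
    tokens = [doc.lower().split() for doc in docs]
    n_docs = len(tokens)

    # Document frequency of each word and of each co-occurring word pair,
    # treating every short text as a single co-occurrence window.
    df = Counter()
    pair_df = Counter()
    for toks in tokens:
        vocab = sorted(set(toks))
        df.update(vocab)
        pair_df.update(combinations(vocab, 2))

    edges = {}

    # word-word edges: PMI = log(p(w1, w2) / (p(w1) * p(w2))), positive only.
    for (w1, w2), n12 in pair_df.items():
        pmi = math.log(n12 * n_docs / (df[w1] * df[w2]))
        if pmi > 0:
            edges[(("word", w1), ("word", w2))] = pmi

    # word-text edges: TF-IDF weight of the word within the text.
    tfidf = []
    for i, toks in enumerate(tokens):
        tf = Counter(toks)
        vec = {w: (c / len(toks)) * math.log(n_docs / df[w]) for w, c in tf.items()}
        tfidf.append(vec)
        for w, weight in vec.items():
            if weight > 0:
                edges[(("word", w), ("text", i))] = weight

    # text-text edges: cosine similarity between TF-IDF vectors.
    def cosine(u, v):
        dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
        norm_u = math.sqrt(sum(x * x for x in u.values()))
        norm_v = math.sqrt(sum(x * x for x in v.values()))
        return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

    for i, j in combinations(range(n_docs), 2):
        sim = cosine(tfidf[i], tfidf[j])
        if sim >= sim_threshold:
            edges[(("text", i), ("text", j))] = sim

    return edges


if __name__ == "__main__":
    sample = ["great movie great cast", "great movie", "poor battery", "poor battery life"]
    for edge, weight in sorted(build_cim_graph(sample).items()):
        print(edge, round(weight, 3))
```

In a semi-supervised setting such as the one in item (3), labeled and unlabeled texts would both be passed in `docs`, so that unlabeled texts contribute nodes and edges even though only the labeled text nodes supply a training signal.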

Keywords/Search Tags:

Short Text Classification, Co-occurrence Information, Graph Neural Network, Attention Mechanism, Semi-supervised Learning

Related items

1	Research On Text Classification Algorithm Based On Graph Neural Network
2	Graph Neural Network With Mixed High And Low Order Information And Its Application In Semi-supervised Classification
3	Text Classification Based On Semi-supervised Learning
4	Research On Key Technologies Of Short Text Classification Based On Deep Learning
5	Research On Short Text Classification Of Semi-supervised Pre-training Based On Autoencoders And Word Order Dependencies
6	Key Information Extraction Of Sequence Data Based On Deep Neural Network
7	Research On Improved DenseNet Algorithm In Short Text Classification
8	Text Sentiment Classification Based On Attention Mechanism
9	Improved BP Neural Network Combined With Semi-supervised Algorithm And Its Application On Text Classification
10	Research On Short Text Classification Based Upon Convolution Feature Encoding And Attention Mechanism