
Semi-supervised Text Classification Based On Graph Attention Neural Networks

Posted on: 2023-03-13    Degree: Master    Type: Thesis
Country: China    Candidate: N N Tao    Full Text: PDF
GTID: 2568306806973099    Subject: Electronics and Communications Engineering
Abstract/Summary:
Text classification is one of the most fundamental and important tasks in natural language processing. It discovers patterns among documents and discriminates between their categories. Common methods such as data mining and machine learning classify text or data effectively, providing a more efficient means of acquiring and using text. With the advent of the Web 2.0 era, data and information of all kinds are generated at an explosive rate, which greatly complicates information retrieval. How to classify texts quickly within massive text data, so as to support article recommendation, semantic analysis, information retrieval, information extraction, and machine translation, has become a research hotspot.

This thesis first introduces the construction of a text information graph based on feature fusion. Some text data is relational, so relationships exist between data items: in a citation network, for example, there are citation relationships between texts, and these relationships can clearly indicate whether two or more articles belong to the same category. The text information graph designed in this thesis is constructed by extracting outline features of the text, taking each text as a node and each citation relationship as an edge, and building the graph according to the citation relationships between texts. The different associations between texts are then examined through the multi-layer feature-fusion relationships contained in the graph.

Second, a graph convolutional neural network model with residual connections is proposed. Neural network models are prone to over-smoothing when the number of layers is very deep, which limits the model's ability to extract higher-order information. For much unstructured data, nodes that are far away and only indirectly connected still carry useful information. Therefore, residual connections are introduced into the graph neural network; they ensure that the parameters of a deep network can still be updated and learned, avoiding the over-smoothing problem.

Finally, a semi-supervised text classification model based on a two-layer attention mechanism is proposed. Graph Convolutional Networks (GCNs) are widely used in text classification tasks because of their scalability and efficiency. However, when a GCN performs classification it must read the entire graph structure, which is very large. This thesis therefore exploits the way information propagates in graphs to propose a two-layer attention mechanism at the category level and the node level. It assigns different weights to a node's adjacent nodes and, by attending only to the local graph (the node's neighborhood), avoids reading the entire graph structure, thereby reducing computational complexity and saving storage space.

To demonstrate the effectiveness of the proposed model, comparison experiments were conducted against other models. The experimental results show that the model achieves good results on five benchmark datasets.
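The node-level attention with a residual connection described above can be sketched as follows. This is a minimal illustration, not the thesis's actual implementation: the category-level attention, the training loop, and the feature-fusion graph construction are omitted, the attention scoring function (`tanh` here) is an assumption, and all names (`graph_attention_layer`, `X`, `A`, `W`, `a`) are hypothetical. Each node aggregates only its adjacent nodes (its local graph) with learned weights, and the residual term preserves the node's own representation.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def graph_attention_layer(X, A, W, a):
    """One node-level attention aggregation with a residual connection.

    X: (n, d) node features (one row per text)
    A: (n, n) adjacency matrix built from citation relationships
    W: (d, d) shared linear transform
    a: (2*d,) attention parameter vector (scores a pair of nodes)
    """
    H = X @ W                        # transform every node's features
    n = A.shape[0]
    out = np.zeros_like(H)
    for i in range(n):
        nbrs = np.where(A[i] > 0)[0]
        nbrs = np.append(nbrs, i)    # include a self-loop
        # score each neighbour of node i against node i itself;
        # tanh stands in for whatever nonlinearity the model uses
        scores = np.array(
            [np.tanh(a @ np.concatenate([H[i], H[j]])) for j in nbrs]
        )
        alpha = softmax(scores)      # normalised attention weights
        out[i] = alpha @ H[nbrs]     # weighted sum over the local graph only
    return out + H                   # residual connection: keep the node's own signal

# toy citation graph with 4 texts in a chain: 0-1-2-3
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
W = rng.normal(size=(3, 3))
a = rng.normal(size=(6,))
out = graph_attention_layer(X, A, W, a)
```

Because each node only reads its own neighbourhood rather than the whole adjacency matrix, the per-node cost depends on the node's degree, which is the source of the complexity and storage savings claimed above.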
Keywords/Search Tags:Text classification, Text information graph, Graph attention network, Residual connections, Attention mechanism