Text classification aims to quickly and accurately filter valuable information out of massive text data. Although text classification methods based on graph neural networks have attracted wide attention for their ability to handle non-Euclidean structured data and for their classification accuracy, they still have shortcomings. First, the traditional graph structure is limited in its ability to express the correlations present in text data. Second, current graph neural network models rely only on word co-occurrence to construct text graphs, ignoring hidden information in the text from other perspectives, so effective text representations cannot be built from homogeneous nodes alone. In view of these problems in current graph-neural-network-based text classification, the research content of this paper is as follows.

Existing graph neural network text classification models can only learn pairwise binary relations between words, ignoring the multivariate higher-order relations among them, and they represent contextual semantic information and local feature information insufficiently. This paper therefore proposes a hypergraph attention network text classification model with multi-feature fusion. First, multivariate higher-order connections between words are learned by introducing a hypergraph structure in place of the traditional graph structure. Second, a word-order hypergraph, a syntactic hypergraph, and a semantic hypergraph are constructed from the text data to enrich the feature representation of the text, compensating for the insufficient text representation of conventional graph neural networks. Then, a dual graph attention network learns the embedding of each word node and of each relational hyperedge, and a self-attention graph pooling module extracts the important word nodes in the text, helping the network capture deep local features and represent the text more faithfully. Finally, the three types of text features are fused adaptively to generate the final text embedding vector, improving classification performance.

Short text classification further suffers from sparse data and a lack of sufficient contextual semantic information: traditional homogeneous graphs cannot build effective, rich feature representations for such sparse short texts. This paper therefore proposes a knowledge-enhanced heterogeneous graph attention network for short text classification. First, the model alleviates data sparsity by building a heterogeneous graph that fuses words, documents, tags, topics, and entity concept information. Second, a heterogeneous graph attention network replaces the traditional graph convolutional network: a dual-level attention mechanism, operating at the node level and at the type level, assigns weights to the heterogeneous nodes and thus describes more accurately how important each type of node is to the current text representation. Then, the learned word-node representations are concatenated with BERT word embeddings and fed into a bidirectional long short-term memory network to mine contextual semantic features of the short text. Finally, these features are fused with the text features obtained by the graph attention network for short text classification. The model combining these two methods effectively resolves the problems of data sparsity and lack of sufficient context information
in short text classification.
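The hypergraph idea in the first model can be illustrated with a minimal sketch: a sliding window defines word-order hyperedges (each hyperedge joins every word in one window, capturing higher-order rather than pairwise relations), and node features are aggregated node → hyperedge → node with softmax attention. All names, the window size, and the dot-product attention scores below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def build_word_order_hypergraph(tokens, window=3):
    """Incidence matrix H (num_words x num_hyperedges): one hyperedge
    per sliding window, joining every word inside that window."""
    vocab = sorted(set(tokens))
    idx = {w: i for i, w in enumerate(vocab)}
    n_edges = max(len(tokens) - window + 1, 1)
    H = np.zeros((len(vocab), n_edges))
    for j in range(n_edges):
        for w in tokens[j:j + window]:
            H[idx[w], j] = 1.0
    return vocab, H

def hypergraph_attention_layer(X, H, q):
    """One node -> hyperedge -> node pass. A fixed context vector q and
    dot-product scores stand in for the model's learned attention."""
    E = np.zeros((H.shape[1], X.shape[1]))
    for j in range(H.shape[1]):            # hyperedge embeddings
        members = np.flatnonzero(H[:, j])
        a = softmax(X[members] @ q)        # attention over member words
        E[j] = a @ X[members]
    X_new = np.zeros_like(X)
    for i in range(H.shape[0]):            # node update from incident edges
        incident = np.flatnonzero(H[i])
        a = softmax(E[incident] @ q)       # attention over hyperedges
        X_new[i] = a @ E[incident]
    return X_new

tokens = "graph neural networks classify text with graph structure".split()
vocab, H = build_word_order_hypergraph(tokens, window=3)
rng = np.random.default_rng(0)
X = rng.standard_normal((len(vocab), 8))
X_out = hypergraph_attention_layer(X, H, rng.standard_normal(8))
```

Each row of `H` records which hyperedges a word belongs to, so a single aggregation step already mixes information among all words of a window at once, which a pairwise edge cannot do.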
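The adaptive fusion step of the first model can also be sketched: the three hypergraph views (word-order, syntactic, semantic) each yield a text vector, and softmax-normalized gates weight them before summation. The gate parameterization here (a dot product with a per-view vector) is a hypothetical stand-in for whatever learnable gating the model uses.

```python
import numpy as np

def adaptive_fusion(features, gate_w):
    """Fuse K feature vectors with softmax-normalized scalar gates, so
    the fused vector is a convex combination of the K views."""
    scores = np.array([f @ w for f, w in zip(features, gate_w)])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    fused = sum(w * f for w, f in zip(weights, features))
    return fused, weights

rng = np.random.default_rng(1)
d = 16
views = [rng.standard_normal(d) for _ in range(3)]  # word-order / syntactic / semantic
gates = [rng.standard_normal(d) for _ in range(3)]
fused, weights = adaptive_fusion(views, gates)
```

Because the weights sum to one, a view that happens to be uninformative for a given text can be down-weighted without retraining the other branches.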
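The dual-level attention of the second model can be sketched in the same spirit: node-level attention weights the neighbors within each node type (word, topic, entity, ...), and type-level attention weights the resulting per-type summaries. Scoring both levels by similarity to the document node is an illustrative simplification of the model's learned attention functions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def dual_level_attention(doc, neighbors, types):
    """Node-level: within each type, weight neighbors by similarity to
    the document node. Type-level: weight the per-type summaries the
    same way, then return their weighted sum."""
    type_ids = sorted(set(types))
    type_vecs = []
    for t in type_ids:
        M = np.stack([v for v, tv in zip(neighbors, types) if tv == t])
        a = softmax(M @ doc)          # node-level weights within type t
        type_vecs.append(a @ M)
    T = np.stack(type_vecs)
    b = softmax(T @ doc)              # type-level weights
    return b @ T, dict(zip(type_ids, b))

rng = np.random.default_rng(2)
d = 8
doc = rng.standard_normal(d)
neighbors = [rng.standard_normal(d) for _ in range(5)]
types = ["word", "word", "topic", "entity", "entity"]
msg, type_weights = dual_level_attention(doc, neighbors, types)
```

Separating the two levels is what lets the model express "topic nodes matter more than entity nodes for this document" independently of which individual neighbors dominate within each type.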