Research On Text Multi Label Classification Algorithm Based On Self-attention Mechanism And Graph Convolution Network

Posted on:2023-01-26

Degree:Master

Type:Thesis

Country:China

Candidate:X Y Liu

GTID:2558307097495064

Subject:Computer technology

Abstract/Summary:

Text multi-label classification assigns each text to a set of associated labels. It is an important research direction in text processing and mainly involves two problems: text feature extraction and label correlation modeling. Current feature extraction networks such as TextCNN extract text features by convolution, which easily ignores the semantic relationships between non-consecutive words in the text and thereby degrades classification results. In addition, unlike single-label classification, multi-label classification must account for the correlation between labels, and current methods tend to ignore the global correlation information of labels. As a result, traditional neural networks such as TextCNN struggle to meet the needs of practical applications.

This paper proposes a text multi-label classification model that introduces a self-attention mechanism and a graph convolutional network to improve both text feature extraction and label correlation modeling. The self-attention mechanism mines the semantic relationships among words and among labels. It generates word vectors that capture the semantic relationships between non-consecutive words in the text, and label vectors that carry the semantic information of related labels. The new word vectors address TextCNN's tendency to ignore the semantic relationships between non-consecutive words during feature extraction. The new label vectors allow label correlation to be considered at the semantic level. When label correlation is modeled with the graph convolutional network, the new label vectors are treated as the vertices of a directed label graph, and the co-occurrence probabilities of labels in the dataset are treated as its edges. The graph convolutional network maps this directed label graph to a set of classifiers in order to mine the global association information of labels.

Experiments show that the proposed method achieves the best results among 6 compared models on the WOS, RCV1-V2, and Toxic datasets, with mAP scores of 69.46%, 74.88%, and 91.58%, respectively. The model improves the mAP score by 2% over the benchmark model TextCNN. To confirm the effectiveness of the self-attention mechanism and the graph convolutional network used in the model, multiple sets of ablation experiments were also carried out.

This paper also designs and implements a learning resource archiving and recommendation system. For resource archiving, we construct a tree-like knowledge point system and build a subject resource dataset of 17,116 texts, collected with a web crawler and by other means. On the text data of this tree-like knowledge point system, the proposed model performs feature extraction and label correlation modeling effectively. Experiments show that it outperforms 6 other models, with an mAP score of 99.26%, which is 3.2% higher than the benchmark model TextCNN. The model therefore has practical value for resource archiving. For resource recommendation, we adopt the UserCF algorithm. To reflect the actual classroom setting, we additionally take time and grade point factors into account when calculating student similarity. Finally, based on the subject resource data, we recommend learning resources that correspond to the knowledge points each student is unsure about.

The proposed model does not yet handle unbalanced data or extremely large label sets. These cases will be addressed in follow-up research to further meet the needs of practical applications.
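The label graph described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes the directed edge from label i to label j is the conditional co-occurrence probability P(j | i) estimated from training label sets, and it uses a simplified row-normalized graph convolution with self-loops. The function names (`cooccurrence_probabilities`, `gcn_layer`) and the toy data are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def cooccurrence_probabilities(label_sets: List[List[int]], num_labels: int) -> Matrix:
    """Estimate P(j | i): fraction of samples with label i that also have label j.

    Assumption: the abstract only says "co-occurrence probability"; the
    conditional form is used here because it yields a directed graph.
    """
    counts = [[0] * num_labels for _ in range(num_labels)]
    occurrences = [0] * num_labels
    for labels in label_sets:
        unique = set(labels)
        for i in unique:
            occurrences[i] += 1
            for j in unique:
                if i != j:
                    counts[i][j] += 1
    return [
        [counts[i][j] / occurrences[i] if occurrences[i] else 0.0
         for j in range(num_labels)]
        for i in range(num_labels)
    ]


def gcn_layer(adj: Matrix, features: Matrix, weight: Matrix) -> Matrix:
    """One propagation step: ReLU(A_hat @ features @ weight).

    A_hat is adj plus self-loops, with each row scaled to sum to 1
    (a simplification chosen for this sketch).
    """
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
    a_hat = [[v / sum(row) for v in row] for row in with_loops]
    propagated = matmul(a_hat, matmul(features, weight))
    return [[max(0.0, v) for v in row] for row in propagated]


# Toy data: four documents over three labels (hypothetical).
label_sets = [[0, 1], [0, 1, 2], [0], [2]]
adjacency = cooccurrence_probabilities(label_sets, num_labels=3)

# Label vectors would come from the self-attention step; identity vectors
# and an identity weight matrix stand in for them here.
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
label_classifiers = gcn_layer(adjacency, identity, identity)
```

In this toy data the edge weights are asymmetric: label 1 always appears together with label 0, but label 0 appears with label 1 in only two of its three documents. Each row of `label_classifiers` would then act as the classifier for one label, for example by taking its dot product with a text feature vector; that final scoring step is also an assumption, not something the abstract specifies.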

Keywords/Search Tags:

Text multi-label classification, Self-attention mechanism, Graph convolutional network

Related items

1	Research On Multi-label Text Classification Based On Text And Label Representation Optimization
2	Research On Multi-Label Text Classification Based On Deep Learning
3	Research On Text Classification Tasks Integrating Label Information
4	Research On Feature Extraction Of Multi-label Text Classification
5	Research On Multi-Label Text Categorization Based On Label Embedding Information
6	A Bad Text Recognition Based On Multi-feature Graph Convolutional Embedding
7	Research On Multi-label Text Classification Methods Based On Attention Mechanism
8	Research On Multi-Label Image Classification Algorithm Based On Graph Convolution Network
9	Research And Application Of Multi-label Classification Algorithm Based On Deep Learning
10	Research On Text Multi-Label Classification Based On Heterogeneous Graph Attention Network