Research On Structure-Enhanced Hierarchical Text Classification Algorithm

Posted on: 2024-01-06
Degree: Master
Type: Thesis
Country: China
Candidate: H K Liu
Full Text: PDF
GTID: 2558307181454394
Subject: Electronic Information (in the field of computer technology) (professional degree)
Abstract/Summary:
Text is the most accessible type of information and accounts for the largest volume of data. Faced with the explosive growth of text of all kinds, people introduce labels to manage it. Labeling helps analyze and understand text content and supports more accurate search and recommendation services for Internet users. However, a traditional single label cannot capture a deeper understanding of the data. Labeling text hierarchically and at a fine granularity can therefore reduce labor costs and increase the practical value of the text. A hierarchical multi-label classification algorithm assigns one or more labels to each text from a manually predefined label tree. The labels in this tree are organized hierarchically, and each sample correlates with each label to a different degree. Consequently, the main issues in this field are how accurately the label tree structure is defined, how correlations between labels are modeled, and how text features and label features are fused. To address these issues, two hierarchical multi-label classification algorithms based on deep learning are proposed. The details are as follows:

(1) A multi-label classification model that integrates label structure is proposed. It constructs a label semantic structure, combines it with the predefined label hierarchy, and learns the features common to the two structures through a graph convolutional neural network with shared parameters. Label features and text features are then dynamically connected, and a simulated label distribution is constructed as a soft target to improve the robustness of the model.

(2) An improved hierarchical global awareness model is proposed. The traditional method updates label features through neighborhood aggregation, which leads to poor-quality features for low-level labels and does not account for the class imbalance caused by chained multi-labels. A common density coefficient is proposed so that the improved hierarchical global awareness model makes fuller use of the label hierarchy, learns higher-quality label and text features, and thereby improves classification accuracy.
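The shared-parameter graph convolution mentioned in (1) can be illustrated with a minimal sketch. It is not the thesis implementation: the toy three-label graphs, the feature values, the undirected treatment of the hierarchy, the symmetric normalization, and the fusion by summation are all assumptions chosen only to show one weight matrix being applied to both a hierarchy graph and a semantic graph.

```python
import math

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square adjacency matrix (list of lists)."""
    n = len(adj)
    # Add self-loops so every label keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain matrix product for lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj_norm, features, weight):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    hidden = matmul(matmul(adj_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in hidden]

# Hypothetical label tree: label 0 is the root, labels 1 and 2 are its children.
hierarchy_adj = [[0, 1, 1],
                 [1, 0, 0],
                 [1, 0, 0]]
# Hypothetical semantic graph: labels 1 and 2 are semantically similar.
semantic_adj = [[0, 0, 0],
                [0, 0, 1],
                [0, 1, 0]]

# Hypothetical 2-dimensional initial label features, one row per label.
label_features = [[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]]
# A single weight matrix shared by both graphs (assumed values).
shared_weight = [[0.5, -0.2],
                 [0.1, 0.3]]

hier_out = gcn_layer(normalize_adjacency(hierarchy_adj), label_features, shared_weight)
sem_out = gcn_layer(normalize_adjacency(semantic_adj), label_features, shared_weight)

# Assumed fusion step: element-wise sum of the two views of each label.
fused = [[h + s for h, s in zip(h_row, s_row)] for h_row, s_row in zip(hier_out, sem_out)]
```

Because `shared_weight` is reused for both graphs, any training signal would update the same parameters from both structures, which is one plausible reading of how features common to the two structures could be learned.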
Keywords/Search Tags: Text classification, Hierarchical and multi-label, Deep learning