Research on Deep Learning Based Hierarchical Multi-label Text Classification Algorithms

Posted on: 2022-07-31
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Luo
Full Text: PDF
GTID: 2518306551970299
Subject: Computer Science and Technology
Abstract/Summary:
In the Internet era, traditional television news media are transforming into converged media, and TV news is pushed to Internet terminals where more people can view it. The 14th Five-Year Plan states that the deep integration of media should be promoted, the new mainstream media strengthened, and the level of public cultural services improved. As traditional news media become converged news media, news labeling technology grows more important: it helps analyze and understand news content for sorting and categorization, and it gives Internet users more accurate search and recommendation services. Applying multi-label text classification to label news data in a hierarchical, fine-grained manner can therefore save labor costs and increase the value of news.

Multi-label classification algorithms assign multiple labels to a sample. They are widely used in recommendation systems, public opinion analysis, sentiment classification, and other fields. Labels are correlated with one another, and learning these correlations during modeling is a major challenge. In TV news text classification, the categories form a hierarchy, and different labels relate to different parts of a news text, so the method for fusing label and text features needs to be specifically designed. To address these two problems, we propose two deep learning based hierarchical multi-label classification algorithms. The main work is as follows:

(1) While building a converged media content management platform, we obtained from TV stations a large collection of news releases covering various kinds of news from recent years. Based on these releases, we construct a hierarchical multi-label text classification data set.

(2) We propose a method for hierarchical news classification based on a bidirectional hierarchical attention mechanism. Existing algorithms usually ignore the association between labels or model the dependencies in a unidirectional manner, resulting in the well-known error propagation problem. The proposed model uses a hierarchical attention module to capture the correlation between text content and label embeddings, and models the dependencies between labels at different levels bidirectionally, which yields more accurate features for classification.

(3) We propose a method for hierarchical news classification that incorporates a graph convolutional network. The model addresses insufficient feature fusion between labels and text. For the label tree structure, it uses a graph convolutional network to capture the dependencies between labels and integrates the extracted label features with text features through a multi-head attention mechanism, which improves performance on the news data set.
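The idea in (3) can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a toy three-label tree, uses row-normalized mean aggregation with no learned weight matrix in place of a full graph convolutional layer, and uses a single attention head instead of multi-head attention. All names (`normalized_adjacency`, `gcn_layer`, `label_attention`) and all numbers are hypothetical.

```python
import math

def normalized_adjacency(num_labels, edges):
    """Row-normalized adjacency with self-loops for an undirected label tree."""
    adj = [[0.0] * num_labels for _ in range(num_labels)]
    for i in range(num_labels):
        adj[i][i] = 1.0  # self-loop keeps each label's own embedding
    for parent, child in edges:
        adj[parent][child] = 1.0
        adj[child][parent] = 1.0
    return [[v / sum(row) for v in row] for row in adj]

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj_norm, label_emb):
    """One propagation step: average each label with its tree neighbours, then ReLU.
    Simplification: a real GCN layer would also apply a learned weight matrix."""
    return [[max(0.0, v) for v in row] for row in matmul(adj_norm, label_emb)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def label_attention(label_feats, token_feats):
    """Scaled dot-product attention: labels are queries, tokens are keys and values."""
    scale = math.sqrt(len(token_feats[0]))
    scores = [[sum(q * k for q, k in zip(lab, tok)) / scale for tok in token_feats]
              for lab in label_feats]
    weights = [softmax(row) for row in scores]
    return weights, matmul(weights, token_feats)

# Hypothetical label tree: label 0 is the parent of labels 1 and 2.
edges = [(0, 1), (0, 2)]
label_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical encoder output: four tokens, two features each.
token_feats = [[0.5, 0.1], [0.2, 0.9], [0.7, 0.7], [0.0, 0.3]]

adj_norm = normalized_adjacency(3, edges)
label_feats = gcn_layer(adj_norm, label_emb)
weights, label_specific_text = label_attention(label_feats, token_feats)
```

Each row of `label_specific_text` is a text representation weighted toward the tokens most related to one label, which reflects the observation that different labels relate to different parts of a news text. In a trained model, a per-label classifier would consume these rows.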
Keywords/Search Tags:text classification, hierarchical multi-label, deep learning, pretrained language model, attention mechanism