
Research On Multi-label Text Classification Based On Hybrid Neural Network

Posted on: 2022-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: P W Xiao
Full Text: PDF
GTID: 2518306530980129
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the rapid development of information technology, text information on the Internet has exploded, leading to information overload. Efficient and accurate classification of large volumes of text can alleviate this problem. In real-world scenarios, a text often carries multiple tags, and single-label text classification can no longer characterize such texts accurately, so multi-label text classification has gradually become a basic and important research direction in natural language processing. This thesis therefore studies feature extraction and text representation techniques for multi-label text classification and builds a multi-label text classification model based on a hybrid neural network. The main research work includes the following aspects:

(1) A neural network model can extract features automatically, but a single neural network model extracts insufficient features. A hybrid neural network model fusing the BiGRU and Capsule models is therefore proposed. The BiGRU model uses the context of the current word to extract a global feature representation of the input text; the attention mechanism selects the text features most critical to the classification task by assigning corresponding weights to different features; and the Capsule model extracts a local feature representation of the text through a dynamic routing mechanism. Experiments show that the resulting hybrid model, ATT-BiGRU-Capsule, captures more comprehensive text features, and its performance indicators surpass those of the other comparison models in the experiments.

(2) During model training, traditional text representation methods suffer from insufficient learning and over-simplified vector representations, whereas pre-trained models with dynamic word vectors let the whole model start learning from a better initial state and handle word ambiguity well, so a multi-label text classification method based on a pre-trained model is proposed. First, a BERT-FC model is built by adding a fully connected layer to the output of the BERT-base model to adjust the dimension of the output features; experiments verify that BERT-FC achieves better performance than the other models on AAPD, a classic multi-label English dataset. Then the BERT pre-trained language model is introduced as the embedding layer of the ATT-BiGRU-Capsule model, its dynamic word vectors are fed into the feature extraction layer, and the result is compared with using Word2Vec as the embedding layer. Experiments show that the BERT-based model achieves higher accuracy, which demonstrates the importance of text representation in the feature extraction process and further improves classification accuracy.
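The attention step in (1) can be illustrated with a minimal sketch: each timestep's hidden vector (e.g. a BiGRU output) is given a scalar score, the scores are softmax-normalized into weights, and the weighted sum becomes the pooled text representation. The scoring vector `w` and the toy hidden states below are illustrative assumptions, not values from the thesis.

```python
import math

def attention_pool(hidden_states, w):
    """Simplified attention pooling over a sequence of hidden vectors.

    hidden_states: list of timestep vectors (e.g. BiGRU outputs).
    w: scoring vector; each timestep's score is dot(h_t, w).
    Returns the softmax attention weights and the weighted-sum vector.
    """
    scores = [sum(h_i * w_i for h_i, w_i in zip(h, w)) for h in hidden_states]
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]  # attention weights, sum to 1
    dim = len(hidden_states[0])
    pooled = [
        sum(weights[t] * hidden_states[t][j] for t in range(len(hidden_states)))
        for j in range(dim)
    ]
    return weights, pooled

# Toy example: three 2-dimensional "hidden states"
hs = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
weights, pooled = attention_pool(hs, w=[1.0, 1.0])
```

The timestep whose features score highest against `w` receives the largest weight, which is how the mechanism emphasizes classification-critical features.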
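The multi-label output layer implied by the BERT-FC description typically ends with an independent sigmoid per label rather than a single softmax, so a text can receive several labels at once. A minimal sketch follows; the logits, the AAPD-style label names, and the 0.5 threshold are illustrative assumptions, not details taken from the thesis.

```python
import math

def predict_labels(logits, labels, threshold=0.5):
    """Turn per-label logits into a multi-label prediction.

    Each label is decided independently via a sigmoid, so a text
    can receive zero, one, or several labels at once.
    """
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return [lab for lab, p in zip(labels, probs) if p >= threshold]

# Hypothetical logits for four AAPD-style topic labels
labels = ["cs.CL", "cs.LG", "cs.IR", "math.ST"]
predicted = predict_labels([2.0, -0.3, -1.5, 1.1], labels)
# logits 2.0 and 1.1 have sigmoid > 0.5, so two labels are predicted
```

This per-label decision rule is what distinguishes the multi-label setting from single-label classification, where exactly one class would be chosen by argmax.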
Keywords/Search Tags:Multi-label text classification, Capsule model, Text representation, Pre-training model