
Research On Multi-label Text Classification Based On Hybrid Neural Network

Posted on: 2022-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: P W Xiao
Full Text: PDF
GTID: 2518306530980129
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the rapid development of information technology, text information on the Internet has exploded, leading to information overload. Efficient and accurate classification of large volumes of text can alleviate this problem. In real-world scenarios, a text often carries multiple tags, and single-label text classification can no longer characterize such texts accurately, so multi-label text classification has gradually become a basic and important research direction in natural language processing. This thesis therefore studies feature extraction and text representation techniques for multi-label text classification and builds a multi-label text classification model based on a hybrid neural network. The main research work includes the following aspects:

(1) A neural network model can extract features automatically, but a single neural network model extracts insufficient features. A hybrid neural network model fusing the BiGRU and Capsule models is therefore proposed. The BiGRU model uses the context of the current word to extract a global feature representation of the input text; the attention mechanism selects the text features most critical to the classification task by assigning corresponding weights to different features; and the Capsule model extracts a local feature representation of the text through a dynamic routing mechanism. Experiments show that the resulting hybrid model, ATT-BiGRU-Capsule, captures more comprehensive text features, and its performance indicators surpass those of the other comparison models in the experiments.

(2) During model training, traditional text representation methods suffer from insufficient learning and over-simplified vector representations, whereas pre-trained models with dynamic word vectors let the whole model start learning from a better initial state and handle word ambiguity well, so a multi-label text classification method based on a pre-trained model is proposed. First, a BERT-FC model is built by adding a fully connected layer to the output of the BERT-base model to adjust the dimension of the output features; experiments verify that BERT-FC achieves better performance than the other models on AAPD, a classic multi-label English dataset. Then the BERT pre-trained language model is introduced as the embedding layer of the ATT-BiGRU-Capsule model, its dynamic word vectors are fed into the feature extraction layer, and the result is compared with using Word2Vec as the embedding layer. Experiments show that the BERT-based model achieves higher accuracy, which demonstrates the importance of text representation in the feature extraction process and further improves classification accuracy.
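The attention step in (1) can be illustrated with a minimal sketch: each timestep's hidden vector (e.g. a BiGRU output) is given a scalar score, the scores are softmax-normalized into weights, and the weighted sum becomes the pooled text representation. The scoring vector `w` and the toy hidden states below are illustrative assumptions, not values from the thesis.

```python
import math

def attention_pool(hidden_states, w):
    """Simplified attention pooling over a sequence of hidden vectors.

    hidden_states: list of timestep vectors (e.g. BiGRU outputs).
    w: scoring vector; each timestep's score is dot(h_t, w).
    Returns the softmax attention weights and the weighted-sum vector.
    """
    scores = [sum(h_i * w_i for h_i, w_i in zip(h, w)) for h in hidden_states]
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]  # attention weights, sum to 1
    dim = len(hidden_states[0])
    pooled = [
        sum(weights[t] * hidden_states[t][j] for t in range(len(hidden_states)))
        for j in range(dim)
    ]
    return weights, pooled

# Toy example: three 2-dimensional "hidden states"
hs = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
weights, pooled = attention_pool(hs, w=[1.0, 1.0])
```

The timestep whose features score highest against `w` receives the largest weight, which is how the mechanism emphasizes classification-critical features.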
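The multi-label output layer implied by the BERT-FC description typically ends with an independent sigmoid per label rather than a single softmax, so a text can receive several labels at once. A minimal sketch follows; the logits, the AAPD-style label names, and the 0.5 threshold are illustrative assumptions, not details taken from the thesis.

```python
import math

def predict_labels(logits, labels, threshold=0.5):
    """Turn per-label logits into a multi-label prediction.

    Each label is decided independently via a sigmoid, so a text
    can receive zero, one, or several labels at once.
    """
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return [lab for lab, p in zip(labels, probs) if p >= threshold]

# Hypothetical logits for four AAPD-style topic labels
labels = ["cs.CL", "cs.LG", "cs.IR", "math.ST"]
predicted = predict_labels([2.0, -0.3, -1.5, 1.1], labels)
# logits 2.0 and 1.1 have sigmoid > 0.5, so two labels are predicted
```

This per-label decision rule is what distinguishes the multi-label setting from single-label classification, where exactly one class would be chosen by argmax.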
Keywords/Search Tags:Multi-label text classification, Capsule model, Text representation, Pre-training model