With the development of social networks, the number of short texts, such as chat conversations and online shopping reviews, has increased dramatically, flooding people's lives and exacerbating the problem of information overload. Short-text multi-classification combined with deep learning has become an important technique for alleviating information overload, and feature learning for deep modeling of short texts has long been a research hotspot in both academia and industry. A feature is a piece of core information in a text; whether it is emphasized during text modeling, that is, its feature weight, affects the quality of text modeling and classification. Strengthening feature expression during text modeling is therefore of great significance for improving classification accuracy. Current research mainly enhances feature expression by optimizing feature weights with statistical information or clustering algorithms, but optimizing feature weights within deep-learning-based modeling itself has not been explored in depth. To this end, this paper proposes models that strengthen or weaken feature expression in deep modeling of short texts to optimize feature weights and thereby improve the accuracy of short-text multi-classification.

To strengthen feature expression in deep modeling of short texts and optimize feature weights, this paper first compares the X-LSTM feature-embedding model and the LSTM-ATT attention model. The experimental results show that, compared with the classic LSTM baseline, enhanced feature representation improves the accuracy of short-text multi-classification. Because short texts have limited contextual information, few words, and sparse features, the high-weight attention feature words selected by the LSTM-ATT model have a greater impact on classification accuracy than they do for long texts. Building on LSTM-ATT, an RTR-LSTM-ATT model is proposed to enhance the expression of attention features in short-text modeling. First, following the principle of the attention mechanism, attention feature words are selected by dimensionality reduction and reconstruction of the short text; these words are then integrated into the attention-weight computation of the LSTM-ATT model, so that the model output depends more strongly on the attention feature words. The experimental results show that, compared with the classic LSTM-ATT model, enhancing the expression of attention features effectively improves the accuracy of short-text multi-classification.

Past work, however, focuses only on the completeness of features and ignores whether some features negatively affect short-text classification. Building on RTR-LSTM-ATT, an RC-LSTM-ATT model is therefore proposed to weaken the expression of confusing features in short-text modeling. First, confusing feature words in the short text are selected according to the TF-IDF principle; any attention feature word that is also a confusing feature word is excluded, and the remaining feature words are called strong feature words. The strong feature words are then embedded and integrated into the attention-weight computation of the classic LSTM-ATT model, so that the model output depends less on confusing feature words. The experimental results show that, compared with the RTR-LSTM-ATT model, weakening the expression of confusing features further improves the accuracy of short-text multi-classification.