Research On Short Text Classification Method Based On Contextual Feature Expression

Posted on: 2022-12-18
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Wang
GTID: 2518306767977389
Subject: Intelligent computing and systems
Abstract/Summary:
On the Internet, forum posts, movie reviews, product reviews and replies, consultations, suggestions, and instant-messaging records (MSN/QQ/WeChat) are usually short texts. Automatic classification of such content has a wide range of uses, such as judging from a movie review whether the audience's attitude toward the film is positive, negative, or neutral. Short text classification based on natural language processing has therefore become a research hotspot.

Text classification methods fall into two main categories: those based on traditional machine learning and those based on deep learning. In traditional machine learning methods, classification accuracy depends closely on the quality of text feature extraction. Feature extraction sometimes requires manual processing, and when the extracted features are of low quality, classification accuracy is unsatisfactory. Deep learning methods train models such as CNNs (Convolutional Neural Networks) directly on the data without manual feature extraction, and their classification accuracy depends more on the size of the dataset and the number of training iterations.

Compared with long text, short text carries less information and allows greater freedom of expression, which makes classification difficult. To improve classification accuracy, it is necessary to fully mine the contextual information in the text, extract its deep semantic features, and use those features to optimize short text classification.

To address these problems, this thesis completes the following work on short text classification:

(1) A short text classification model based on contextual feature expression, BBLNN (BERT-BiLSTM-Neural-Network), is proposed. The model first uses BERT (Bidirectional Encoder Representations from Transformers) to extract low-level contextual features from the input text. Second, to further explore the correlations among these low-level contextual features and strengthen feature expression, it applies BiLSTM (Bidirectional Long Short-Term Memory) to model the features in both directions. Finally, it outputs the text category in an end-to-end manner.

(2) The performance of the BBLNN model on the short text classification task is verified. Experiments on the public dialogue dataset MRDA show that the classification accuracy of the BBLNN model can reach 89.92%, which is better than some advanced short text classification methods.
Keywords/Search Tags:short text classification, deep learning, BERT model, Bidirectional Long Short-Term Memory Neural Network
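
The data flow described in the abstract (contextual token features, then bidirectional LSTM modeling, then a classification layer) can be illustrated with a small dependency-free Python sketch. It is not the thesis implementation. Several things here are assumptions made only for illustration: random vectors stand in for BERT's per-token output, all weights are untrained, the sizes `EMBED_DIM`, `HIDDEN`, and `NUM_CLASSES` are arbitrary, and the choice to concatenate the final forward and backward hidden states before a softmax layer is one common option that the abstract does not specify.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class LSTMCell:
    """A standard LSTM cell with input, forget, output, and candidate gates."""

    GATES = ("i", "f", "o", "g")

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One weight matrix per gate, applied to the concatenation [x; h].
        self.W = {k: init_matrix(hidden_size, input_size + hidden_size, rng)
                  for k in self.GATES}
        self.b = {k: [0.0] * hidden_size for k in self.GATES}

    def step(self, x, h, c):
        z = x + h  # list concatenation of the input and previous hidden state
        pre = {k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])]
               for k in self.GATES}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["g"]]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, cand)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


def run_lstm(cell, sequence):
    # Run the cell over the sequence and return the final hidden state.
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    for x in sequence:
        h, c = cell.step(x, h, c)
    return h


def bilstm_features(fwd_cell, bwd_cell, sequence):
    # Forward pass reads left to right; backward pass reads right to left.
    forward = run_lstm(fwd_cell, sequence)
    backward = run_lstm(bwd_cell, list(reversed(sequence)))
    return forward + backward


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical sizes chosen for the demo only.
EMBED_DIM, HIDDEN, NUM_CLASSES, NUM_TOKENS = 8, 4, 3, 6
rng = random.Random(0)

# Stand-in for BERT output: one contextual vector per token (random here).
token_vectors = [[rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
                 for _ in range(NUM_TOKENS)]

fwd_cell = LSTMCell(EMBED_DIM, HIDDEN, rng)
bwd_cell = LSTMCell(EMBED_DIM, HIDDEN, rng)
W_out = init_matrix(NUM_CLASSES, 2 * HIDDEN, rng)  # untrained output layer

features = bilstm_features(fwd_cell, bwd_cell, token_vectors)
probs = softmax(matvec(W_out, features))
predicted_class = probs.index(max(probs))
```

In a real system the token vectors would come from a pretrained BERT encoder and all parameters would be trained end to end in a deep learning framework. The sketch only shows how the bidirectional pass combines left-to-right and right-to-left context into a single feature vector before classification.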