
Research On Sentiment Classification Of Microblog Comments Based On Deep Learning

Posted on: 2022-06-21
Degree: Master
Type: Thesis
Country: China
Candidate: K X Xu
Full Text: PDF
GTID: 2518306566990529
Subject: System theory
Abstract/Summary:
Against the background of the rapid development of the Internet, online social media has grown quickly, and Weibo has become one of the largest social media platforms in China, which makes the sentiment classification of Weibo comments particularly important. In the era of big data and artificial intelligence, sentiment analysis technology is being widely promoted and applied, and deep learning approaches to sentiment classification are increasingly popular. Examining current deep learning methods, we find two problems: the popular BERT model uses character-level encoding for Chinese while English is encoded at the word level, which to some extent makes the model less effective for Chinese classification than for English classification; and existing deep learning models suffer from high redundancy, slow computation, and large memory consumption caused by over-parameterization. This thesis improves existing algorithms in response to these problems and aims to construct an accurate and efficient sentiment classification model for Weibo comments. The main innovations are as follows:

(1) To overcome the limitation of Chinese character-level encoding, this thesis builds the fusion model BERT-CNN on top of mainstream deep learning algorithms, combining the semantic extraction capability of BERT with the local feature extraction capability of CNN to offset the impact of BERT's character-level encoding and improve classification performance. BERT is used to obtain word vectors that carry global sentence features such as semantics, word order, and contextual relations; these vectors are then fed into a CNN to obtain more accurate classification results. The model is compared with ELMo-CNN, GPT, and BERT. Test results on the dataset built in this thesis and on the public simplifiedweibo_4_moods dataset show that the Micro-P, Micro-R, and Micro-F1 of BERT-CNN improve to varying degrees over the other three models. The experiments show that BERT-CNN achieves better text classification results and confirm that the method captures sentence semantics well and delivers good classification performance.

(2) To address the over-parameterization, slow computation, and large memory consumption of the BERT-CNN model, we propose a new compression method, Gradual Replacement, which compresses it into the GRBERT-CNN model. In the early stage, the original modules and their replacement modules are trained together, which allows the replacement modules to better inherit the "characteristics" of the original modules; two substitution strategies are adopted, a constant substitution strategy and a non-constant substitution strategy. In the later stage, all trained replacement modules are combined into a new compressed model that replaces the original model, and final fine-tuning is carried out. The method introduces no additional loss function and keeps the loss function of the Weibo comment classification task. Compared with BERT-CNN, the accuracy of GRBERT-CNN drops by only 1%-2%, retaining about 97% of the original performance, while the model size shrinks from 112 MB to 62 MB (roughly halved), the running time drops from 2.9 h to 1.3 h (about twice the computing speed), and the memory occupation is significantly reduced. This addresses the problems of slow running speed, heavy computation, and high memory consumption, and confirms the feasibility of the method.
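As a rough illustration of how a BERT-CNN fusion of this kind is typically wired together, the sketch below feeds BERT's token-level output into parallel 1-D convolutions followed by max-pooling and a linear classifier. The class name BertCnnClassifier, the bert-base-chinese checkpoint, and all hyperparameters (kernel sizes, filter count, number of classes) are illustrative assumptions, not values taken from the thesis.

import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import BertModel

class BertCnnClassifier(nn.Module):
    """Minimal BERT-CNN sketch; hyperparameters are assumed, not from the thesis."""
    def __init__(self, num_classes=2, kernel_sizes=(2, 3, 4), num_filters=128):
        super().__init__()
        # BERT supplies contextual, character-level vectors for Chinese text.
        self.bert = BertModel.from_pretrained("bert-base-chinese")
        hidden = self.bert.config.hidden_size  # 768 for bert-base
        # Parallel 1-D convolutions capture local n-gram features that
        # character-level encoding alone does not emphasise.
        self.convs = nn.ModuleList(
            nn.Conv1d(hidden, num_filters, k) for k in kernel_sizes
        )
        self.classifier = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, input_ids, attention_mask):
        # (batch, seq_len, hidden) token-level representations from BERT
        token_states = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        x = token_states.transpose(1, 2)  # (batch, hidden, seq_len) for Conv1d
        # Convolve, apply ReLU, then max-pool each feature map over the sequence.
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        features = torch.cat(pooled, dim=1)  # concatenated local features
        return self.classifier(features)     # sentiment logits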
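The gradual-replacement compression described above can likewise be pictured with a minimal sketch in which each compact module is randomly substituted for a group of original modules during early training, with either a constant or an increasing replacement probability, and only the task loss is used. The names (GraduallyReplacedEncoder, replacement_schedule), the grouping scheme, and the schedule shapes are assumptions for illustration; the abstract does not give these implementation details.

import random
import torch.nn as nn

class GraduallyReplacedEncoder(nn.Module):
    """Sketch of module replacement: compact layers stochastically stand in for
    groups of original layers during early training (details assumed)."""
    def __init__(self, original_layers, compact_layers):
        super().__init__()
        self.original = nn.ModuleList(original_layers)
        self.compact = nn.ModuleList(compact_layers)
        self.replace_prob = 0.0  # updated each step by the schedule below

    def forward(self, hidden_states):
        # Each compact module replaces a fixed group of original modules.
        group = len(self.original) // len(self.compact)
        for i, compact_layer in enumerate(self.compact):
            if self.training and random.random() < self.replace_prob:
                hidden_states = compact_layer(hidden_states)        # replacement path
            else:
                for orig_layer in self.original[i * group:(i + 1) * group]:
                    hidden_states = orig_layer(hidden_states)       # original path
        return hidden_states

def replacement_schedule(step, total_steps, constant_p=0.5, constant=True):
    """Constant strategy keeps a fixed probability; the non-constant strategy
    raises it over training until only compact modules remain. Training keeps
    the ordinary classification loss; no extra distillation loss is added."""
    if constant:
        return constant_p
    return min(1.0, step / total_steps)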
Keywords/Search Tags: sentiment classification, BERT-CNN, Gradual Replacement Method, GRBERT-CNN