Research On Text Classification Of Weibo Trending Hashtag Based On "Qinglang" Action

Posted on: 2022-11-21  Degree: Master  Type: Thesis
Country: China  Candidate: H Y Guan
GTID: 2518306779469544  Subject: Trade Economy
Abstract/Summary:
The development of technology and networks has laid a foundation for the growth of social media applications, and online media increasingly shape how people communicate and obtain information in daily life. Sina Weibo is one such social network medium, and its trending hashtag function has gradually become one of the main ways people follow daily news and current affairs. As the trending hashtag function has grown in popularity, it has generated a large volume of short text data. These texts contain a lot of useful information, but they are numerous, short, highly sparse, poor in shared context, and marked by informal expressions and word co-occurrence, so their features are difficult to extract with common feature extraction methods. Classifying the useful information in short text is therefore challenging. At the same time, in order to "clean up" the chaos on online platforms, the Cyberspace Administration of China carried out the 2021 "Qinglang" special cleaning action. To address the problems of short text classification and to evaluate the effect of the "Qinglang" special action, this thesis works with real Weibo trending hashtag topic data. The specific results are as follows:

(1) Construct the corpus using LDA and Word2vec. Word2vec training is used to address the data sparsity and high dimensionality of earlier text vector representations, and the LDA model is used to weight the word vectors, forming a corpus and producing the data category labels.

(2) Design a short text classification system for Weibo trending hashtag entries in the social media domain, using CNN and BiLSTM on word vectors extended with LDA and Word2vec. Taking the newly constructed corpus as input, a text classification model (WL-CNN-BiLSTM) is built by fusing the deep learning networks CNN and BiLSTM. The model can effectively extract the hidden features in the short text of Weibo trending hashtag entries in order to classify that text by topic. In the evaluation, the WL-CNN-BiLSTM model outperforms CNN, LSTM, GRU, and BiLSTM.

(3) Statistically analyze the classification results. Specifically, kernel density estimation is used to analyze the popularity of Weibo trending hashtag topics before and after the launch of the "Qinglang" special campaign, as well as the dynamic characteristics of, and differences in, the distribution of each topic's popularity. The results show that after the start of the 2021 "Qinglang" special campaign, from June 15, 2021 to December 31, 2021, the popularity of entertainment and advertising topics among Weibo trending hashtag entries declined, while the popularity of topics such as sports, society, and daily life sharing rose. This indicates that the Weibo trending hashtag platform has been effectively rectified and that the action has achieved good results.

However, Weibo trending hashtag content is highly time-sensitive, so the corpus could be further optimized by connecting to the network to keep the model up to date. After extensive experiments with different deep learning methods, this thesis verifies the superiority of the proposed method and contributes to the field of short text classification for new media.
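The kernel density comparison described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the popularity scores are hypothetical placeholder values, the Gaussian kernel and the normal-reference bandwidth rule are assumed choices, and the function names are invented for illustration. It only shows how two estimated densities ("before" and "after" the campaign) might be compared for one topic.

```python
import math
import statistics


def normal_reference_bandwidth(samples):
    """Rule-of-thumb bandwidth h = 1.06 * sigma * n^(-1/5) (an assumed choice)."""
    n = len(samples)
    return 1.06 * statistics.stdev(samples) * n ** (-1 / 5)


def gaussian_kde(samples, bandwidth=None):
    """Return a function x -> estimated density, using a Gaussian kernel."""
    h = bandwidth if bandwidth is not None else normal_reference_bandwidth(samples)
    norm = 1.0 / (len(samples) * h * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)

    return density


# Hypothetical popularity scores for one topic (NOT data from the thesis).
before = [8.1, 7.4, 9.0, 6.8, 8.6, 7.9, 9.3, 8.8]
after = [5.2, 4.7, 6.1, 5.5, 4.9, 6.4, 5.0, 5.8]

kde_before = gaussian_kde(before)
kde_after = gaussian_kde(after)

# Evaluate both densities on a common grid and locate each peak (mode).
step = 0.1
grid = [i * step for i in range(151)]  # 0.0, 0.1, ..., 15.0
mode_before = max(grid, key=kde_before)
mode_after = max(grid, key=kde_after)

print(f"density peak before: {mode_before:.1f}, after: {mode_after:.1f}")
```

A density peak that moves toward lower scores corresponds to the kind of downward shift in popularity the abstract reports for entertainment and advertising topics. Plotting the two curves over the grid would also show changes in spread and shape.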
Keywords/Search Tags:"Qinglang" action, Weibo trending hashtag, Short text classification, CNN, BiLSTM