
Research On Microblog Topic Detection Based On Feature Enhancement And Convolutional Neural Network

Posted on: 2020-03-27
Degree: Master
Type: Thesis
Country: China
Candidate: H T Yang
Full Text: PDF
GTID: 2428330575455087
Subject: Software engineering
Abstract/Summary:
With the rapid development of mobile Internet technology, microblogging has become the leading social media platform in China. With real-time publishing and strong social interaction features, microblogs attracted a large number of users to discussions on the platform within just a few years. People express their views and share their knowledge through microblogs, so microblog messages contain a great deal of information on current political affairs and social hotspots. By analyzing, processing and summarizing microblog data, hot topics can be clearly recognized, which not only helps users keep up with current events in a timely manner, but also helps the government with early warning and handling of public opinion.

Compared with traditional online media, microblogs are characterized by short content, low barriers to entry and high user activity. Topic detection on microblogs is therefore difficult, with problems such as sparse text, noisy content and rapidly changing topics. To address these problems, this research improves existing topic detection methods and proposes two microblog topic detection models based on feature enhancement and a convolutional neural network. The main work of this thesis is as follows:

1. A keyword feature enhancement method based on Word2Vec. To address the short and sparse nature of microblog text, this thesis uses the properties of the Word2Vec word vector space to extend the keywords of a microblog text with words of similar meaning in its word vector representation.

2. A vector representation method for microblog text based on incremental TF-IDF weighting. Considering that microblog data is updated in real time, this thesis proposes a feature weight calculation method based on incremental TF-IDF, and uses a weighted average to vectorize the microblog text.

3. A Single-Pass clustering algorithm based on time attenuation. Because microblog topics are time-sensitive, this thesis proposes a method for computing the similarity of microblog texts based on time attenuation, and introduces the concept of a "cluster center" into the traditional Single-Pass clustering algorithm, which improves the efficiency and accuracy of the clustering algorithm.

4. A microblog text classifier based on a convolutional neural network. Drawing on the hierarchical characteristics of microblog topics and the convolutional neural network model, this thesis constructs a simple text classifier for microblogs. The classifier categorizes microblog texts effectively and addresses the interference of noisy information during clustering.

5. Two microblog topic detection models. On the basis of the above work, this thesis proposes the F-MTD model, which detects microblog topics based on feature enhancement, and the FCNN-MTD model, which is based on feature enhancement and a convolutional neural network. The FCNN-MTD model first divides microblog data into subject categories, and then clusters the different categories of microblog data in parallel to find hot topics.

Finally, this thesis designs comparative experiments on open microblog data sets. The experimental results show that the proposed methods perform well.
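The Single-Pass clustering with time attenuation and cluster centers described in item 3 can be illustrated with a minimal Python sketch. The abstract does not give the thesis's actual formulas, so everything specific here is an assumption made for illustration: the exponential decay factor, the `threshold` and `decay` values, the running-mean definition of the cluster center, and the names `cosine`, `Cluster` and `single_pass`. Text vectors are taken as already computed (for example, by the weighted average in item 2).

```python
import math
from dataclasses import dataclass


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


@dataclass
class Cluster:
    center: list      # running mean of member vectors (assumed "cluster center")
    size: int         # number of texts assigned so far
    last_time: float  # timestamp (hours) of the most recent member

    def add(self, vec, t):
        # Update the running mean incrementally, so each new text is compared
        # with one center per cluster instead of with every member.
        self.size += 1
        self.center = [c + (x - c) / self.size for c, x in zip(self.center, vec)]
        self.last_time = max(self.last_time, t)


def single_pass(posts, threshold=0.6, decay=0.05):
    """Cluster (vector, timestamp_hours) pairs in arrival order.

    Returns (clusters, labels), where labels[i] is the cluster index of post i.
    """
    clusters, labels = [], []
    for vec, t in posts:
        best, best_sim = None, 0.0
        for idx, c in enumerate(clusters):
            # Assumed time attenuation: similarity shrinks exponentially with
            # the gap between the post and the cluster's latest activity.
            sim = cosine(vec, c.center) * math.exp(-decay * abs(t - c.last_time))
            if sim > best_sim:
                best, best_sim = idx, sim
        if best is not None and best_sim >= threshold:
            clusters[best].add(vec, t)
            labels.append(best)
        else:
            # No sufficiently similar, sufficiently recent cluster: new topic.
            clusters.append(Cluster(list(vec), 1, t))
            labels.append(len(clusters) - 1)
    return clusters, labels


posts = [([1.0, 0.0], 0.0), ([0.9, 0.1], 1.0), ([0.0, 1.0], 2.0), ([1.0, 0.0], 100.0)]
clusters, labels = single_pass(posts)
print(labels)  # [0, 0, 1, 2]
```

In this toy run the last post has almost the same direction as the first cluster's center, but it arrives 99 hours later, so the decayed similarity falls below the threshold and it starts a new topic. With `decay=0.0` the same post would rejoin the first cluster.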
Keywords/Search Tags: Microblog, Topic Detection, Feature Enhancement, Convolutional Neural Network, Time Attenuation