| With multiple login methods, rich content, and a low barrier to entry, micro-blogging has become a social platform for information sharing in a very short time. Micro-blog users post short messages (a sentence, a picture, a video, or even an emoticon) from computers, mobile phones, and other devices, sharing what they see, hear, and feel anytime and anywhere. Micro-blogging's strong media-convergence capability and its fragmentary style of expression produce a dynamic flow of information that has changed how online public opinion is generated, propagated, and exchanged. Around hot topics online, an individual's self-regulation is easily overwhelmed by the mood of the group, producing a one-sided attitude. To create a positive and healthy public opinion environment on micro-blogs, efficient and rapid detection of micro-blog hot topics is necessary. This paper therefore starts from micro-blog content, combines it with the micro-blog context, and uses clustering to detect hot topics. First, because micro-blog short texts have sparse features and heavy redundancy, the texts are preprocessed to clean out irrelevant information such as emoticons and stop words. The short texts are then represented with the vector space model and, drawing on the context of each post, its metadata is added to expand the feature space, which to some extent alleviates the fragmentation and sparsity of micro-blog text. Second, this paper improves the classical Single-Pass algorithm. In the traditional incremental Single-Pass algorithm, problems with the topic cluster centers easily degrade the clustering of latent topics; the improved algorithm effectively addresses the offset in clustering results and improves clustering accuracy, providing a technical basis for hot topic detection. Finally, experiments test whether the proposed approach improves on the original, and a hot topic evaluation model is established to assess the quality of micro-blog short text clustering. The experiments show that expanding the micro-blog feature space and using the improved Single-Pass dynamic clustering algorithm yield substantial improvements, including in precision, indicating that the ideas and algorithm of this paper have high theoretical and practical value. |
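
For orientation, the classical Single-Pass procedure mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration of the textbook baseline, not the improved algorithm of this paper, whose details the abstract does not give. The whitespace tokenizer, the tiny English stop-word list, the raw term-frequency weights, the threshold of 0.3, and all function names are assumptions made for the example; Chinese micro-blog text would additionally need a word segmenter.

```python
import math
from collections import Counter

# Illustrative stop-word list (an assumption; a real system would use a full list).
STOP_WORDS = {"the", "a", "is", "of", "in", "to", "and"}


def vectorize(text):
    """Bag-of-words term-frequency vector after stop-word removal."""
    tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
    return Counter(tokens)


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def single_pass(texts, threshold=0.3):
    """Classical Single-Pass clustering: each text is seen exactly once.

    A text joins the most similar existing cluster if the similarity
    reaches the threshold; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"centroid": dict, "members": [indices]}
    for i, text in enumerate(texts):
        vec = vectorize(text)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            # Update the centroid as the running mean of member vectors.
            n = len(best["members"])
            terms = set(best["centroid"]) | set(vec)
            best["centroid"] = {
                t: (best["centroid"].get(t, 0.0) * n + vec.get(t, 0)) / (n + 1)
                for t in terms
            }
            best["members"].append(i)
        else:
            clusters.append({"centroid": dict(vec), "members": [i]})
    return clusters
```

Because each post is compared only against the centroids that exist when it arrives, the result depends on input order and on how the centroids move as members are added, which is the kind of sensitivity that improvements to Single-Pass usually target.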