Research On Sensitive Topic Detection And Tracking In Social Network

Posted on: 2019-03-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Yin
Full Text: PDF
GTID: 2428330566967004
Subject: Computer application technology
Abstract/Summary:
With the popularity of social networks, and especially the widespread use of social media represented by micro-blogs, the micro-blog has become a new way for people to publish and obtain information. Because micro-blogs are both popular and weakly supervised, they also carry sensitive information, such as content involving violence, terrorism, corruption and reactionary speech, which has a serious negative impact on social development and national stability. Research on detecting and tracking sensitive micro-blog topics can therefore not only help government departments grasp the dynamics of public opinion in time, but also support early warning of real-world emergencies. Because micro-blog text is limited in length and irregular in usage, traditional topic detection and tracking techniques cannot be applied to it directly. Building on those techniques, this thesis proposes a simple and effective scheme for detecting and tracking sensitive micro-blog topics. For sensitive topic detection, the traditional Single-Pass algorithm is improved to address two problems: the long-tail effect of micro-blog text and the algorithm's sensitivity to the order of the input data. The improved Single-Pass algorithm first discovers initial topic clusters, and the LDA topic model is then applied to these clusters. The probability distribution of the topic keywords in each cluster is computed, and the keywords with the highest probabilities are used to represent the topic. Finally, these keywords are matched against a predefined sensitive-word dictionary to identify sensitive topics. For topic tracking, two improvements are made to the KNN classification algorithm. They address, respectively, the sparse data and poor classification results that arise when micro-blog text is represented with the vector space model, and the imbalance of samples across categories. First, a convolutional neural network is used to extract features from micro-blog short text, and a KNN classifier then classifies the text. Second, a group similarity function is defined within the KNN algorithm. This function measures the similarity between the sample to be classified and the samples in each category, which avoids the effect of an unbalanced sample distribution on KNN classification and improves the accuracy of micro-blog topic tracking. The experimental results show that the improved Single-Pass algorithm is effective for topic detection. For topic tracking, the improved KNN algorithm shows some improvement over the traditional KNN algorithm on the performance indicators measured.
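For readers unfamiliar with Single-Pass clustering, the following is a minimal sketch of the traditional (baseline) algorithm that the abstract takes as its starting point. It is not the improved variant proposed in the thesis, whose details the abstract does not give. The whitespace tokenizer, the term-count vectors, the cosine similarity measure, the threshold value of 0.3 and the sample posts are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def single_pass(docs, threshold=0.3):
    """Baseline Single-Pass clustering (illustrative, not the thesis's variant).

    Each document is read once, in order. It joins the most similar
    existing cluster if the similarity reaches the threshold; otherwise
    it starts a new cluster. Returns one cluster index per document.
    """
    centroids = []  # summed term counts of each cluster
    labels = []
    for text in docs:
        vec = Counter(text.lower().split())  # assumed: whitespace tokens
        best, best_sim = -1, 0.0
        for i, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best >= 0 and best_sim >= threshold:
            centroids[best].update(vec)  # fold the document into the cluster
            labels.append(best)
        else:
            centroids.append(Counter(vec))  # open a new cluster
            labels.append(len(centroids) - 1)
    return labels


# Hypothetical sample posts, used only to exercise the function.
posts = [
    "flood warning issued for river town",
    "river flood warning extended",
    "new phone released today",
    "phone released with new camera",
]
labels = single_pass(posts)
print(labels)
```

Because each post is compared only with the clusters that exist when it arrives, the result can change if the posts are presented in a different order, which is the order sensitivity the abstract says the improved algorithm is meant to reduce.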
Keywords/Search Tags:Topic detection, Topic tracking, Text clustering, Sensitive topic, Text classification