
Research On Topic Discovery Method For Social Network

Posted on: 2016-01-19
Degree: Master
Type: Thesis
Country: China
Candidate: J Liu
GTID: 2298330452965373
Subject: Control Science and Engineering
Abstract/Summary:
With the development of Internet technology in recent years, the Internet has made it easier for people to communicate with each other. Social networks are now the fastest growing part of the Internet. The most typical example is the microblog, which lets people express their views whenever and wherever they like via mobile phones and computers. Finding topics in microblogs is of practical significance because many hot social topics begin on microblogs. This thesis addresses hot topic discovery in social networks and proposes a classification-based topic detection method. Its advantages are improved topic detection for short texts and a better way of expressing detection results. The main work is as follows:

Firstly, a classification-based topic discovery method is designed. It makes up for a deficiency of the original topic detection method, which, when applied to social network text, easily confuses different topics that share the same keyword. The method includes the following steps: microblog data collection, data preprocessing, text classification, improved topic discovery, and topic expression. Text classification and the improved topic expression are absent from the original topic discovery process.

Secondly, we use the LDA model to improve the accuracy of topic discovery, in three steps: Chinese word segmentation, LDA topic modeling, and topic clustering. For Chinese word segmentation, this thesis improves segmentation accuracy with a new-word detection module.

Thirdly, we try to find hot topics in specific domains, using an ontology to crawl the training corpus, and then extend this method to topic discovery across a wide range of domains. The thesis adds classification to the traditional topic discovery process in order to improve the accuracy of topic detection. A central sentence calculation method, which uses central sentences and microblog content, is also studied; it changes the original topic-word representation into a complete sentence directly.

Experimental results on real-time data from Sina Weibo show that the proposed approach improves on the accuracy of the original topic discovery method. The method is able to find potential central topic sentences in microblog text. The system developed in this thesis has practical value and scalability.
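
The idea of expressing a topic as a complete sentence rather than a word list can be illustrated with a small sketch. The abstract does not describe how the thesis computes the central sentence, so what follows is only an assumption: it picks the post whose bag-of-words vector has the highest cosine similarity to the summed term counts of its cluster. The function names (`bag_of_words`, `cosine`, `central_sentence`) and the sample posts are hypothetical, and whitespace tokenization stands in for the Chinese word segmentation a real system would need.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Term counts from lowercase whitespace tokens (stand-in for a real segmenter)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def central_sentence(posts):
    """Return the post most similar to the summed term counts of the cluster.

    Assumption: this centroid-based choice is illustrative, not the
    thesis's actual calculation method.
    """
    vectors = [bag_of_words(p) for p in posts]
    centroid = Counter()
    for v in vectors:
        centroid.update(v)
    scores = [cosine(v, centroid) for v in vectors]
    best = max(range(len(posts)), key=scores.__getitem__)
    return posts[best]


# Hypothetical cluster of posts already grouped under one topic.
cluster = [
    "heavy rain floods the city subway",
    "subway closed after heavy rain",
    "heavy rain floods subway stations in the city",
    "my cat is sleeping",
]
print(central_sentence(cluster))
```

Under these assumptions, the off-topic post scores lowest because it shares no terms with the others, and the selected post can be read directly as a topic description where a topic-word model would give only a list of keywords.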
Keywords/Search Tags: Social Network, Topic Detection, Latent Semantic Analysis, Classification