Research on Social Media Text Mining: Algorithms and Applications

Posted on: 2018-07-18
Degree: Master
Type: Thesis
Country: China
Candidate: D Shang
Full Text: PDF
GTID: 2348330512997191
Subject: Computer science and technology
Abstract/Summary:
Social networks are a product of the rapid development of the Internet. Microblog platforms such as Twitter collect a rich stream of messages every day, and these messages can be used in many kinds of Internet applications. Text mining is a popular and important topic in social network research. Traditional text mining methods work well on standard news text, for which NLP tools provide reliable pre-processing results. However, these approaches perform poorly on microblog data, which differs from news data in several ways: the texts are short, the expressions are nonstandard, and there are many unknown words. These features bring both new opportunities and new challenges. To address them, we propose new methods for two important tasks in social media analysis: user tag extraction and event detection.

In the new media era, users post messages on social media platforms such as microblogs to record their daily lives and express their opinions. Tagging users based on user-generated content has recently become an attractive topic. Tags for a microblog user describe his or her interests, concerns, or occupational characteristics, and play an important role in user indexing, personalized recommendation, and similar applications. Previous work applies keyword extraction methods to represent the interests of users. However, keyword extraction struggles to give accurate results when the data is sparse and noisy. In this work, we propose a novel method to tag users. First, we apply feature selection via a sparse classifier to generate preliminary tags for users. We then apply a feature selection method again to extend the tags. Finally, we refine the tags with a reranking strategy. We conduct our experiments on data from Sina Weibo, the most popular Chinese microblog platform. The experimental results show that our method significantly outperforms other methods.

Event detection on Twitter is an attractive and difficult task. Existing methods mainly rely on word co-occurrence or the topic distribution of tweets to detect events. Few of them consider the time-series information in the text stream. In this work, we propose a novel multi-view clustering model for event detection on Twitter that takes into account both topic information and time-series information. First, we build a topic similarity matrix using a topic model and a time-series similarity matrix using wavelet analysis. Then, multi-view clustering algorithms are used to group keywords. Finally, each cluster of keywords is taken to represent an event. The experiments show that our method achieves better performance than related work.
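
As a rough illustration of the event-detection pipeline described above, the following Python sketch builds a topic similarity matrix and a time-series similarity matrix for a handful of keywords, combines the two views, and groups the keywords into events. It is a minimal sketch under several assumptions, not the implementation used in the thesis: the topic vectors are hand-written stand-ins for topic-model output, a single-level Haar transform stands in for the wavelet analysis, and averaging the two matrices followed by threshold-based single-link grouping stands in for the multi-view clustering algorithm. All names (`cluster_keywords`, `haar_details`, `threshold`) and the toy data are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def haar_details(series):
    """Single-level Haar detail coefficients; a trailing unpaired sample is dropped."""
    return [(series[i] - series[i + 1]) / math.sqrt(2)
            for i in range(0, len(series) - 1, 2)]


def similarity_matrix(vectors):
    """Pairwise cosine similarity matrix for a list of vectors."""
    n = len(vectors)
    return [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]


def cluster_keywords(keywords, topic_vecs, counts, threshold=0.6):
    """Group keywords whose combined (topic + time-series) similarity is high."""
    # View 1: similarity of (assumed) per-keyword topic distributions.
    topic_sim = similarity_matrix(topic_vecs)
    # View 2: similarity of wavelet detail coefficients of keyword frequencies.
    time_sim = similarity_matrix([haar_details(c) for c in counts])

    n = len(keywords)
    # Assumption: the two views are fused by a plain average.
    combined = [[(topic_sim[i][j] + time_sim[i][j]) / 2 for j in range(n)]
                for i in range(n)]

    # Single-link grouping with union-find: link any pair above the threshold.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if combined[i][j] >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i, word in enumerate(keywords):
        groups.setdefault(find(i), []).append(word)
    return list(groups.values())


# Toy data (hypothetical): 2-topic distributions and 8-slot frequency counts.
keywords = ["earthquake", "tsunami", "concert", "tickets"]
topic_vecs = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
counts = [
    [1, 9, 8, 2, 1, 1, 1, 1],  # bursts early
    [1, 8, 9, 1, 1, 1, 1, 1],  # bursts early
    [1, 1, 1, 1, 2, 9, 7, 1],  # bursts late
    [1, 1, 1, 1, 1, 8, 9, 2],  # bursts late
]

events = cluster_keywords(keywords, topic_vecs, counts)
print(events)
```

On this toy input, the keywords that share both a topic and a burst pattern end up together, so each of the two resulting groups can be read as one event. In practice the fusion step and the clustering algorithm would be replaced by a proper multi-view clustering method, and the threshold would need tuning on real data.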
Keywords/Search Tags: Text Mining, Social Media Analysis, Microblog, Keyphrase Extraction, Sparse Model, Feature Selection, Event Detection, Multi-view Clustering