
The Research And Application On Microblog Automated Classification

Posted on: 2013-08-18
Degree: Master
Type: Thesis
Country: China
Candidate: B Jiang
Full Text: PDF
GTID: 2268330392468452
Subject: Management Science and Engineering
Abstract/Summary:
Microblogging has emerged in recent years as a new platform for online communication and information sharing, and the number of registered microblogging users in China has exceeded 300 million. Because of the way microblog posts spread and the speed at which content is created, every user faces information overload. The posts on each user's home page are unordered, and the platform itself offers no automated classification, so users cannot immediately see the microblog posts that concern and interest them most. This paper studies microblog text classification and, on that basis, analyzes how classification can be applied in combination with users' interests.
First, this paper reviews domestic and international research on text classification, summarizes and compares the similarities and differences between microblog classification and general text classification, and identifies a method for solving the existing problems of microblog text classification. Second, by observing microblog text data, I summarized the types, structure, and language characteristics of microblog text. On this basis, I defined the relevant elements of microblog text, built a model of microblog text, designed a strategy for data collection and storage, and selected a word segmentation method for microblog text.
Third, I built a microblog text classification system by analyzing the distribution of categories across four portals and the text category system of the Sina microblogging platform. I then built a characteristic pattern library from web texts whose categories are the same as those of the microblog text system, and designed a method that removes repeated words to optimize the library. Finally, I proposed a microblog text classification algorithm that matches the feature words of a microblog text against the feature words in the characteristic pattern library, so that the category of a microblog text can be identified automatically. I verified the effectiveness and feasibility of the algorithm on microblog text data and studied microblog classification in combination with users' interests.
The results of this paper should be of value to users and should encourage the companies that operate microblogging platforms to update their technology so that they can better serve users and society.
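The feature-word matching idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the sample categories and words, the reading of "repeated words" as words shared by more than one category, and the scoring rule (count of matching feature words) are all assumptions made for illustration.

```python
from collections import Counter


def build_pattern_library(category_words):
    """Build a hypothetical pattern library: category -> set of feature words.

    Assumption: a "repeated word" is one that appears in more than one
    category, so it cannot discriminate between categories and is dropped.
    """
    counts = Counter()
    for words in category_words.values():
        counts.update(set(words))  # count each word once per category
    return {
        category: {w for w in set(words) if counts[w] == 1}
        for category, words in category_words.items()
    }


def classify(tokens, library):
    """Return the category whose feature words match the most tokens.

    Returns None when the library is empty or no token matches.
    """
    scores = {
        category: sum(1 for t in tokens if t in features)
        for category, features in library.items()
    }
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Illustrative data only; a real library would be built from segmented web texts.
library = build_pattern_library({
    "sports": ["match", "goal", "team", "news"],
    "finance": ["stock", "market", "bank", "news"],
})
label = classify(["team", "scored", "goal"], library)
```

In this sketch the word "news" is removed from both categories because it is shared, and a segmented post is assigned to whichever category it overlaps most. Ties and weighting of feature words are left out for brevity.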
Keywords/Search Tags: microblogging user interest, microblogging text classification, characteristic pattern library