Research On Topic-oriented Authoritative Information Retrieval Model In Microblog Site

Posted on: 2014-08-29
Degree: Master
Type: Thesis
Country: China
Candidate: P Yang
Full Text: PDF
GTID: 2268330392473537
Subject: Computer technology
Abstract/Summary:
With the continuous development of social network services, a large number of social networking sites have appeared around the world, such as Facebook, Twitter, and LinkedIn in the USA, and Sina Weibo, Tencent Weibo, Netease Weibo, and Renren in China. The emergence of these sites has changed the way people get information. In the past, people were only "information consumers" who obtained information that others had produced on the Internet; now they have also become "information producers" who can both read others' information and publish their own. Taking microblog sites as an example, users with different backgrounds can post their own information at any time through the platform, and millions of microblogs are produced every day. Faced with microblogs that are so numerous, change so dynamically, and are so time-sensitive, finding authoritative information on the topics a user is interested in has become a challenge. For these reasons, research on a topic-oriented authoritative information retrieval model for microblog sites has both theoretical and practical significance.

Building on previous research and taking microblog sites as the research object, this dissertation studies the retrieval of topic-oriented authoritative information on microblog sites, targeting the inherent sparsity and strong timeliness of microblogs.
Because each microblog is limited to 140 characters, data sparseness in microblog information is inevitable. To address the inherent sparsity and strong timeliness of microblogs, this dissertation first presents a method for extracting the implicit theme of a microblog, which effectively eases the sparsity problem of short microblog text. Second, a two-stage clustering algorithm, which speeds up search, is applied to microblog sites to group information by topic. Finally, an efficient ranking model is proposed. The ranking model combines the KL-divergence language model with a time factor to rank microblogs, uses two-stage pseudo relevance feedback to expand the original query, and uses the URLs embedded in microblogs to expand the original document model. A series of experiments was conducted on real datasets from Sina Weibo. The experimental results show that the implicit theme model considerably alleviates the data sparseness problem, and that the ranking model of authoritative information performs better in real-time information search.
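
The abstract does not give the ranking formula, so the following is only a minimal sketch of how a KL-divergence language model score might be combined with a time factor. The Dirichlet smoothing, the add-one collection model, the exponential recency prior, and every name and parameter here (`kl_time_score`, `mu`, `half_life_hours`) are assumptions made for illustration, not the dissertation's actual method.

```python
import math
from collections import Counter


def tokenize(text):
    """Whitespace tokenizer; a real system would need word segmentation."""
    return text.lower().split()


def kl_time_score(query_tokens, doc_tokens, coll_counts, coll_len,
                  age_hours, mu=100.0, half_life_hours=24.0):
    """Score a microblog for a query (higher is better).

    The first part is sum_w p(w|q) * log p(w|d), which ranks documents
    in the same order as the negative KL divergence KL(q || d).
    The second part is the log of an assumed exponential recency prior.
    """
    q_counts = Counter(query_tokens)
    q_len = sum(q_counts.values())
    d_counts = Counter(doc_tokens)
    d_len = len(doc_tokens)
    vocab_size = len(coll_counts)

    score = 0.0
    for term, qf in q_counts.items():
        p_q = qf / q_len
        # Add-one smoothed collection model (assumption), never zero.
        p_c = (coll_counts.get(term, 0) + 1) / (coll_len + vocab_size + 1)
        # Dirichlet-smoothed document model (assumption).
        p_d = (d_counts.get(term, 0) + mu * p_c) / (d_len + mu)
        score += p_q * math.log(p_d)

    # Log of exp(-decay * age): older posts are penalized linearly.
    decay = math.log(2) / half_life_hours
    return score - decay * age_hours


# Toy corpus: (text, age in hours). Purely illustrative data.
posts = [
    ("sina weibo earthquake news update", 1.0),
    ("sina weibo earthquake news update", 48.0),
    ("lunch photo with friends today", 1.0),
]
coll_counts = Counter(t for text, _ in posts for t in tokenize(text))
coll_len = sum(coll_counts.values())
query = tokenize("earthquake news")

ranked = sorted(
    posts,
    key=lambda p: kl_time_score(query, tokenize(p[0]), coll_counts,
                                coll_len, p[1]),
    reverse=True,
)
```

In this sketch the recency term is simply added to the language model score, so a shorter `half_life_hours` makes freshness count for more relative to topical match. Query expansion through pseudo relevance feedback and document expansion through embedded URLs are not modeled here.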
Keywords/Search Tags: Microblog Site, Implicit Theme, Clustering, Sort Model of Authoritative Information, Pseudo Relevance Feedback