Research On Microblog Retrieval And Microblog Push

Posted on: 2019-06-14
Degree: Master
Type: Thesis
Country: China
Candidate: S Li
Full Text: PDF
GTID: 2428330548995781
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the mobile Internet industry, modern social media, represented by Microblog, have emerged as an important way for people to collect information. Research on Microblog retrieval and Microblog push has attracted attention from both academia and industry. Microblog data can be divided into three major parts: historical Microblogs, Microblogs posted on the same day, and Microblogs being published in real time. Accordingly, there are three data mining approaches for collecting the Microblogs a user is interested in from these three kinds of data: Microblog retrieval, and two forms of Microblog push, namely the daily email digest and real-time push. Microblog retrieval is the task of searching historical Microblogs for posts related to the user's information need and presenting the results ranked by relevance to the user's interests. The daily email digest filters the Microblogs posted on the same day for posts related to the user's topics of interest and delivers them directly to the user's email. Real-time push filters Microblogs as they are published, selects the posts that match the user's interests, and pushes them to the user's smartphone in real time. This thesis studies these three approaches, with the specific details as follows:

1) Microblog retrieval. Traditional information retrieval models can suffer from serious word mismatch problems because Microblogs are short texts. This thesis proposes a retrieval method based on a reference document model. In this method, Microblog data and URL documents, which carry richer information, are each introduced as reference documents, and feedback techniques are used to map queries and Microblogs onto the reference documents. This yields more accurate estimates of the query model and the Microblog model, and explores new theory for retrieving short texts such as Microblogs. Experiments on TREC 2011 and 2012 data show that this method is statistically significantly more effective than the baseline model and improves the performance of Microblog retrieval.

2) Microblog daily email digest. Existing methods neglect the user's attention to the web pages linked by URLs in Microblogs. To address this problem, a daily email digest method that merges links is proposed. When calculating the relevance between a Microblog and the user's interest, the method considers how well both the Microblog itself and the web pages its URLs link to satisfy the user's information need, which greatly improves performance on the daily email digest task. Experiments on TREC 2017 data show that this method outperforms multiple baseline methods. Among the 15 participating research institutions, including Peking University, the University of Delaware, and the Philips Institute of Artificial Intelligence in North America, the daily email digest method ranked first on nDCG@10-p, the primary evaluation metric, and exceeded the second-place result by more than 5%. These results demonstrate the effectiveness and state-of-the-art performance of the method.

3) Microblog real-time push. According to the user's interests, Microblogs can be assigned one of three relevance levels: highly relevant, relevant, and not relevant. Traditional methods based on binary classification fail to distinguish these degrees of relevance between Microblogs and the user's interests. This thesis therefore transforms the Microblog real-time push task into a learning to rank problem and proposes a real-time push method based on learning to rank. Experiments on TREC 2017 data show that this method outperforms the baseline methods. Among the same 15 participating research institutions, the real-time push method ranked first on EG-p and exceeded the second-place result by more than 9%. These results demonstrate the effectiveness and state-of-the-art performance of the method.
Keywords/Search Tags:Microblog retrieval, Email digest, Microblog push, Real-Time push, Microblog
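
The abstract reports digest quality with an nDCG@10-based metric over three relevance levels. As a rough illustration of what such a metric rewards, the following is a minimal sketch of plain nDCG@k with graded gains. The gain values (2 for highly relevant, 1 for relevant, 0 for not relevant), the function names, and the choice to compute the ideal ranking from a supplied pool of judged gains are all assumptions made for this example; the nDCG@10-p variant used in the TREC evaluation has additional rules that are not reproduced here.

```python
import math

# Assumed gain mapping for the three relevance levels named in the abstract.
GAIN = {"highly_relevant": 2, "relevant": 1, "not_relevant": 0}


def dcg_at_k(gains, k=10):
    """Discounted cumulative gain over the first k gains (rank 1 has discount log2(2) = 1)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_gains, judged_gains=None, k=10):
    """Plain nDCG@k.

    ranked_gains: gains of the posts in the order the system delivered them.
    judged_gains: gains of all judged posts for the topic; if omitted, the
                  ideal ranking is built from ranked_gains alone (a simplification).
    """
    pool = ranked_gains if judged_gains is None else judged_gains
    ideal = dcg_at_k(sorted(pool, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant post exists, so the score is defined as 0 here
    return dcg_at_k(ranked_gains, k) / ideal


if __name__ == "__main__":
    labels = ["relevant", "highly_relevant", "not_relevant"]
    print(ndcg_at_k([GAIN[label] for label in labels]))
```

A digest that places highly relevant posts first scores higher than one that delivers the same posts in a worse order, which is why a ranking formulation suits graded relevance better than a binary relevant/not-relevant decision.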