
Quality Modeling Based Microblogging Filtering

Posted on: 2017-07-07; Degree: Master; Type: Thesis
Country: China; Candidate: L Y Liu; Full Text: PDF
GTID: 2348330503992903; Subject: Computer Science and Technology
Abstract/Summary:
Microblogging services that grew out of social interaction, including Twitter, Facebook, LinkedIn, and Weibo, are gradually becoming predominant in network information streams. Social media is widely used by people to act, react, and share whatever is happening around them. However, with the rapid growth in the number of microblog participants, users face the dilemma of “a shortage of knowledge and an overload of information”. Microblogging filtering techniques can help users filter out irrelevant and spam information while providing relevant information that matches their stated interests. It has been shown that modeling the microblogging filter as a simple information retrieval model can hardly improve its performance. The filter also faces the problem of extreme word sparsity, which results from the short text of microblogs. We therefore regard microblogging filtering as a task that optimizes the ranking of relevant microblogs according to their content, i.e., microblog quality. On this basis, we present a novel framework for microblogging filtering. The main contributions are as follows.

First, we introduce the design of the Quality-based Microblogging Filtering System. As mentioned above, we model the microblogging filter as an optimization task that refines the ranking of relevant microblogs according to their content quality, which gives rise to a novel framework for microblogging filtering. The framework of the proposed system and the design of its key modules are discussed in detail.

Second, we explore an optimized low-rank representation of microblog content under homophily coefficient constraints. Given the informality of microblogging, the content features of microblogs tend to form an extremely sparse, high-dimensional matrix, which makes research and analysis difficult. To remedy this problem, we factorize the “content-feature” matrix of the microblogs under the homophily coefficient constraint, and then generate an optimized representation by combining the factorized content features with the retrieval features.

Third, to reduce the complexity of the evaluation model and minimize classification risk, we propose a microblog quality evaluation model constrained by sparse feature selection. The model evaluates the relevant microblogs and then re-ranks them according to their quality scores. The sparse feature selection constraint helps reduce the correlation among variables in the quality evaluation function, so it is applied to optimize the modeling process.

Finally, to examine the effectiveness of the proposed algorithm, several experiments were conducted on the TREC KBA corpora. The experimental results show a clear improvement in the average NDCG score.
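
As a rough illustration of the re-ranking and evaluation steps described in the abstract, the sketch below sorts candidate microblogs by a quality score and measures the resulting ranking with NDCG. It is not the thesis's implementation: the function names, the tuple layout `(post_id, quality_score, relevance_label)`, the toy data, and the exponential-gain form of DCG are all assumptions chosen for the example.

```python
import math


def dcg(gains):
    """Discounted cumulative gain of graded relevance labels given in ranked order.

    Uses the common exponential-gain form (2^g - 1) / log2(rank + 1);
    this choice of formula is an assumption, not taken from the thesis.
    """
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg(gains, k=None):
    """DCG of the first k items divided by the DCG of the ideal ordering."""
    k = len(gains) if k is None else k
    ideal = dcg(sorted(gains, reverse=True)[:k])
    return dcg(gains[:k]) / ideal if ideal > 0 else 0.0


def rerank_by_quality(posts):
    """Sort hypothetical (post_id, quality_score, relevance_label) tuples
    so that the highest quality score comes first."""
    return sorted(posts, key=lambda p: p[1], reverse=True)


# Toy data: the quality scores stand in for the output of a quality model.
posts = [("a", 0.2, 0), ("b", 0.9, 2), ("c", 0.5, 1)]

ndcg_before = ndcg([label for _, _, label in posts])
reranked = rerank_by_quality(posts)
ndcg_after = ndcg([label for _, _, label in reranked])
```

In this toy case the quality scores happen to agree with the relevance labels, so re-ranking yields the ideal ordering; with a real quality model the gain in NDCG depends on how well the scores track relevance.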
Keywords/Search Tags: Microblogging filtering, Quality model, Matrix factorization, Homophily coefficient constraint, Sparse feature constraint