
Research And Application On Short Text Filtering Engine Based On Correlated Topic Model

Posted on: 2019-06-22 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Xue | Full Text: PDF
GTID: 2348330542998751 | Subject: Computer Science and Technology
Abstract/Summary:
With the widespread adoption of the Internet, we have entered an era of information revolution. Its most notable feature is the sheer volume of information, in many forms and with varied content. In e-commerce in particular, most data takes the form of short text, which poses three challenges for data mining research.

(1) Each short text in this domain describes only a single event or object and contains only one topic. Traditional topic mining models, such as the correlated topic model, are poorly suited to short text, so a new model or algorithm is needed to extract and summarize topics.
(2) Most e-commerce content is created by users, so it is mixed with personal viewpoints and emotional tendencies that affect the quality of the data. Users therefore have to be modeled in order to mine the underlying emotional tendencies.
(3) As the volume of data grows, the data mining algorithms need to be efficient and scalable.

This thesis addresses these challenges in three ways.

1. Taking the correlated topic model as the base model, we optimize its topic extraction and prediction functions for short text. We design the input and output of the algorithm and the criteria for evaluating its results, and then verify the two algorithms experimentally.
2. To mine opinion tendencies in short text data, we propose a comprehensive model that considers the relationships among the sender of a short text, the object it describes, and its content. The model combines key characteristics of the text content, characteristics of the described object, and user bias; constructs a vector space matrix for short text data; and effectively models user behavior. Based on this model, two short text processing algorithms are proposed that identify not only the basic attributes of short text data but also users' opinion tendencies. Both algorithms are verified on an e-commerce dataset.
3. We introduce a comprehensive utility into the short text data calculation and propose a comprehensive-utility evaluation algorithm for short text data. The algorithm improves the efficiency and quality of processing large-scale data, filters short texts, and recommends high-quality texts to users.
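
For readers unfamiliar with the base model, the following is a minimal sketch of the step that distinguishes a correlated topic model from simpler topic models: topic proportions are drawn from a logistic-normal distribution, so a covariance matrix can make topics co-occur. It is not the thesis's algorithm. The function names, the three-topic setup, and the values of `mu` and `cov` are all assumptions chosen for illustration.

```python
import math
import random


def cholesky(cov):
    """Return lower-triangular L such that L * L^T equals cov.

    cov must be a symmetric positive-definite matrix (list of lists).
    """
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def sample_topic_proportions(mu, cov, rng):
    """Draw eta ~ Normal(mu, cov), then map it to the simplex with softmax."""
    n = len(mu)
    L = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # eta = mu + L z is a correlated Gaussian draw.
    eta = [mu[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
    # Numerically stable softmax: subtract the maximum before exponentiating.
    m = max(eta)
    exps = [math.exp(e - m) for e in eta]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical setup: topics 0 and 1 are positively correlated, topic 2 is independent.
mu = [0.0, 0.0, 0.0]
cov = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
rng = random.Random(0)
theta = sample_topic_proportions(mu, cov, rng)
print(theta)  # three positive proportions that sum to 1
```

Because a short text usually carries a single topic, a draw like `theta` would be expected to concentrate on one component, which is one way to read the abstract's remark that the standard model fits short text poorly.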
Keywords/Search Tags:Correlated topic model, Topic extraction, Text opinion mining, Data filtering