
The Research And Implementation Of User Attribute Streaming Prediction Based On Multi-label Learning

Posted on: 2018-11-01
Degree: Master
Type: Thesis
Country: China
Candidate: J Su
Full Text: PDF
GTID: 2348330518496357
Subject: Information and Communication Engineering
Abstract/Summary:
The Internet has moved from the Web 1.0 era, in which users mainly obtained information, to the Web 2.0 era, in which users both consume and produce it. To help discover information and services in vast amounts of data, customized user profiles provide strong support and guidance for personalized search, personalized recommendation, advertising and marketing, and product strategy. User attribute prediction is the core task of user profile research. Current research on user attribute prediction focuses mainly on building prediction models for a single attribute and lacks a comprehensive multi-attribute prediction model. Furthermore, data stream mining and concept drift handling mechanisms are lacking in the user attribute prediction domain, so dynamic prediction of user attributes cannot be realized. In addition, existing concept drift research has limitations that need to be addressed. To address these problems, this thesis aims to construct a user attribute streaming prediction model with high efficiency and strong performance.

For attribute prediction, this thesis focuses on predicting multiple attributes at the same time. Building on multi-label learning, attribute prediction is treated as a generalized multi-label classification problem and handled with MIML (multi-instance multi-label learning). A clustering method is used to construct the instance concept of the user object. With these approaches, a model that predicts multiple attributes simultaneously can be constructed quickly and accurately.

In contrast to off-line prediction models, this thesis introduces an on-line streaming framework, based on data mining technology, to handle user-generated online behavior, with a focus on various types of concept drift. A prototype-based adaptive concept drift classification algorithm named SyncPrototype is proposed, which optimizes both the classification method and the construction and updating of prototypes. Experimental results show that SyncPrototype outperforms existing algorithms in classification performance, time performance and response rate, and is more effective at handling and adapting to drift in data streams. SyncPrototype supports the incremental, iterative updating of user attributes over the stream, enabling dynamic prediction and streaming iteration of user attributes.

Finally, the user attribute streaming prediction model based on multi-label learning is used to develop a data mining validation module for a user attribute authentication system, which can effectively verify the authenticity of the personal information of microblogging users and measure the credibility of each attribute.
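
The general idea of prototype-based classification on a drifting stream can be illustrated with a minimal Python sketch. This is not the SyncPrototype algorithm, whose details are not given in this abstract; it is a generic nearest-prototype classifier that keeps one prototype per label and moves it toward each new labelled example, so older data is gradually forgotten. The class name `NearestPrototypeStream`, the `learning_rate` parameter and the `prequential_accuracy` helper are all assumptions made for illustration.

```python
import math


class NearestPrototypeStream:
    """Toy nearest-prototype stream classifier (illustrative, not SyncPrototype)."""

    def __init__(self, learning_rate=0.2):
        # Assumed knob: larger values adapt to drift faster but are noisier.
        self.learning_rate = learning_rate
        self.prototypes = {}  # label -> prototype vector (list of floats)

    def predict(self, x):
        """Return the label of the closest prototype, or None before any training."""
        if not self.prototypes:
            return None
        return min(
            self.prototypes,
            key=lambda label: math.dist(self.prototypes[label], x),
        )

    def learn(self, x, label):
        """Move the prototype of `label` toward x (exponential forgetting)."""
        proto = self.prototypes.get(label)
        if proto is None:
            self.prototypes[label] = [float(v) for v in x]
            return
        lr = self.learning_rate
        self.prototypes[label] = [
            (1 - lr) * p + lr * v for p, v in zip(proto, x)
        ]


def prequential_accuracy(model, stream):
    """Test-then-train evaluation: predict each example first, then learn from it."""
    correct = total = 0
    for x, y in stream:
        if model.predict(x) == y:
            correct += 1
        total += 1
        model.learn(x, y)
    return correct / total if total else 0.0
```

Because each prototype is an exponentially weighted average of recent examples, a change in the relationship between features and labels pulls the prototypes to their new positions after a number of updates. A practical method would typically maintain several prototypes per class and an explicit drift-detection or prototype-replacement step, which this sketch omits.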
Keywords/Search Tags: user attribute, multi-label learning, data stream, concept drift