The Research And Implementation Of User Attribute Streaming Prediction Based On Multi-label Learning

Posted on:2018-11-01

Degree:Master

Type:Thesis

Country:China

Candidate:J Su

Full Text:PDF

GTID:2348330518496357

Subject:Information and Communication Engineering

Abstract/Summary:

The Internet has moved from the Web 1.0 era, in which users mainly obtained information, to the Web 2.0 era, in which users both consume and produce information. To help discover information and services in vast amounts of data, customized user profiles provide strong support and guidance for personalized search, personalized recommendation, advertising and marketing, and product strategy. User attribute prediction is the core task of user profile research.

Current research on user attribute prediction focuses mainly on building prediction models for a single attribute and lacks comprehensive multi-attribute prediction models. In addition, data stream mining and concept drift handling mechanisms suited to the user attribute prediction domain are lacking, so dynamic prediction of user attributes cannot be realized. Existing research on concept drift also has limitations that need to be addressed. To address these problems, this thesis aims to construct a user attribute streaming prediction model that is both efficient and accurate.

For attribute prediction, this thesis focuses on predicting multiple attributes at the same time. Building on multi-label learning, attribute prediction is treated as a generalized multi-label classification problem and solved with multi-instance multi-label learning (MIML). A clustering method is applied in a novel way to construct the instances that represent each user object. With these approaches, a model that predicts multiple attributes simultaneously can be built quickly and accurately.

Unlike offline prediction models, this thesis introduces an online streaming framework, based on data mining technology, to process user-generated online behavior, with a focus on the various types of concept drift. A prototype-based adaptive concept drift classification algorithm named SyncPrototype is proposed, which improves on the classification method and on prototype construction and updating. Experimental results show that SyncPrototype outperforms existing algorithms in classification performance, time performance, and response rate, and that it is more effective at handling and adapting to drift in data streams. SyncPrototype supports the incremental, streaming iteration of user attributes, which enables dynamic prediction of user attributes.

Finally, the user attribute streaming prediction model based on multi-label learning is used to develop a data mining validation module for a user attribute authentication system, which can effectively verify the authenticity of microblog users' personal information and measure the credibility of their attributes.
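The general idea of a prototype-based classifier that adapts to concept drift can be illustrated with a minimal sketch. This is not SyncPrototype, whose details are not given in the abstract. It assumes one prototype per label, updated by an exponential moving average so that older data gradually loses influence. The class name, the function names, and the `learning_rate` parameter are all hypothetical.

```python
import math


class PrototypeStreamClassifier:
    """Nearest-prototype classifier for a labeled stream (illustrative only).

    Assumption: one prototype per label, moved toward each new example by an
    exponential moving average. This is a generic technique, not the thesis's
    SyncPrototype algorithm.
    """

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate
        self.prototypes = {}  # label -> list of feature values

    @staticmethod
    def _distance(a, b):
        # Euclidean distance between two equal-length feature vectors.
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

    def predict(self, x):
        # Return the label of the closest prototype, or None before any training.
        if not self.prototypes:
            return None
        return min(
            self.prototypes,
            key=lambda label: self._distance(x, self.prototypes[label]),
        )

    def learn(self, x, label):
        # Create the prototype on first sight, otherwise move it toward x.
        proto = self.prototypes.get(label)
        if proto is None:
            self.prototypes[label] = list(x)
            return
        lr = self.learning_rate
        self.prototypes[label] = [(1 - lr) * p + lr * xi for p, xi in zip(proto, x)]


def prequential_accuracy(model, stream):
    """Test-then-train evaluation: predict each example before learning from it."""
    correct = total = 0
    for x, y in stream:
        prediction = model.predict(x)
        if prediction is not None:
            total += 1
            correct += int(prediction == y)
        model.learn(x, y)
    return correct / total if total else 0.0


# Synthetic stream with an abrupt drift: the two labels swap regions halfway.
before_drift = [((0.0, 0.0), "a"), ((4.0, 4.0), "b")] * 10
after_drift = [((4.0, 4.0), "a"), ((0.0, 0.0), "b")] * 10

model = PrototypeStreamClassifier(learning_rate=0.5)
accuracy = prequential_accuracy(model, before_drift + after_drift)
```

Because each prototype tracks recent examples, the model makes a few errors right after the drift and then recovers. A larger `learning_rate` adapts faster but makes the prototypes more sensitive to noise.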

Keywords/Search Tags:

user attribute, multi-label learning, data stream, concept drift

Related items

1	Concept Drift Detection Algorithm Based On Multi-label Learning With Label Special Features
2	Research On Multi-label Data Stream Classification Method Based On Kernel Extreme Learning Machine
3	Research On Class Incremental Learning And Concept Drift Detection In Multi-label Data Streams Classification
4	Research On Concept Drift Data Stream Classification Based On Ensemble Learning
5	Research On Multi-label Data Stream Semi-supervised Integrated Classification Method Based On Cooperative Training
6	Research On Data Stream Classification Method Based On Concept Drift Detection
7	Research On Ensemble Classification Algorithms Of Data Stream Based On Concept Drift
8	Research On Classification Algorithms For Imbalanced Data Stream With Concept Drift
9	Research On Incremental Learning Algorithm And Application For Data Stream With Concept Drift
10	Research On Classification Of Multi-Label Data Streams