
User Interest Model And Analysis Based On Data Management Platform

Posted on: 2016-07-05
Degree: Master
Type: Thesis
Country: China
Candidate: T S Zhan
Full Text: PDF
GTID: 2298330452966402
Subject: Computer Science and Technology
Abstract/Summary:
With the development of information technology, Internet users are no longer satisfied with old habits such as net surfing and online chatting; they expect recommendation services that match their interests, hobbies, character and behavior. Technologies around the Data Management Platform (DMP) and the User Interest Model have emerged in response, and have been widely researched and applied in recent years.

As one of the core components of a Data Management Platform, a User Interest Model aims to analyze users' interests and behavior from their massive search data. The effectiveness and usability of a Data Management Platform therefore depend largely on the accuracy of its User Interest Model.

This paper proposes a User Interest Model based on a Data Management Platform, and a User Interest Analysis System based on that model, both built on users' massive search data. Because of the large volume of user data, the MapReduce programming model under the Hadoop distributed framework and the Hive data warehouse are used to implement the User Interest Analysis System and to store its input and output data. In summary, the results of this work are mainly reflected in the following aspects:

1) The user interest weight list is computed by recursive backtracking, combining the TF-IDF algorithm with the Vector Space Model (VSM) over users' search data and the classification data of an e-commerce site, so that the User Interest Model can be built.
2) A forgetting mechanism is introduced to update the User Interest Model dynamically, which addresses the problem that users' interests change over time.
3) The MapReduce programming model under the Hadoop distributed framework is used to implement the User Interest Analysis System, which addresses the long running time of the proposed system.
4) Precision and Recall are used together as the two criteria for evaluating the performance of the User Interest Analysis System. The system is implemented in both a stand-alone environment and a Hadoop distributed environment to compare its time performance. Experimental results show that the User Interest Analysis System is both practical and useful.
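The abstract does not give the formulas behind these steps, so the following is only a minimal single-machine sketch of the ideas it names: TF-IDF weights stored as a sparse VSM vector per user, a forgetting step, and Precision/Recall. The function names, the treatment of each user's search history as one "document", and the exponential half-life decay are all assumptions made for illustration, not the thesis's actual method.

```python
import math
from collections import Counter


def tf_idf_profiles(user_queries):
    """Build one sparse interest vector per user (hypothetical helper).

    user_queries maps a user id to the list of category terms taken from
    that user's searches. Each user's history is treated as one document.
    """
    n_users = len(user_queries)
    # Document frequency: number of users whose history contains the term.
    df = Counter()
    for terms in user_queries.values():
        df.update(set(terms))

    profiles = {}
    for user, terms in user_queries.items():
        counts = Counter(terms)
        total = len(terms)
        # weight = term frequency * log(inverse document frequency)
        profiles[user] = {
            term: (count / total) * math.log(n_users / df[term])
            for term, count in counts.items()
        }
    return profiles


def decay(weights, days_elapsed, half_life_days=30.0):
    """Assumed forgetting rule: every weight halves each half_life_days."""
    factor = 0.5 ** (days_elapsed / half_life_days)
    return {term: weight * factor for term, weight in weights.items()}


def precision_recall(predicted, relevant):
    """Precision and Recall of predicted interests against known ones."""
    predicted, relevant = set(predicted), set(relevant)
    hits = len(predicted & relevant)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    queries = {
        "u1": ["shoes", "shoes", "books"],
        "u2": ["books", "phones"],
    }
    profiles = tf_idf_profiles(queries)
    print(profiles["u1"])
    print(decay(profiles["u1"], days_elapsed=30))
    print(precision_recall(["shoes", "books"], ["shoes", "phones"]))
```

In this sketch a term searched by every user gets an IDF of zero and so drops out of the profile, which is the usual TF-IDF behavior. In a MapReduce setting the per-user counting would naturally fall in the map phase and the document-frequency aggregation in the reduce phase, but that split is also an assumption here.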
Keywords/Search Tags: Data Management Platform (DMP), User Interest Model, Hadoop, TF-IDF, Vector Space Model (VSM)