Design And Implementation Of A Unified Analysis Platform For User Feature Data Based On Hadoop

Posted on: 2015-05-01
Degree: Master
Type: Thesis
Country: China
Candidate: B H Ye
Full Text: PDF
GTID: 2298330422992351
Subject: Software engineering
Abstract/Summary:
In recent years, as the number of Internet users has grown, the volume of user behavior information has increased rapidly. According to statistics from a company that focuses on user behavior analysis, before selecting a product a user on average browses 5 websites and 36 pages and interacts with search engines or social media dozens of times. Analyzing and understanding this information makes it possible to develop services and personalized recommendations for customers, which benefits both the company and its users. At present, more and more applications are being developed for user feature data analysis, which leads to scattered data and duplicated work, and there is no unified scheme for solving these problems. Developing a user feature data analysis platform has therefore become increasingly necessary.

This thesis first introduces the state of research on data processing with Hadoop, then reviews current research on user feature data, which provides the necessary reference for developing the system, and introduces applications related to user feature data analysis. The requirement analysis stage presents the business, functional, and non-functional requirements. Based on these requirements, the platform is designed in detail and divided by function into four modules: data acquisition, preprocessing, model establishment, and query. The data acquisition module supports several acquisition methods. The preprocessing module provides a preprocessing framework in which the operation sequence and the content of each operation can be defined on demand, making preprocessing more convenient and flexible. The platform supports encapsulation, segmentation, feature extraction, weight calculation, data formatting, and other preprocessing operations.
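The idea of a preprocessing framework with a configurable operation sequence can be illustrated with a small sketch. It is not the thesis's implementation: the operation names, the registry, the record layout, and the whitespace tokenizer (standing in for real word segmentation) are all assumptions made for illustration, and the sketch runs in a single process rather than on Hadoop.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Operation = Callable[[Record], Record]


def segment(record: Record) -> Record:
    # Hypothetical segmentation step: lowercase and split on whitespace.
    return {**record, "tokens": str(record["text"]).lower().split()}


def extract_features(record: Record) -> Record:
    # Hypothetical feature extraction: count occurrences of each token.
    counts: Dict[str, int] = {}
    for token in record["tokens"]:
        counts[token] = counts.get(token, 0) + 1
    return {**record, "features": counts}


def weight(record: Record) -> Record:
    # Hypothetical weight calculation: relative frequency of each feature.
    total = sum(record["features"].values())
    weights = {t: c / total for t, c in record["features"].items()} if total else {}
    return {**record, "weights": weights}


# Assumed registry that maps configured step names to operations.
REGISTRY: Dict[str, Operation] = {
    "segment": segment,
    "extract_features": extract_features,
    "weight": weight,
}


def build_pipeline(step_names: List[str]) -> Operation:
    """Compose the named operations in the configured order."""
    steps = [REGISTRY[name] for name in step_names]

    def run(record: Record) -> Record:
        for step in steps:
            record = step(record)
        return record

    return run
```

A caller would choose the sequence per application, for example `build_pipeline(["segment", "extract_features", "weight"])`, and apply the returned function to each record. Keeping each operation a pure function from record to record is one way to let steps be reordered or omitted without touching the others.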
In the model establishment stage, the platform implements the support vector machine and naive Bayes classification algorithms and generates models through training, which are then used for prediction. The feature query module provides a remote calling interface and implements an LRU buffer, which improves platform performance. To demonstrate the usability of the platform, applications for user gender recognition, age recognition, and spending-ability recognition were developed on top of it. In the testing phase, a test scheme was drawn up and test cases were run against the functions of each module. Finally, the recognition applications were evaluated, and their accuracy met the requirements. The thesis concludes by summarizing the problems the platform solves and directions for future improvement.

At present, the platform has been put into use, providing analysis of user feature data for different applications.
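The LRU buffer mentioned for the feature query module can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's code: the class name `LRUCache`, the `loader` callback (standing in for the backing feature store), and keying by user ID are all hypothetical.

```python
from collections import OrderedDict
from typing import Any, Callable


class LRUCache:
    """Least-recently-used cache in front of a slower feature lookup."""

    def __init__(self, capacity: int, loader: Callable[[str], Any]) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._loader = loader  # assumed callback that queries the backing store
        self._entries: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, user_id: str) -> Any:
        if user_id in self._entries:
            # Cache hit: mark the entry as most recently used.
            self._entries.move_to_end(user_id)
            return self._entries[user_id]
        # Cache miss: load from the backing store and remember the result.
        value = self._loader(user_id)
        self._entries[user_id] = value
        if len(self._entries) > self._capacity:
            # Evict the least recently used entry (front of the ordering).
            self._entries.popitem(last=False)
        return value
```

Repeated queries for the same users are then served from memory, and only misses reach the backing store, which is the kind of saving an LRU buffer is meant to provide.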
Keywords/Search Tags: Hadoop, user feature data, feature analysis, recognition application, feature query