Design And Implementation Of User Portrait System Based On Spark

Posted on:2022-07-11

Degree:Master

Type:Thesis

Country:China

Candidate:Y Hou

GTID:2518306722472884

Subject:Master of Engineering

Abstract/Summary:

With the rapid development of the Internet, the amount of information in the world is growing explosively, and Internet enterprises face huge volumes of data every day. How to extract, process, analyze and use this seemingly disordered and redundant data, so as to maximize its value, build user portraits from it, and support recommendation or advertising systems in precise recommendation and ad delivery, has become an important problem that enterprises cannot ignore. How well an enterprise handles the user portrait problem greatly affects its revenue and its room for growth in this field. This thesis studies the design and implementation of a user portrait system based on the Spark framework. The system not only provides data support for advertising delivery and precise recommendation systems, but also helps enterprises make operational decisions by visualizing user portrait data. The system is divided into three modules: a data preprocessing and storage module, a user portrait building module, and a data visualization module. The core is the user portrait building module. Working on massive user data, this module uses the Spark framework to compute three kinds of user portrait tags in parallel, namely statistical tags, matching tags and mining tags, which addresses the performance bottleneck that traditional data processing methods encounter on massive data. The thesis focuses on the design and implementation of mining tags. To improve the computing performance of mining tags, the parallelization of the Softmax algorithm is studied on top of Spark's parallel computing capability; this solves the multi-class classification problem on massive data and improves the computing performance of user churn risk tags. In addition, the thesis studies the parallelization of a naive Bayes classification algorithm weighted by TF-IDF and mutual information, which improves on the traditional naive Bayes algorithm through weighting, and applies it to the parallel computation of user comment category tags on the Spark framework. Experiments are carried out on the Spark platform, where the mining tag models are trained and used for prediction in a distributed, data-parallel manner, and the effectiveness and execution efficiency of the models are evaluated. Finally, the data visualization module is designed and implemented with the Spring, Spring MVC, MyBatis and ECharts frameworks, and it presents the user portrait data in visual form.
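The data-parallel Softmax training mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it uses plain Python lists in place of Spark partitions, and the function names, toy data, learning rate and epoch count are all assumptions chosen for illustration. Each partition computes its local gradient (the map step), the partial gradients are summed (the reduce step), and one global weight update follows.

```python
import math
from functools import reduce

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(weights, x):
    """Linear score of every class; weights[k][j] belongs to class k, feature j."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def partition_gradient(weights, partition):
    """Map step: cross-entropy gradient summed over one partition's samples."""
    grad = [[0.0] * len(weights[0]) for _ in weights]
    for x, y in partition:
        probs = softmax(class_scores(weights, x))
        for k, p in enumerate(probs):
            err = p - (1.0 if k == y else 0.0)
            for j, xj in enumerate(x):
                grad[k][j] += err * xj
    return grad

def add_gradients(a, b):
    """Reduce step: element-wise sum of two gradient matrices."""
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]

def train(partitions, n_classes, n_features, lr=0.1, epochs=300):
    """Batch gradient descent; only the gradients cross partition boundaries."""
    weights = [[0.0] * n_features for _ in range(n_classes)]
    n = sum(len(p) for p in partitions)
    for _ in range(epochs):
        partials = [partition_gradient(weights, p) for p in partitions]
        total = reduce(add_gradients, partials)
        weights = [[w - lr * g / n for w, g in zip(rw, rg)]
                   for rw, rg in zip(weights, total)]
    return weights

def predict(weights, x):
    """Return the index of the highest-scoring class."""
    scores = class_scores(weights, x)
    return scores.index(max(scores))

# Hypothetical toy data: (features, label), with a leading constant 1.0 as bias.
DATA = [
    ([1.0, 2.0, 0.0], 0), ([1.0, 3.0, 0.0], 0),
    ([1.0, -1.0, 2.0], 1), ([1.0, -1.5, 3.0], 1),
    ([1.0, -1.0, -2.0], 2), ([1.0, -1.5, -3.0], 2),
]
# Three "partitions" standing in for the partitions of a distributed dataset.
PARTITIONS = [DATA[0:2], DATA[2:4], DATA[4:6]]

weights = train(PARTITIONS, n_classes=3, n_features=3)
print([predict(weights, x) for x, _ in DATA])
```

Because the summed gradient does not depend on how the samples are split, training on one partition or on several yields the same weights up to floating-point rounding. On a cluster, the list comprehension and `reduce` call would correspond to an aggregation over a distributed dataset, with the current weights shipped to the workers at each iteration.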

Keywords/Search Tags:

Big Data, User Portrait, Tags, Distributed, Parallelization

Related items

1	A User Portrait Method Based On Multi-Source Weighted Fusion
2	Design Of User Portrait System Based On Big Data
3	Design And Implementation Of User Portrait System Based On Microblog Data
4	Research On User Portrait Application Based On Flink E-Commerce Operation Platform
5	Design And Implementation Of Data Service System Based On User Portrait
6	Design And Implementation Of OTA User Portrait Based On Big Data Technology
7	Research And Implementation Of User Portrait Algorithms Based On Personal Data
8	Research On The Application Of Intelligent Deduction In User Portrait
9	Analysis And Research Of User Portrait Construction Algorithm Based On Behavior Data
10	Research And Implementation Of User Portrait System Based On Weibo Data