Research And Implementation Of Telecom User Portrait Based On Spark

Posted on:2021-04-14

Degree:Master

Type:Thesis

Country:China

Candidate:M Hua

GTID:2428330629486195

Subject:Computer technology

Abstract/Summary:

User portraits (personas) are an important technique in the field of recommendation. They were first used in Internet e-commerce: by mining users' browsing behavior with big data or machine learning methods, a system generates personalized tags for each user, building up a portrait, and then recommends products the user is likely to prefer on the basis of those tags. The telecom user portrait system in this thesis is designed and implemented with distributed storage, parallel computing, and other big data technologies, according to the actual precision-marketing needs of a provincial telecom operator. Taking desensitized data from that operator as the data source, the system gathers and organizes multi-faceted user data, constructs portraits of telecom users along multiple dimensions, captures users' characteristics and needs, and mines the hidden value and general patterns in the data. The results are applied to product recommendation, precisely targeted advertising, and personalized customer service, all of which have practical significance for operators, especially in carrying out precision marketing and in shifting from network operation to data operation.

This thesis first introduces the current state of user portrait applications and points out the problems of traditional methods in analyzing massive data, as well as the difficulties traditional telecom operators face in meeting the increasingly differentiated needs of a massive user base. On this foundation, it proposes the research and implementation of a telecom user portrait based on Spark. Secondly, it systematically describes the research and application of big data systems in terms of data storage, data processing, and data presentation, covering technologies such as Spark SQL, Spark Streaming, and the Hadoop Distributed File System (HDFS).

The thesis then describes the basic principles and algorithms of text clustering. Compared with the traditional k-means clustering algorithm, the PK-means algorithm presented here not only reduces the computing resources needed during training but also retains the main features, which gives the algorithm good generalization ability. The thesis also analyzes Google's recent pre-trained neural network language model, BERT (Bidirectional Encoder Representations from Transformers), which performs well at mining semantic and syntactic relationships between texts and at handling polysemy in natural language, and which has attracted great attention in the field.

Next, experiments are carried out on telecom data: the raw data is filtered and preprocessed, the improved PK-means clustering model is tested, and its results are analyzed and compared with those of the traditional algorithm. The test results show that the method significantly improves both computing speed and clustering quality. Finally, based on these results, the thesis completes the corresponding visualization software for the telecom user portrait application system.
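For orientation, the traditional k-means baseline that the abstract compares against can be sketched roughly as follows. This is a minimal single-machine illustration in plain Python, not the thesis's PK-means algorithm (which the abstract does not specify) and not a Spark implementation. The sample documents, the function names (`tfidf_vectors`, `squared_distance`, `kmeans`), the whitespace tokenizer, and the choice of initial centroids are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return L2-normalized TF-IDF vectors for whitespace-tokenized documents."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF keeps the weight of every occurring term positive.
        vec = [tf[t] * (math.log((1 + n_docs) / (1 + df[t])) + 1.0) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, init_indices, max_iter=100):
    """Plain Lloyd-style k-means; initial centroids are copies of the chosen vectors."""
    centroids = [list(vectors[i]) for i in init_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments unchanged, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical sample texts, invented for illustration only.
docs = [
    "monthly data plan upgrade",
    "international voice call package",
    "extra data plan for streaming",
    "voice call minutes package",
]
labels, _ = kmeans(tfidf_vectors(docs), init_indices=(0, 1))
print(labels)
```

With these sample texts, the two data-plan documents share vocabulary and end up in one cluster, while the two voice-package documents end up in the other. A distributed version would more likely rely on a library implementation such as Spark MLlib's k-means rather than hand-written loops.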

Keywords/Search Tags:

Persona, Spark, BERT, text clustering

Related items

1	System for persona ensemble clustering: A cluster ensemble approach to persona development
2	Research Of Parallel Text Spectral Clustering Algorithm Based On Spark
3	A Research About DBSCAN Text Clustering Based On Spark Platform
4	Research On Chinese Text Summarization Technology Based On BERT-KA-PGN Model
5	Research And Application Of Feature Selection And Text Representation In Text Clustering
6	A Study On Persona Of Take-out Platform Based On The Tags
7	Research On Text Sentiment Analysis Method Based On BERT And Hybrid Neural Network
8	Research On Contextual Text Retrieval Technology Based On BERT And Text Segmentation
9	Research And Implementation Of Parallel Text Clustering Based On MapReduce
10	Research Of The Clustering Algorithm Based On The Spark