Research And Implementation Of Telecom User Portrait Based On Spark

Posted on: 2021-04-14
Degree: Master
Type: Thesis
Country: China
Candidate: M Hua
Full Text: PDF
GTID: 2428330629486195
Subject: Computer technology
Abstract/Summary:
User portraits (user personas) are an important technology in the field of recommendation. They were first used in Internet e-commerce: by mining users' browsing behavior with big data or machine learning methods, a system generates personalized tags for each user, thereby building a portrait of that user, and then recommends products the customer is likely to favor on the basis of these tags. The telecom user portrait system in this work is designed and implemented using distributed storage, parallel computing, and other big data technologies, according to the actual precision-marketing needs of a provincial telecom operator. Taking the operator's desensitized data as the data source, the work gathers and organizes multi-faceted user data and uses the implemented system to construct portraits of telecom users along multiple dimensions, accurately capturing users' characteristics and needs. It then mines the hidden value and general patterns in the data and applies them to product recommendation, precisely targeted advertising, and personalized customer service, all of which are of practical significance to operators, especially in carrying out precision marketing and in shifting from network operation to data operation.

This paper first introduces the current state of user portrait applications, and points out the problems of traditional methods in analyzing massive data and the difficulties traditional telecom operators face in meeting the increasingly differentiated needs of a massive user base. On this foundation, it proposes the research and implementation of a telecom user portrait based on Spark. Secondly, it systematically describes the research and application of big data systems in terms of data storage, data computation, and data presentation, covering technologies such as Spark SQL, Spark Streaming, and the Hadoop Distributed File System (HDFS).

The paper then describes the basic principles and algorithms of text clustering. Compared with the traditional k-means clustering algorithm, the PK-means algorithm in this paper not only reduces the computing resources needed during training but also retains the main features, which gives the algorithm good generalization ability. The paper also reviews recent research on BERT (Bidirectional Encoder Representations from Transformers), Google's pre-trained neural network language model, which performs well at capturing semantic and syntactic relationships between texts and word polysemy in natural language, and which has attracted great attention in the field.

Next, experiments are carried out on telecommunication data: the raw data are filtered and preprocessed, the improved clustering algorithm PK-means is tested, and its results are analyzed and compared with those of the traditional algorithm. The test results show that the method significantly improves computing speed and clustering quality. Finally, based on these results, the paper completes the corresponding visualization software for the telecom user portrait application system.
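
As an illustration of the baseline that the abstract compares against, the following is a minimal sketch of standard k-means (Lloyd's algorithm) applied to short texts. It is not the thesis's PK-means, whose details the abstract does not give; the sample documents, the term-frequency vectorization, and all function names are assumptions made for this example only.

```python
import math
from collections import Counter


def tf_vectors(docs):
    """Turn each document into an L2-normalized term-frequency vector."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = []
    for d in docs:
        counts = Counter(d.lower().split())
        vec = [float(counts[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, init_indices, max_iter=100):
    """Plain k-means; initial centroids are the points at init_indices."""
    centroids = [list(points[i]) for i in init_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy documents standing in for user-behavior text.
docs = [
    "data plan upgrade data traffic",
    "data traffic package",
    "voice call minutes",
    "voice call roaming minutes",
]
labels = kmeans(tf_vectors(docs), init_indices=[0, 2])
print(labels)
```

The initial centroids are fixed here so the run is deterministic; practical implementations usually choose them randomly or with a seeding scheme such as k-means++.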
Keywords/Search Tags: Persona, Spark, BERT, text clustering