
Design And Implementation Of A Cloud Based Customer Retention System For China Unicom

Posted on: 2019-06-12
Degree: Master
Type: Thesis
Country: China
Candidate: Z Wei
Full Text: PDF
GTID: 2428330590975230
Subject: Software engineering
Abstract/Summary:
With the rapid development of cloud computing and Internet technology, improving or replacing existing systems with open-source software and distributed computing and storage has become a trend, and traditional carriers such as China Unicom are part of this technological shift. China Unicom's customer retention system is a subsystem of its centralized service support system. It performs statistical analysis on customer information, detailed customer bills, customer payment records, and other consumption data from the China Unicom database, builds statistical models of customer loyalty, satisfaction, and credit, and generates reports that identify influences on customers and predict customer churn, so that account managers can take steps to prevent churn and retain customers. The system currently runs on a traditional IT architecture, which scales poorly, overloads the database, imposes a hard performance ceiling, and delivers a poor user experience.

This thesis focuses on using the mainstream distributed computing engine Apache Spark, together with related components of the Hadoop ecosystem, to build a linearly scalable cloud computing platform that integrates distributed computing, distributed storage, and load balancing, and on migrating the existing Unicom customer retention system onto this cloud platform. The research work is divided into a real-time business part and a non-real-time business part. The main work is as follows.

The real-time business part covers data collection, stream computing, and data storage. The data collection component connects the existing system to the new cloud system. Because real-time business data has strict latency requirements, with only about one second of delay allowed, structured data replication software for Oracle backups extracts real-time increments from the existing system's data files, a Kafka cluster produces the incremental messages, and the Spark cluster of the new cloud system consumes the Kafka messages in real time, so that incremental changes are reflected immediately. The stream computing component uses Spark's streaming technology to process the data streams, including raw-data filtering, data parsing and processing, valid-data filtering, timed processing, and warehousing; for abnormal situations such as lost or erroneous data, a data recalculation process is provided. The data storage component uses an HBase-plus-Redis scheme, in which Redis holds temporary data for 48 hours and HBase holds the persistent data.

The non-real-time business part covers data collection, dynamic programming, process orchestration, and a process execution engine. Because non-real-time business is less demanding on data latency, Spark-SFTP automatically extracts data source files and distributes them to the HDFS (distributed file system) cluster of the new cloud system. The dynamic programming component uses Javassist bytecode technology to convert external data sources into DataFrame-structured data that Spark SQL can process, thereby restoring the business logic of the original SQL. The process orchestration component replaces the complex business logic of the existing system (stored procedures) with simple process configuration: an independent web system, built with the open-source Xiorkflow JS framework and Tapestry, lets business processes be designed through a simple, configurable interface. The process execution engine runs as a resident process that loads executable processes in real time, parses the task nodes of each process, orders the tasks as a directed acyclic graph, and executes the individual tasks in Spark's thread pool.

Finally, the Spark-based cloud system is used to verify the feasibility and advantages of the related research, demonstrating the effectiveness and practicality of this work.
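The stream-computing stage described above (raw-data filtering, parsing, validity checks, and a recalculation path for lost or erroneous records) can be sketched as plain functions. This is a minimal illustration only: the comma-separated message format, the field names, and the "empty user_id" error rule are assumptions for the sketch, not the thesis's actual schema.

```python
def parse_record(raw):
    """Parse one incremental message 'user_id,field,value'.

    Returns None for malformed input (the raw-data filtering step).
    """
    parts = raw.strip().split(",")
    if len(parts) != 3:
        return None  # drop malformed lines
    user_id, field, value = parts
    return {"user_id": user_id, "field": field, "value": value}


def process_batch(raw_records):
    """Filter, parse, and validate a micro-batch of messages.

    Records that parse but fail validation are routed to a
    recalculation queue, mirroring the abstract's handling of
    erroneous data.
    """
    valid, recalc = [], []
    for raw in raw_records:
        rec = parse_record(raw)
        if rec is None:
            continue  # malformed: filtered out entirely
        if rec["user_id"] == "":  # assumed error condition for the sketch
            recalc.append(raw)    # queue for recalculation
        else:
            valid.append(rec)     # ready for timed warehousing
    return valid, recalc


batch = ["1001,balance,35.5", "garbled line", ",balance,12.0"]
valid, recalc = process_batch(batch)
print(len(valid), len(recalc))  # prints: 1 1
```

In the actual system these steps would run inside Spark streaming over micro-batches consumed from Kafka; the pure-function form here only shows the filtering and routing logic.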
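The HBase-plus-Redis storage split, with Redis as a 48-hour hot store and HBase as the persistent store, implies a read path that tries the hot tier first and falls back on expiry. A toy model, with in-memory dicts standing in for Redis and HBase and an injectable clock (the key format and fallback rule are illustrative assumptions):

```python
import time

HOT_TTL_SECONDS = 48 * 3600  # Redis retention window from the abstract: 48 hours


class TieredStore:
    """Toy model of the Redis (hot, TTL-bounded) + HBase (persistent) split."""

    def __init__(self, now=time.time):
        self.now = now
        self.hot = {}         # key -> (value, written_at); stands in for Redis
        self.persistent = {}  # key -> value; stands in for HBase

    def put(self, key, value):
        # Writes go to both tiers: hot for fast recent reads, persistent for keeps.
        self.hot[key] = (value, self.now())
        self.persistent[key] = value

    def get(self, key):
        entry = self.hot.get(key)
        if entry is not None:
            value, written_at = entry
            if self.now() - written_at < HOT_TTL_SECONDS:
                return value          # hot hit within the 48-hour window
            del self.hot[key]         # expired: evict from the hot tier
        return self.persistent.get(key)  # fall back to the persistent store


store = TieredStore()
store.put("cust:1001", {"status": "at-risk"})
print(store.get("cust:1001"))  # prints: {'status': 'at-risk'}
```

Real Redis would enforce the 48-hour window itself via key TTLs, and HBase reads would go over its client API; the dict version only demonstrates the tiering decision.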
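The process execution engine orders task nodes as a directed acyclic graph before running them. The ordering step can be sketched with Kahn's topological sort; the task names and dependency map below are hypothetical, and execution in Spark's thread pool is elided.

```python
from collections import deque


def execution_order(tasks, deps):
    """Order the task nodes of a process DAG so that each task runs
    only after all of its dependencies (Kahn's algorithm)."""
    indegree = {t: 0 for t in tasks}
    dependents = {t: [] for t in tasks}  # dependency -> tasks that wait on it
    for t in tasks:
        for d in deps.get(t, []):
            indegree[t] += 1
            dependents[d].append(t)

    ready = deque(t for t in tasks if indegree[t] == 0)
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)  # in the real engine, submit t to the thread pool here
        for nxt in dependents[t]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(tasks):
        raise ValueError("process definition contains a cycle; not a DAG")
    return order


# Hypothetical process: extract -> transform -> {report, archive}
tasks = ["extract", "transform", "report", "archive"]
deps = {"transform": ["extract"], "report": ["transform"], "archive": ["transform"]}
print(execution_order(tasks, deps))  # prints: ['extract', 'transform', 'report', 'archive']
```

Independent tasks (here, report and archive) surface in the ready queue together, which is what lets the engine run them concurrently in a thread pool.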
Keywords/Search Tags: Spark, Kafka, Big data processing