
The Design And Implementation Of Business Service System For Log Analysis Based On Big Data Technology

Posted on: 2019-07-23
Degree: Master
Type: Thesis
Country: China
Candidate: J W Gu
Full Text: PDF
GTID: 2428330566986574
Subject: Computer Science and Technology
Abstract/Summary:
In the era of big data, many business services rely on logs to monitor system health or to mine deeper value. Massive log data is generated continuously and grows exponentially, and traditional data processing and analysis technologies struggle to meet the performance requirements of computing and query services. Distributed, parallel big data technology can make full use of multi-machine, multi-core hardware resources and has gradually gained favor in both academia and industry for log analysis services. First, log data usually has time-series and streaming characteristics, and its attributes carry specific meanings. Second, when a business process is built, the stages of business processing and the relationships among them reflect the correspondence between the underlying task flow and data flow. In addition, the agile development and production deployment of big data projects has long been a major concern for organizations and enterprises. To process and manage massive log data and to develop specific business applications efficiently, with both performance and generality in mind, this thesis designs and implements a business service system for big data analysis based on the distributed computing framework Spark. The major contributions are as follows:

(1) According to the characteristics of log generation, access, and processing, this thesis proposes a hierarchical system architecture and designs three loosely coupled functional modules, DSService, SparkServer, and MonitorServer, each of which supports distributed services. The communication and invocation methods between the layers of the architecture, the functional modules, and the services are designed to support task flow (workflow) management and scheduling, and to provide fault tolerance, high efficiency, and scalability for each service.

(2) With Spark DataSet, this thesis unifies the application pattern of big data batch processing and stream processing, forms a business workflow system that contrasts data flows and task flows, and ultimately realizes a unified development and deployment mode that supports data pipeline modeling.

(3) An integrated SDK is provided to hide the complex underlying operations and to support service registration and discovery, disaster tolerance, and system monitoring. Combined with the management platform, this thesis provides users with an integrated business application design flow covering data access, development, deployment, and visualization, thereby facilitating the rapid integration and implementation of business service applications.

Finally, this thesis conducts three benchmark tests: data access, task computation, and data query. The results show that the basic big data services provided by the system have excellent performance and scalability. Two specific business applications then verify the versatility and practicality of the system for analyzing business services based on big data logs.
Keywords/Search Tags: Spark, distributed services, data pipeline modeling, integrated SDK
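
The idea in contribution (2), one pipeline definition shared by batch and stream processing, can be illustrated with a small sketch. This is not the thesis's implementation and does not use Spark: it is plain Python with the standard library only, and the log line layout (`timestamp service level message`), the stage names, and the helpers `build_pipeline`, `run_batch`, and `run_stream` are all hypothetical, chosen only to show the pattern.

```python
from collections import Counter

# Hypothetical sample logs: "timestamp service level message".
LOGS = [
    "2019-07-23T10:00:00 auth ERROR login failed",
    "2019-07-23T10:00:01 pay INFO charge ok",
    "2019-07-23T10:00:02 auth ERROR token expired",
    "2019-07-23T10:00:03 pay ERROR gateway timeout",
    "malformed",
]

def parse(lines):
    """Stage 1: turn raw lines into records, skipping malformed lines."""
    for line in lines:
        parts = line.split(" ", 3)
        if len(parts) == 4:
            ts, service, level, msg = parts
            yield {"ts": ts, "service": service, "level": level, "msg": msg}

def only_errors(records):
    """Stage 2: keep only ERROR-level records."""
    return (r for r in records if r["level"] == "ERROR")

def build_pipeline(*stages):
    """Compose stages into one callable; the data flow is declared once."""
    def run(source):
        data = source
        for stage in stages:
            data = stage(data)
        return data
    return run

# The same pipeline object serves both execution modes below.
pipeline = build_pipeline(parse, only_errors)

def run_batch(lines):
    """Batch mode: process a finite collection and return error counts per service."""
    return Counter(r["service"] for r in pipeline(lines))

def run_stream(micro_batches):
    """Stream mode: process micro-batches as they arrive, yielding running totals."""
    totals = Counter()
    for batch in micro_batches:
        totals.update(r["service"] for r in pipeline(batch))
        yield dict(totals)

if __name__ == "__main__":
    print(run_batch(LOGS))
    for snapshot in run_stream([LOGS[:2], LOGS[2:]]):
        print(snapshot)
```

In this sketch, only the driver (`run_batch` or `run_stream`) differs between the two modes, while the stages are written once. In a Spark-based system the analogous role would be played by transformations defined over a common dataset abstraction, but the details of how the thesis does this are not given in the abstract.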