The Design And Implementation Of A Distributed Tracing System Named TraceUI

Posted on:2020-11-02

Degree:Master

Type:Thesis

Country:China

Candidate:X Y Xia

Full Text:PDF

GTID:2518305732475264

Subject:Master of Engineering

Abstract/Summary:

With the rapid development of Internet technology, the services behind an Internet application no longer run on a single physical machine. They are usually deployed on different servers and call one another via RPC (Remote Procedure Call). When the response time of an application becomes longer, developers or maintainers have to search the logs of each related service one by one to locate the performance bottleneck. However, these logs are often spread across thousands of servers and even across multiple data centers. For application developers, a distributed tracing system is therefore one of the essential functions of a monitoring platform. At present, most mainstream distributed tracing systems are designed for the business scenarios of their respective companies, and other companies that try to adopt them face problems such as a fixed log format, limited functionality, difficult maintenance, and security that is hard to evaluate.

In this context, this thesis draws on the ideas of Google Dapper and designs and implements the distributed tracing system TraceUI for our own requirements and production environment. Redesigning the logging API and the Span structure solves the problem of the fixed log format and lays a foundation for later functional extension. Designing different samplers reduces the resource consumption of logging in high-concurrency scenarios, so that the main services can still operate normally. Using AOP (Aspect Oriented Programming) for instrumentation reduces intrusion into the business code and eases system maintenance. TraceUI uses Kafka as the message publishing system, Elasticsearch to build indexes, HBase for distributed storage, Spark Streaming for stream computing, and Spring Boot and AngularJS for front-end development. Logs from different servers are collected through Flume and Kafka and, combined with mature distributed middleware, this yields a low-consumption, low-latency, high-performance distributed tracing system.

TraceUI not only obtains detailed information about each service call chain and accurately locates faults, but also shows global application dependencies, giving developers a higher-level view for designing and optimizing the overall application. The system has been deployed in the company's production environment, where it continuously monitors the status of application services and has become an essential part of the monitoring platform.
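
The three design ideas named in the abstract (a Span with an open-ended field set, a sampler that limits logging overhead, and instrumentation that stays out of the business code) can be illustrated with a minimal sketch. It is not TraceUI's actual API: the names `Span`, `ProbabilitySampler`, `Tracer`, and `traced` are hypothetical, a fixed-probability sampler stands in for whatever samplers the thesis designs, and a Python decorator stands in for the AOP interception the thesis performs. The sketch is single-threaded and reports spans to an in-memory list rather than to Kafka.

```python
import random
import time
import uuid
from dataclasses import dataclass, field
from functools import wraps
from typing import Callable, Dict, List, Optional


@dataclass
class Span:
    """One timed operation in a trace; `tags` carries free-form key/value data."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    name: str
    start: float
    end: Optional[float] = None
    tags: Dict[str, str] = field(default_factory=dict)


class ProbabilitySampler:
    """Hypothetical sampler: keeps roughly `rate` of all traces."""

    def __init__(self, rate: float, rng: Optional[random.Random] = None):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rate must be in [0, 1]")
        self.rate = rate
        self.rng = rng or random.Random()

    def should_sample(self) -> bool:
        return self.rng.random() < self.rate


class Tracer:
    """Wraps functions in spans without touching their bodies (AOP-like)."""

    def __init__(self, sampler: ProbabilitySampler, report: Callable[[Span], None]):
        self.sampler = sampler
        self.report = report
        self._stack: List[Span] = []  # active spans of the current trace
        self._depth = 0               # nesting depth of traced calls
        self._sampled = False

    def traced(self, name: str):
        def decorator(func):
            @wraps(func)
            def wrapper(*args, **kwargs):
                if self._depth == 0:
                    # The outermost call decides for the whole trace.
                    self._sampled = self.sampler.should_sample()
                self._depth += 1
                try:
                    if not self._sampled:
                        return func(*args, **kwargs)
                    return self._run_in_span(name, func, args, kwargs)
                finally:
                    self._depth -= 1
            return wrapper
        return decorator

    def _run_in_span(self, name, func, args, kwargs):
        parent = self._stack[-1] if self._stack else None
        span = Span(
            trace_id=parent.trace_id if parent else uuid.uuid4().hex,
            span_id=uuid.uuid4().hex[:16],
            parent_id=parent.span_id if parent else None,
            name=name,
            start=time.time(),
        )
        self._stack.append(span)
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            span.tags["error"] = repr(exc)
            raise
        finally:
            span.end = time.time()
            self._stack.pop()
            self.report(span)


collected: List[Span] = []
tracer = Tracer(ProbabilitySampler(1.0), collected.append)


@tracer.traced("query_db")
def query_db():
    return "row"


@tracer.traced("handle_request")
def handle_request():
    return query_db()


handle_request()
```

With a sampling rate of 1.0, the call above reports two spans that share one trace ID, and the `query_db` span points to the `handle_request` span as its parent. Lowering the rate drops whole traces at the outermost call, which is one simple way to bound logging cost under high concurrency.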

Keywords/Search Tags:

RPC, tracing system, Google Dapper, Kafka, ElasticSearch, HBase, SparkStreaming

Related items

1	Design And Implementation Of Geographic Name And Poi Data Retrieval System Based On Elasticsearch
2	Design And Implementation Of HBase Data Management Platform
3	The EAST Experimental Data Access Log Analysis System Based On Big Data Technology
4	Design And Implementation Of Storage System For Intelligent Application Of Radio And Television
5	Research On GNSS Data Storage And Retrieval Based On HBASE
6	A HBase Based Massive Remote Sensing Metadata Search System
7	Design And Implementation Of Log Real-time Monitoring System Based On SparkStreaming
8	The Design And Implementation Of Distributed Tracing System
9	Design And Implementation Of The Bipartite-network Community Discovery System In Long Time Scale
10	Design And Implementation Of Enterprise Back-end Log Analysis System