The Design And Application Of Stream Computing Model For Massive Data

Posted on: 2017-12-15
Degree: Master
Type: Thesis
Country: China
Candidate: S X Guo
Full Text: PDF
GTID: 2428330488979865
Subject: Software engineering
Abstract/Summary:
With the rapid spread of emerging technologies and application models such as cloud computing, the mobile internet, and social networks, data volumes are growing quickly and the era of massive data has arrived. Extracting valuable information from massive data is both a challenge and an opportunity for big data computing. At the same time, data loses value over time, so it must be processed as soon as possible. For example, stock exchange data is real-time, massive, and continuous, and the traditional batch computing model is no longer applicable to it. The stream computing model (also called flow computing) was proposed to meet the need for real-time computation over massive data. It treats data as a stream: a sequence of data records, the smallest units of the stream, that is unbounded in time and quantity. The changing streams are analyzed in real time to detect valuable information and produce results. This thesis centers on the application and optimization of the stream computing model. It first analyzes the current state of research on stream computing systems. Then, considering the characteristics of data streams (real-time arrival, volatility, burstiness, randomness, and unboundedness), it discusses the limitations of existing massive data stream computing systems in handling bursts, fault tolerance, and load balancing. On that basis, it establishes a stream computing model with high flexibility and reliability. The model abandons the static routing strategy, which fixes the number of nodes and the routing relation between data and nodes at startup, and adopts a dynamic routing strategy that organizes routing information into groups and accesses it by number and group. The grouped routing information can be modified freely through a provided interface, which greatly increases flexibility and allows the model to handle bursts correctly. In addition, the model ensures reliability by tracking the execution path of each message: when a message is lost because a node fails, a retransmission mechanism is triggered so that the loss is avoided. An XOR-like tracking strategy that limits the memory consumed per message keeps the tracking cost low. Finally, the model is implemented in a practical application, an abandoned-ticket disposal system. The resulting system effectively monitors, locates, and reports exceptions and abandoned tickets, and thus reduces billing losses. Field data validate the effectiveness of the dynamic routing strategy in handling data bursts and lowering delay.
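The abstract does not spell out the XOR-like tracking strategy, so the following is only a minimal sketch of the general idea, under the assumption that each message carries a random 64-bit id and that every message derived from the same source record is folded into one running XOR value. The class `XorTracker` and the functions `new_id`, `emit`, `ack`, and `unfinished` are hypothetical names chosen for illustration and are not taken from the thesis.

```python
import secrets


def new_id():
    # Hypothetical helper: a random non-zero 64-bit message id.
    return secrets.randbits(64) or 1


class XorTracker:
    """Tracks each message tree with a single integer of state (assumed design)."""

    def __init__(self):
        # root id -> XOR of the ids of all messages still outstanding
        self._pending = {}

    def emit(self, root_id, msg_id):
        # Register a message that belongs to the tree rooted at root_id.
        self._pending[root_id] = self._pending.get(root_id, 0) ^ msg_id

    def ack(self, root_id, msg_id, child_ids=()):
        # Acknowledge msg_id and register the children it produced in one step.
        value = self._pending[root_id] ^ msg_id
        for child in child_ids:
            value ^= child
        if value == 0:
            # Every id was XORed in twice, so the whole tree is processed.
            del self._pending[root_id]
            return True
        self._pending[root_id] = value
        return False

    def unfinished(self):
        # Roots that are not fully acknowledged; after a timeout these would
        # be the candidates for retransmission.
        return list(self._pending)
```

Because each id is XORed in once when the message is emitted and once when it is acknowledged, the running value returns to zero only when every message in the tree has been processed (barring an unlikely collision of random ids), and the memory used per source record stays constant regardless of how many messages it spawns.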
Keywords/Search Tags:Massive Data, Batch computing, Flow computing, Dynamic routing, Message-level fault tolerance