Research And Implementation Of Distributed Data Processing System

Posted on: 2014-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: S S Zhu
Full Text: PDF
GTID: 2248330395484035
Subject: Computer system architecture
Abstract/Summary:
With the widespread use of networks and the accelerating rate of data generation, processing data effectively has become an important topic. Traditional data processing systems take a static, finite dataset as their processing object and obtain results by calling a static query script. Now that data is becoming continuous, rapid, and time-varying, the shortcomings of traditional data processing systems are gradually becoming apparent, and such systems can no longer meet the needs of modern applications. These new application settings demand higher functionality and performance from database systems, as well as automated and intelligent data processing.

The thesis first analyzes the state of research on distributed data processing systems domestically and internationally. It then designs a distributed data processing system scheme and discusses its two key technologies. Finally, the scheme is applied in a packet transmission system.

Two key technologies of distributed data processing are studied in this thesis: continuous query and load shedding. Unlike a traditional static query, a continuous query executes continuously and outputs results while data sources feed the system without interruption. The thesis studies the characteristics of continuous queries, analyzes the allocation of memory space, and then designs a dynamic sliding-window query algorithm based on a greedy strategy to improve the efficiency of continuous query processing.

Load shedding is a method for handling the load fluctuations caused by irregular data flow rates. The arrival rate of a data stream is usually unpredictable. When the input rate exceeds system capacity, the system becomes overloaded and its performance deteriorates. The thesis designs an intelligent ant-based algorithm that finds the optimal path satisfying the given conditions and makes full use of idle compute nodes.
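The two ideas named in the abstract, a continuous query over a sliding window and load shedding under overload, can be illustrated with a minimal Python sketch. This is not the thesis's greedy dynamic sliding-window algorithm or its ant-based scheduler, neither of which is specified in the abstract. It assumes a count-based window, a running-average query, and uniform random shedding, and all names (`shed_load`, `SlidingWindowAverage`, `run`) are hypothetical.

```python
import random
from collections import deque


def shed_load(batch, capacity, rng):
    """Random load shedding (assumed policy): keep at most `capacity` tuples."""
    if len(batch) <= capacity:
        return list(batch)
    # Sample indices and sort them so surviving tuples stay in arrival order.
    kept = sorted(rng.sample(range(len(batch)), capacity))
    return [batch[i] for i in kept]


class SlidingWindowAverage:
    """Continuous query: average of the most recent `size` values."""

    def __init__(self, size):
        self.window = deque(maxlen=size)  # oldest value is evicted automatically
        self.total = 0.0

    def push(self, value):
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]  # subtract the value about to be evicted
        self.window.append(value)
        self.total += value
        return self.total / len(self.window)  # a result is emitted per arrival


def run(batches, capacity, window_size, seed=0):
    """Feed batches of arrivals through shedding, then through the query."""
    rng = random.Random(seed)
    query = SlidingWindowAverage(window_size)
    results = []
    for batch in batches:
        for value in shed_load(batch, capacity, rng):
            results.append(query.push(value))
    return results
```

Each call to `push` emits an updated result, which is what distinguishes a continuous query from a one-shot static query. Shedding trades result accuracy for bounded per-batch work when arrivals exceed `capacity`.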
Keywords/Search Tags: distributed system, data stream, continuous query, load shedding