Data Stream Disorder Processing Technique Based On Time Correlation

Posted on: 2019-01-03
Degree: Master
Type: Thesis
Country: China
Candidate: X B Wei
GTID: 2428330545966142
Subject: Computer system architecture
Abstract/Summary:
With the continued development and application of Internet and big data technology, data streams have become ubiquitous. A data stream is unbounded, continuous, dynamic and generated in real time, so its analysis must deliver results quickly for them to remain useful. Disorder in a data stream, however, can cause processing results to be lost, which poses a major challenge for data stream analysis. The similarity join of data streams is an important basic operation that is often used in data stream mining and analysis, and disorder seriously damages the completeness of its results. This thesis studies the similarity join of data streams under sliding window semantics and discusses quality-driven methods for join processing over out-of-order data streams. The following work is carried out:

(1) We propose QJoin, a quality-driven join processing technique for out-of-order data streams. QJoin uses buffering and a symmetric join strategy so that stream tuples can be analyzed in real time, which reduces the average waiting time of stream tuples and raises the processing rate of sliding-window joins over out-of-order streams. Following the quality-driven concept, the buffer size is tuned using statistics collected during join processing, so that the memory used to buffer late tuples and historical data is reduced while the user's result-quality requirement is still met, which lowers the system's memory overhead. Experiments on real data sets show that, compared with the traditional data stream processing technique MP-K-slack, QJoin both keeps the processing of stream tuples real-time and significantly reduces memory usage while meeting the user's quality requirement.

(2) The dynamic characteristics of data streams are analyzed, and a load shedding strategy is proposed on top of QJoin's bounded buffer to handle sustained system overload. When the input rate is too high, redundant tuples are filtered out based on time correlation, which reduces system load and improves the system's ability to cope with sustained overload. Experiments with dynamically changing input rates show that the QJoin-based load shedding strategy handles sustained overload effectively.

Starting from the real-time processing needs of data stream applications and users' result-quality requirements, this thesis proposes QJoin, a quality-driven join processing technique for out-of-order data streams. It reduces the system's memory overhead while satisfying the user's quality requirement, improves the query efficiency of data stream similarity joins, and offers an effective solution to similarity joins over out-of-order data streams under sliding window semantics. It can support a wide range of applications, such as object tracking in video streams, trend monitoring and harmony analysis. The research has some scientific significance and application value.
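For readers unfamiliar with the setting, the sketch below illustrates the general idea of buffering out-of-order tuples before a symmetric sliding-window join. It is not QJoin and not the MP-K-slack baseline from the thesis: it is a plain fixed-size K-slack buffer with an equality match standing in for the similarity predicate. The names `KSlackBuffer` and `symmetric_window_join`, and the parameters `k`, `window` and `match`, are hypothetical and chosen only for illustration.

```python
import heapq


class KSlackBuffer:
    """Holds each tuple until the largest timestamp seen so far is at
    least k time units ahead of it, then releases tuples in timestamp
    order. The fixed k is an assumption of this sketch; the abstract
    describes tuning the buffer size from runtime statistics instead."""

    def __init__(self, k):
        self.k = k
        self.max_ts = float("-inf")
        self.heap = []
        self.seq = 0  # tie-breaker so payloads are never compared

    def insert(self, ts, item):
        self.max_ts = max(self.max_ts, ts)
        heapq.heappush(self.heap, (ts, self.seq, item))
        self.seq += 1
        released = []
        while self.heap and self.heap[0][0] <= self.max_ts - self.k:
            t, _, it = heapq.heappop(self.heap)
            released.append((t, it))
        return released

    def flush(self):
        """Release everything still buffered, in timestamp order."""
        released = []
        while self.heap:
            t, _, it = heapq.heappop(self.heap)
            released.append((t, it))
        return released


def symmetric_window_join(events, window, k, match):
    """events: iterable of (stream_id, ts, value) in arrival order,
    with stream_id in {0, 1}. Each released tuple probes the window of
    the opposite stream and is then stored in its own stream's window."""
    buffer = KSlackBuffer(k)
    windows = ([], [])
    results = []

    def process(ts, item):
        sid, value = item
        other = windows[1 - sid]
        # Expire opposite-stream tuples that are too old to join.
        other[:] = [(t, v) for (t, v) in other if t >= ts - window]
        for t, v in other:
            if abs(ts - t) <= window and match(value, v):
                pair = ((ts, value), (t, v)) if sid == 0 else ((t, v), (ts, value))
                results.append(pair)
        windows[sid].append((ts, value))

    for sid, ts, value in events:
        for t, item in buffer.insert(ts, (sid, value)):
            process(t, item)
    for t, item in buffer.flush():
        process(t, item)
    return results


if __name__ == "__main__":
    # The third event (ts=3) arrives after a tuple with ts=4.
    arrivals = [(0, 1, "a"), (1, 4, "a"), (0, 3, "b"), (1, 6, "b")]
    for pair in symmetric_window_join(arrivals, window=5, k=2,
                                      match=lambda x, y: x == y):
        print(pair)
```

A tuple that arrives more than `k` time units late is released immediately and may find that some of its join partners have already expired, which is the kind of result loss the abstract refers to. A larger `k` recovers more results at the cost of more memory and latency, and that trade-off is what a quality-driven buffer size is meant to manage.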
Keywords/Search Tags: Quality-driven, Join processing, Out-of-order data streams, Storage consumption