
Research on Non-blocking Join Algorithms in Data Stream Environments

Posted on: 2009-03-03
Degree: Master
Type: Thesis
Country: China
Candidate: S C Li
Full Text: PDF
GTID: 2178360278964173
Subject: Computer software and theory
Abstract/Summary:
With the spread of the Internet, information technology is being applied in new fields such as sensor networks, online trading systems, and real-time monitoring of stock quotes. These applications require data to be processed batch by batch, with results returned to the user quickly, incrementally, and without blocking. Traditional database technology struggles to meet these requirements, so data stream query processing has become an active topic in database research.

The relational join algorithm is critical to data stream query performance. Its time complexity must keep pace with the arrival rate of the data stream, and its space complexity must fit within limited memory. To improve the performance of the join algorithm, the refresh strategy between memory and disk can be improved by taking advantage of the multi-level storage system.

Each join stream is given a memory partition, named M, and a disk partition, named D. The whole join procedure is divided into three stages: the MM stage, the MD stage, and the DD stage. The procedure switches among the three stages as the transmission rate of the data stream varies, so that the delay between the arrival of consecutive tuples can be put to use and a non-blocking join is achieved.

The key to improving the non-blocking join algorithm is to improve the efficiency of the memory join stage. When there is no space left for an incoming tuple, some old tuples have to be flushed from memory to disk, so a good refresh strategy contributes directly to join performance. The least frequently used tuples are identified from the result streams and flushed from memory to disk, so that the tuples that stay in memory generate more results. A Bloom filter data structure is used in the refresh strategy to meet the time and space complexity requirements.

After the new refresh strategy is applied, the performance of the join algorithm improves markedly, which broadens the adaptability of the data stream relational join algorithm.
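The memory join stage and the flush step described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The abstract does not say how the Bloom filter is used, so the sketch assumes a counting Bloom filter that approximates how often each join key has appeared in the results. The names `CountingBloom`, `StreamSide`, and `NonBlockingJoin` are hypothetical, the disk partition D is simulated with a list, and the MD and DD stages (which would later join the flushed tuples) are left out.

```python
import hashlib
from collections import defaultdict


class CountingBloom:
    """Approximate per-key counter; it can overestimate but never underestimate."""

    def __init__(self, size=1024, hashes=3):
        self.size = size
        self.hashes = hashes
        self.counts = [0] * size

    def _slots(self, key):
        # Derive several slot indexes from salted SHA-256 digests of the key.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for slot in self._slots(key):
            self.counts[slot] += 1

    def estimate(self, key):
        return min(self.counts[slot] for slot in self._slots(key))


class StreamSide:
    """One join stream: memory partition M and a stand-in for disk partition D."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.memory = defaultdict(list)  # M: join key -> payloads held in memory
        self.disk = []                   # D: flushed (key, payload) pairs
        self.size = 0                    # number of tuples currently in M


class NonBlockingJoin:
    def __init__(self, capacity=4):
        self.sides = {"R": StreamSide(capacity), "S": StreamSide(capacity)}
        self.hits = CountingBloom()  # assumed: tracks keys seen in the results

    def insert(self, side, key, payload):
        """Memory-to-memory step: probe the other stream's M, then store the tuple."""
        own = self.sides[side]
        other = self.sides["S" if side == "R" else "R"]
        results = []
        for other_payload in other.memory.get(key, []):
            self.hits.add(key)
            pair = (payload, other_payload) if side == "R" else (other_payload, payload)
            results.append((key, *pair))
        if own.size >= own.capacity:
            self._flush(own)
        own.memory[key].append(payload)
        own.size += 1
        return results

    def _flush(self, side):
        """Move the key with the fewest estimated result hits from M to D."""
        victim = min(side.memory, key=self.hits.estimate)
        evicted = side.memory.pop(victim)
        side.disk.extend((victim, p) for p in evicted)
        side.size -= len(evicted)
```

Calling `insert("R", key, payload)` or `insert("S", key, payload)` returns the join results available at that moment, so output is produced as tuples arrive. Flushing a whole key group at once is a simplification chosen for brevity; a per-tuple policy would follow the same pattern.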
Keywords/Search Tags: non-blocking relation join, memory refresh strategy, data mining, datastream