
Research On Cache Architecture For Real-Time Processing Of Streaming Data

Posted on: 2018-08-14
Degree: Master
Type: Thesis
Country: China
Candidate: S F Li
GTID: 2428330566451414
Subject: Computer application technology
Abstract/Summary:
With the advent of the big data era, the Internet, the Internet of Things, finance, and other fields produce massive amounts of streaming data. Processing these streams reliably is very challenging, because streaming data is high-speed, random, out of order, and unbounded. Meanwhile, more and more applications need to combine massive historical data with real-time analysis of streaming data and store the processing results for user queries. Besides timeliness, the stability and correctness of the stream processing system are also very important, so a cache processing architecture that ensures all three is urgently needed. With the timeliness, stability, and correctness of the processing system in mind, a multilevel cache model named HCache is proposed for real-time processing of streaming data. HCache consists of a hash-based online cache and a batch cache. The online cache stores the results of the same batch in contiguous storage space within the same bucket, and it uses a logically circular (ring) cache structure together with the results from the batch cache to eliminate expired data automatically; the hash-based structure is designed to improve the efficiency of data storage and access. The batch cache stores recently accessed results from the persistent database, and it uses an improved LRU (Least Recently Used) eviction strategy to eliminate expired cache results effectively, reducing memory usage and improving the hit rate. The results of the online cache are used to update the batch cache, and this update strategy ensures the consistency of results. A user query request arriving at HCache accesses both the online cache and the batch cache, and the two results are merged before being returned to the user. To verify the performance of HCache, it is tested against currently popular storage structures on a Twitter dataset. The experimental results show that HCache improves read and write performance significantly compared with Summingbird. HCache is also less affected by changes in the arrival rate of query requests, and it uses less memory.
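The two-level design described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation: the abstract gives no code, so every class and method name here (OnlineCache, BatchCache, HCache, write, flush, query) is hypothetical. The sketch also assumes that results are numeric counts merged by addition, that the persistent database can be stood in for by a plain dict, and that plain LRU can stand in for the "improved LRU" strategy, whose details the abstract does not state.

```python
from collections import OrderedDict


class OnlineCache:
    """Hash-based online cache (hypothetical sketch).

    Each key hashes to a bucket; the bucket is a fixed-size ring of
    per-batch slots, so a newer batch overwrites the slot of an expired one.
    """

    def __init__(self, ring_size=4):
        self.ring_size = ring_size  # assumed parameter, not from the thesis
        self.buckets = {}           # key -> list of (batch_id, value) or None

    def add(self, key, batch_id, value):
        ring = self.buckets.setdefault(key, [None] * self.ring_size)
        idx = batch_id % self.ring_size
        slot = ring[idx]
        if slot is not None and slot[0] == batch_id:
            ring[idx] = (batch_id, slot[1] + value)  # same batch: accumulate
        else:
            ring[idx] = (batch_id, value)            # older batch: overwrite

    def expire_through(self, batch_id):
        """Drop slots for batches already folded into the batch cache."""
        for ring in self.buckets.values():
            for i, slot in enumerate(ring):
                if slot is not None and slot[0] <= batch_id:
                    ring[i] = None

    def get(self, key):
        ring = self.buckets.get(key)
        if ring is None:
            return 0
        return sum(slot[1] for slot in ring if slot is not None)


class BatchCache:
    """LRU cache in front of a persistent store (a dict stands in for it)."""

    def __init__(self, store, capacity=2):
        self.store = store
        self.capacity = capacity
        self.entries = OrderedDict()  # least recently used entry comes first

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)
            return self.entries[key]
        value = self.store.get(key, 0)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return value

    def update(self, key, delta):
        """Fold an online result into the stored (and cached) value."""
        self.store[key] = self.store.get(key, 0) + delta
        if key in self.entries:
            self.entries[key] = self.store[key]
            self.entries.move_to_end(key)


class HCache:
    """Queries read both levels and merge the results (here: by addition)."""

    def __init__(self, store, ring_size=4, capacity=2):
        self.online = OnlineCache(ring_size)
        self.batch = BatchCache(store, capacity)

    def write(self, key, batch_id, value):
        self.online.add(key, batch_id, value)

    def flush(self, batch_id):
        """Update the batch cache from online results, then expire them."""
        for key, ring in self.online.buckets.items():
            delta = sum(s[1] for s in ring if s is not None and s[0] <= batch_id)
            if delta:
                self.batch.update(key, delta)
        self.online.expire_through(batch_id)

    def query(self, key):
        return self.batch.get(key) + self.online.get(key)
```

In this sketch, a query returns the same merged value before and after flush, because flush moves a batch's contribution from the online level to the batch level rather than duplicating it. That is one simple way to obtain the consistency property the abstract mentions, under the additive-merge assumption.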
Keywords/Search Tags:Streaming data, Real-time processing, Cache model, Merging mechanism, Replacing strategy