Research On Real-time Processing Technology For Big Data Of Nested Internet Of Things

Posted on: 2022-09-20
Degree: Master
Type: Thesis
Country: China
Candidate: S X Wu
Full Text: PDF
GTID: 2518306494471314
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, Internet of Things (IoT) technology and its applications have developed rapidly. Real-time processing of the massive data generated by IoT devices is important for improving data value density and for responding quickly to business events. An IoT deployment contains many terminal devices, and the data they send over a complex, unstable network may arrive out of order. Analysis and query results computed over out-of-order data can be wrong, which in turn affects business decisions. In addition, with the development of intelligent terminals, nested structures have gradually become a common format for IoT data, so storage and query methods for nested data that speed up retrieval deserve attention. Based on this analysis, this thesis studies continuous top-k queries and storage and query methods for nested data.

1. To address inaccurate continuous top-k query results on out-of-order streams, a continuous top-k query method for high-speed out-of-order streams is proposed. First, a self-adaptive cache duration algorithm is proposed to ease the trade-off between cache duration and query accuracy. The MinTopk query algorithm is then modified to work with the self-adaptive cache duration algorithm. Experimental results show that the method reduces the cache duration while maintaining a given top-k query accuracy. Compared with the MinTopk algorithm without a cache, average accuracy improves by nearly 50%.

2. IoT systems produce large amounts of nested data. To address how to organize nested data and how to improve its retrieval, this thesis proposes a nested data storage method based on the general-purpose distributed database HBase. By analyzing the retrieval requirements of nested data, the HBase table structure is designed and a secondary index is constructed. Experimental results show that the method retrieves data quickly while preserving the efficiency of real-time writes.

3. Based on the above research, a real-time processing system for nested IoT big data is designed and implemented, and the continuous top-k query module and the concrete implementation of data storage are described.
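To make the trade-off between cache duration and query accuracy in item 1 concrete, the following is a minimal Python sketch. It is not the thesis's algorithm: the abstract does not give the adaptation rule or the modified MinTopk procedure, so this sketch assumes a simple quantile-of-observed-lateness rule for the cache duration and a naive sliding-window top-k. The names `AdaptiveReorderBuffer`, `sliding_top_k`, `target_quantile`, and `history` are hypothetical.

```python
import heapq
from collections import deque


class AdaptiveReorderBuffer:
    """Caches events briefly and releases them in event-time order.

    Assumption: the cache duration ("slack") is set to a quantile of
    recently observed lateness. The thesis's actual rule is not given.
    """

    def __init__(self, target_quantile=0.95, history=100):
        self.target_quantile = target_quantile
        self.lateness = deque(maxlen=history)  # recent arrival delays
        self.heap = []                         # min-heap of (time, seq, score)
        self.max_seen = None                   # largest event time seen so far
        self.last_released = float("-inf")     # event time of last released event
        self.slack = 0                         # current cache duration
        self.dropped = 0                       # events that arrived too late
        self.seq = 0                           # tie-breaker for equal times

    def _update_slack(self):
        ordered = sorted(self.lateness)
        idx = min(len(ordered) - 1, int(self.target_quantile * len(ordered)))
        self.slack = ordered[idx]

    def _release(self, watermark):
        released = []
        while self.heap and self.heap[0][0] <= watermark:
            t, _, score = heapq.heappop(self.heap)
            self.last_released = t
            released.append((t, score))
        return released

    def push(self, event_time, score):
        """Accept one arrival; return events that are now safe to emit."""
        if self.max_seen is None or event_time > self.max_seen:
            self.max_seen = event_time
        self.lateness.append(self.max_seen - event_time)
        self._update_slack()
        if event_time < self.last_released:
            self.dropped += 1  # too late to emit in order; accuracy suffers
        else:
            heapq.heappush(self.heap, (event_time, self.seq, score))
            self.seq += 1
        return self._release(self.max_seen - self.slack)

    def flush(self):
        """Release everything still cached, in event-time order."""
        return self._release(float("inf"))


def sliding_top_k(events, k, window):
    """For each in-order event, yield (time, top-k scores in (time - window, time])."""
    live = deque()
    for t, score in events:
        live.append((t, score))
        while live and live[0][0] <= t - window:
            live.popleft()
        yield t, heapq.nlargest(k, (s for _, s in live))
```

Each arrival is passed to `push`, and whatever it returns is fed to `sliding_top_k`. A larger slack drops fewer late events but delays every result, which is the tension that an adaptive cache duration tries to balance.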
Keywords/Search Tags: real-time processing, out-of-order data stream, continuous top-k query, nested data