
Research Of Similarity Query Algorithms On Real-time Data Stream

Posted on: 2016-08-31
Degree: Master
Type: Thesis
Country: China
Candidate: Y Guo
Full Text: PDF
GTID: 2308330470975823
Subject: Computer software and theory
Abstract/Summary:
As research on data processing deepens and its scope broadens, researchers have found that a growing share of data must be processed as streams. Such data arrive at high speed, in large volume, and over long durations, and can be read only a limited number of times. Because a data stream is unbounded while system memory is limited, a query can only return an approximate answer within a given precision. Traditional query algorithms rely on the relational model, in which data are first stored in a database and then processed; this is no longer suitable for data streams, whose relations are not persistent. How to query data streams quickly, in real time, and with high similarity has therefore become increasingly important.

This thesis preprocesses the data with Haar wavelet analysis and then combines the advantages of a sliding window algorithm and a greedy algorithm to perform similarity queries on data streams. First, it analyzes the theory of the Haar wavelet. Provided that the total error stays within an acceptable constraint, sampling reduces the volume of the data stream and completes the preprocessing. Simulation results show that the proposed preprocessing algorithm compresses the data by a factor of 7 to 11 while the overall error remains stable at around 98%, which verifies the validity of the preprocessing. Second, a sliding window that is too long makes some data wait too long, whereas one that is too short makes the overall processing time too long; to resolve this conflict, the thesis restricts the step of the basic sliding window. Simulation results show that the improved algorithm can optimize the size of the sliding window. Third, the data processed by the Haar wavelet, sampling, and the sliding window are used to build a balanced binary tree, which is then searched with the greedy algorithm. Numerical simulation results show that the time and space complexity of the query are improved. Finally, the data stream similarity search algorithm was applied in a fault section locating system for small-current grounding faults, where it achieved rapid data transmission, quick location of the fault zone, and stable system operation.
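
To make the preprocessing idea concrete, the following is a minimal sketch of Haar-based compression applied to windows of a stream. It is not the thesis's algorithm: the function names (haar_forward, haar_inverse, compress, sliding_windows) are hypothetical, and keeping the largest-magnitude coefficients is an assumed stand-in for the error-constrained sampling step described above. The window size and step are likewise illustrative parameters.

```python
import math


def haar_forward(values):
    """Orthonormal Haar decomposition; len(values) must be a power of two."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = [float(v) for v in values]
    s = math.sqrt(2.0)
    length = n
    while length > 1:
        half = length // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / s for i in range(half)]
        diff = [(out[2 * i] - out[2 * i + 1]) / s for i in range(half)]
        out[:length] = avg + diff  # averages first, details after
        length = half
    return out


def haar_inverse(coeffs):
    """Invert haar_forward."""
    out = list(coeffs)
    s = math.sqrt(2.0)
    length = 1
    while length < len(out):
        merged = []
        for a, d in zip(out[:length], out[length:2 * length]):
            merged.append((a + d) / s)
            merged.append((a - d) / s)
        out[:2 * length] = merged
        length *= 2
    return out


def compress(values, keep):
    """Assumed strategy: zero all but the `keep` largest-magnitude coefficients."""
    coeffs = haar_forward(values)
    ranked = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(ranked[:keep])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]


def sliding_windows(stream, size, step):
    """Yield windows of `size` items, advancing by `step` (1 <= step <= size)."""
    buf = []
    for x in stream:
        buf.append(x)
        if len(buf) == size:
            yield list(buf)
            del buf[:step]  # a smaller step means more overlap between windows


if __name__ == "__main__":
    signal = [4, 4, 4, 4, 8, 8, 8, 8]
    approx = haar_inverse(compress(signal, keep=2))
    print([round(v, 6) for v in approx])
    for window in sliding_windows(range(8), size=4, step=2):
        print(window)
```

In this sketch the step parameter controls the trade-off mentioned in the abstract: a larger step processes fewer windows overall, while a smaller step lets each item be handled sooner after it arrives.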
Keywords/Search Tags: Data Stream, Wavelet Analysis, Sliding Window, Greedy Algorithm