
Research On Dynamic Drilling Sampling Method For Large-scale Streaming Data

Posted on: 2024-07-29
Degree: Master
Type: Thesis
Country: China
Candidate: P Zhang
Full Text: PDF
GTID: 2568307076992909
Subject: Software engineering
Abstract/Summary:
Streaming data changes in real time and at high speed, which makes it difficult to capture its feature distribution comprehensively. Sampling, a core method of big data analysis, helps to simplify data sets, reduce computation costs, avoid overfitting, and improve data interpretability. However, existing streaming data sampling methods often fail to retain much of the information and value carried by discrete values, making it difficult to collect the feature values of the original streaming dataset comprehensively. In addition, if the sample set contains a large number of discrete values, its accuracy in evaluating the feature distribution of the original streaming data is reduced. This study makes the following main contributions to address these issues.

First, to address the shortcomings of existing streaming data sampling methods in capturing the information and value of discrete values, and their inability to collect streaming data feature values comprehensively, this study proposes a dynamic streaming data drilling sampling method (SDDS) based on the limited-access streaming data drilling sampling method (SDLSA). This method takes the "well" as the unit of analysis, dynamically adjusts its size and position, and accurately predicts the position and range of discrete values. A sampling value evaluation model (SVEM) is also introduced to evaluate the effectiveness of the SDDS sampling method from three perspectives: sparse, dense, and holistic. The experimental results show that the sample set obtained through the SDDS sampling method achieves an evaluation accuracy of over 90% on the sparse and dense perspectives of SVEM, outperforming the SDLSA sampling method. However, when the sample set contains a large number of discrete values, its accuracy in evaluating the overall feature distribution of the original streaming dataset is relatively low.

Second, to address this low accuracy in evaluating the overall feature distribution of the original streaming data when the sample set contains a large number of discrete values, this study proposes a streaming data adaptive drilling sampling method (SDADS). This method builds on the SDDS sampling method and adaptively adjusts the sampling rates within the well. During sampling, all data in the current well is cached, and the data in the well is then adaptively resampled to keep the data distribution consistent before and after sampling. A sampling overall value evaluation model (SOVM) is also introduced to verify the effectiveness of the SDADS algorithm. The experimental results show that the sample set obtained by the SDADS sampling method is significantly more accurate under SOVM evaluation than that obtained by the SDDS sampling method, with an accuracy improvement of about 10%.

Finally, this study designed and implemented a dynamic sampling and evaluation system for streaming data. Built on the sampling methods and evaluation models above, the system performs real-time dynamic sampling and evaluation of large-scale streaming data. Its main functions include dynamic sampling parameter configuration, real-time display of dynamically sampled data, evaluation of dynamic sampling value, and sample library management. These functions display the real-time dynamic sampling and value evaluation results of large-scale streaming data, helping users understand its feature distribution.

In summary, this study proposes a dynamic drilling sampling method and evaluation model, as well as an adaptive sampling method and evaluation model, for large-scale streaming data. Together they address two problems: streaming data sampling methods easily lose a large number of valuable discrete values, and the sample set struggles to fully represent the feature distribution of the original streaming data. In addition, a dynamic sampling and evaluation system for streaming data was designed and implemented, and the effectiveness of the proposed sampling methods and evaluation models was verified through simulation. The methods and models proposed in this study have theoretical and practical value in the field of big data analysis.
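The adaptive resampling step described for SDADS (cache the data in the current well, then resample it so the distribution stays consistent) can be illustrated with a minimal sketch. This is not the thesis's algorithm, whose details the abstract does not give. It assumes a well is simply a cached list of numeric values, and it swaps in ordinary histogram-stratified sampling as a stand-in. The function name `resample_well` and the parameters `rate`, `n_bins`, and `seed` are hypothetical.

```python
import random
from collections import defaultdict


def resample_well(well, rate, n_bins=10, seed=0):
    """Resample one cached window ("well") of numeric values.

    Assumption: values are grouped into equal-width bins and each bin is
    sampled at the same rate, so the sample's histogram roughly follows the
    well's histogram. Every non-empty bin keeps at least one value, so
    sparse regions are not dropped entirely.
    """
    if not 0 < rate <= 1:
        raise ValueError("rate must be in (0, 1]")
    if not well:
        return []
    rng = random.Random(seed)
    lo, hi = min(well), max(well)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    bins = defaultdict(list)
    for x in well:
        idx = min(int((x - lo) / width), n_bins - 1)  # clamp the maximum value
        bins[idx].append(x)
    sample = []
    for idx in sorted(bins):
        members = bins[idx]
        k = max(1, round(len(members) * rate))  # proportional share, at least 1
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical well: 90 dense values near 0 and 10 sparse values near 100.
well = [i / 100 for i in range(90)] + [99 + i / 10 for i in range(10)]
sample = resample_well(well, rate=0.2)
print(len(sample), sum(1 for x in sample if x >= 99))
```

With a rate of 0.2, this example keeps 18 values from the dense region and 2 from the sparse cluster, preserving the 9:1 ratio of the cached well. A real adaptive method would presumably tune the per-region rates rather than fix a single rate, which this sketch does not attempt.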
Keywords/Search Tags: Streaming data, Characteristic distribution, Dynamic drilling, Sampling, Evaluation