Design And Implementation Of Stream Data Platform Supporting Data Validity

Posted on: 2024-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: R S Tang
Full Text: PDF
GTID: 2568306944467594
Subject: Computer technology
Abstract/Summary:
In recent years, stream data management has become a core requirement for big data processing. Stream data refers to a sequence of data that arrives continuously, rapidly, and at large scale; the sensor data collected by intelligent terminals and gateways is a typical example. The difficulty of managing it effectively lies in monitoring scenarios: the data that a data management platform receives from terminals and gateways typically contains a large amount of noise, and traditional big data platforms are not suited to monitoring stream data. Against this background, this thesis designs and implements a stream data management platform that supports data validity management. The validity of stream data is reflected in format validity, source behavior validity, and value validity. The system ensures data validity while implementing data access and distribution: it integrates and aggregates stream data and distributes it to other big data analysis platforms. Considering the scale of stream data and the low learning efficiency of traditional data cleaning algorithms, this thesis proposes a data repair method based on incremental generative adversarial networks (Continuous Learning Generative Adversarial Networks, CL-GAN). The algorithm continuously learns data features through the generator and discriminator of the GAN, and the replay mechanism of CL-GAN is embedded in the feature extraction layer, which reduces computational complexity and increases iteration efficiency. In experiments, CL-GAN outperforms baseline methods such as MemRe-GAN, EWC, and replay GAN in terms of model performance. The thesis first analyzes the background of stream data management platforms and surveys the current state of related products and research in the industry, then presents the CL-GAN algorithm and its experimental results in detail. It next explains the overall architecture and sub-module design of the system, followed by a description of the system implementation. Finally, a series of simulations verifies the effectiveness of the platform.
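The three kinds of validity named in the abstract (format, source behavior, and value) can be illustrated with a minimal Python sketch. It is not the platform's implementation: the record fields (`source_id`, `timestamp`, `value`), the class name `ValidityChecker`, the minimum reporting interval, and the value range are all hypothetical assumptions chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


def _is_number(x: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(x, (int, float)) and not isinstance(x, bool)


@dataclass
class ValidityChecker:
    # Illustrative thresholds (assumptions, not values from the thesis).
    min_interval_s: float = 1.0                      # fastest plausible reporting rate
    value_range: Tuple[float, float] = (-40.0, 85.0)  # plausible sensor range
    _last_seen: Dict[str, float] = field(default_factory=dict)

    def check(self, record: Dict[str, Any]) -> List[str]:
        """Return the names of the validity classes that the record violates."""
        # Format validity: required fields exist and have the expected types.
        if not (isinstance(record.get("source_id"), str)
                and _is_number(record.get("timestamp"))
                and _is_number(record.get("value"))):
            return ["format"]

        problems: List[str] = []
        src, ts, val = record["source_id"], record["timestamp"], record["value"]

        # Source behavior validity: a source should not report faster than
        # the assumed minimum interval since its last accepted record.
        last = self._last_seen.get(src)
        if last is not None and ts - last < self.min_interval_s:
            problems.append("source_behavior")
        else:
            self._last_seen[src] = ts

        # Value validity: the reading must fall inside the assumed range.
        lo, hi = self.value_range
        if not lo <= val <= hi:
            problems.append("value")
        return problems
```

In this sketch a record that fails the format check is rejected immediately, since the other two checks cannot be evaluated without well-typed fields. Records flagged only for value validity would be the natural candidates for a repair step such as the CL-GAN method described above.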
Keywords/Search Tags: Data Repair, Neural Network, Data Validity, GAN