
Research On Adaptive Processing And Saturation Mechanism Of Distributed RDF Data Stream

Posted on: 2022-07-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Han
Full Text: PDF
GTID: 2518306317477704
Subject: Computer Science and Technology
Abstract/Summary:
At present, the demand for real-time processing of big data has driven the development of distributed stream processing frameworks for clusters. To provide fault tolerance and efficient large-scale stream processing, current frameworks treat a streaming workload as a series of batch jobs over small batches of stream data. In distributed RDF stream processing, variable operating conditions such as the data ingest rate and workload characteristics strongly affect the stability of the streaming system, yet existing solutions have not thoroughly explored stability under such conditions. Meanwhile, although there are solutions that use big data platforms to reason over RDF streams as data is continuously generated, solutions for saturating dynamic RDF data are still relatively scarce. This thesis therefore studies adaptive processing of RDF stream data in a distributed environment and the saturation mechanism used during reasoning. The main work of this thesis is as follows:

(1) To keep the stream processing framework stable under variable operating conditions, this thesis first explores the impact of the batch interval on streaming workloads and, on that basis, determines the goal of dynamic batch adjustment. Then, on top of a system model for dynamic batch adjustment, it proposes a dynamic feedback processing algorithm that automatically adjusts the batch interval to the current conditions, so that the system can process streams adaptively as conditions vary.

(2) In using a big data platform to reason over RDF data streams, this thesis focuses on the saturation operation. Specifically, it relies on a Spark cluster to saturate large RDF data streams, inferring the implicit RDF triples entailed under given RDF schema constraints, and identifies the fragments of the existing, already saturated RDF dataset that must be considered when the stream delivers new RDF statements. The incremental stream saturation algorithm designed in this thesis ensures the completeness of the saturation process. In addition, this thesis uses indexing during stream saturation to improve the efficiency of incremental saturation.

Experimental analysis shows that the dynamic feedback method in this thesis adjusts effectively and remains stable under variable operating conditions. The incremental saturation scheme is also more efficient than the existing method.
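
The abstract does not spell out the dynamic feedback algorithm, so the following is only a minimal sketch of the general idea of feedback-driven batch interval adjustment, not the thesis's method. It assumes a simple proportional controller that steers the batch interval toward the point where the measured processing time is a fixed fraction of the interval. The class name `BatchIntervalController` and the parameters `target_ratio`, `gain`, `min_interval`, and `max_interval` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BatchIntervalController:
    """Hypothetical proportional controller for the batch interval (seconds)."""

    interval: float              # current batch interval
    target_ratio: float = 0.8    # assumed goal: processing_time / interval
    gain: float = 0.5            # assumed step size, 0 < gain <= 1
    min_interval: float = 0.1    # assumed lower bound
    max_interval: float = 10.0   # assumed upper bound

    def update(self, processing_time: float) -> float:
        """Feed back the last batch's processing time; return the new interval."""
        # Interval at which the observed processing time would hit the target ratio.
        desired = processing_time / self.target_ratio
        # Move only part of the way there to damp oscillation.
        proposed = self.interval + self.gain * (desired - self.interval)
        # Keep the interval inside the configured bounds.
        self.interval = min(self.max_interval, max(self.min_interval, proposed))
        return self.interval


if __name__ == "__main__":
    controller = BatchIntervalController(interval=1.0)
    # Made-up workload model for illustration: a fixed per-batch overhead
    # plus a cost proportional to the interval (more data per batch).
    for _ in range(10):
        processing_time = 0.4 + 0.3 * controller.interval
        print(round(controller.update(processing_time), 3))
```

A batch that takes longer than the target share of its interval pushes the interval up, and a batch that finishes early pulls it down. A real system would feed in processing times measured by the streaming framework rather than a synthetic model.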
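
Incremental saturation with an index can be illustrated with a small single-process sketch. This is not the Spark-based implementation described in the thesis. It assumes a fixed schema, objects that are IRIs rather than literals, and only four standard RDFS entailment rules (rdfs2 domain, rdfs3 range, rdfs7 subPropertyOf, rdfs9 subClassOf). The class `IncrementalSaturator` and its method names are hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"


class IncrementalSaturator:
    """Hypothetical in-memory incremental RDFS saturation over a fixed schema."""

    def __init__(self, sub_class_of, sub_property_of, domain, range_):
        # Schema indexes: each maps a class or property to its directly
        # related terms, so each rule application is a dictionary lookup.
        self.sub_class_of = self._index(sub_class_of)
        self.sub_property_of = self._index(sub_property_of)
        self.domain = self._index(domain)
        self.range = self._index(range_)
        # All explicit and inferred instance triples seen so far.
        self.saturated = set()

    @staticmethod
    def _index(pairs):
        index = defaultdict(set)
        for key, value in pairs:
            index[key].add(value)
        return index

    def _consequences(self, triple):
        """Yield the triples directly entailed by one triple and the schema."""
        s, p, o = triple
        for sup in self.sub_property_of.get(p, ()):   # rdfs7
            yield (s, sup, o)
        for cls in self.domain.get(p, ()):            # rdfs2
            yield (s, RDF_TYPE, cls)
        for cls in self.range.get(p, ()):             # rdfs3 (o assumed an IRI)
            yield (o, RDF_TYPE, cls)
        if p == RDF_TYPE:
            for sup in self.sub_class_of.get(o, ()):  # rdfs9
                yield (s, RDF_TYPE, sup)

    def add_batch(self, triples):
        """Saturate a batch of new triples; return only the triples added."""
        added = set()
        worklist = list(triples)
        while worklist:
            triple = worklist.pop()
            if triple in self.saturated:
                continue  # already known, so its consequences are too
            self.saturated.add(triple)
            added.add(triple)
            worklist.extend(self._consequences(triple))
        return added


if __name__ == "__main__":
    saturator = IncrementalSaturator(
        sub_class_of=[("Student", "Person"), ("Person", "Agent")],
        sub_property_of=[("advises", "knows")],
        domain=[("advises", "Professor")],
        range_=[("advises", "Student")],
    )
    for t in sorted(saturator.add_batch([("alice", "advises", "bob")])):
        print(t)
```

Because previously saturated triples are skipped, each batch only pays for the triples it actually adds. The worklist re-applies the rules to inferred triples, which handles chains such as `Student` ⊑ `Person` ⊑ `Agent` without precomputing the schema closure.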
Keywords/Search Tags: RDF Stream, Dynamic Feedback, Adaptive Stream Processing, Incremental Stream Saturation