Research on Key Technologies of Cloud Data Processing Based on Uncertainty Theory

Posted on: 2017-01-09
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Y Guo
Full Text: PDF
GTID: 1108330485450026
Subject: Computer application technology
Abstract/Summary:
In January 2016, RightScale surveyed more than 1,000 enterprise users about their use of public, private, and hybrid clouds. The survey report shows that 95% of respondents use cloud services. In the real world, uncertainty is widespread across many phenomena. In a cloud computing environment, the migration and scheduling of data and virtual machines in a cloud data center are likewise subject to uncertainty.

Much work has already been done on processing uncertain data, but it focuses mainly on the uncertainty of entity data and does not adequately cover some practical problems. For uncertainty in the relations among entities, the literature has applied probability and fuzzy theories to nearest-neighbor query processing. The relations among entities, however, also exhibit subjective uncertainty, which is neither random nor fuzzy.

In practice, historical data are unavailable for many problems, so probability theory cannot be used to estimate the frequency with which an event occurs. In such cases, the belief degree that an event occurs must be assessed from expert knowledge, and the variance of such belief degrees is far greater than that of frequencies. To address the subjective uncertainty of cloud data, this work therefore adopts uncertainty theory to study cloud data processing techniques.

This dissertation studies key technologies for query processing and optimization of cloud data. Drawing on existing research in uncertainty theory, it abstracts the cloud data center as an uncertain graph. Because of heterogeneity, privacy protection, missing data, inaccurate data, and similar factors, the data in this graph are themselves uncertain.
Using a path query algorithm for the uncertain graph, the dissertation discusses query processing and optimization of cloud data in depth. Its main work and contributions are summarized as follows.

(1) A security protection framework for cloud data is presented. It consists of hierarchical modules for physical security, virtual network security, cloud operating system security, virtual cluster security, data security, SaaS/PaaS/IaaS security, security management, operation and maintenance security, and so on. Compared with traditional security, the framework shares the same security objectives, types of system resources, and basic security technologies, but it also has security issues of its own, chiefly virtualization security and issues related to the rental-based service model of cloud computing. The framework also offers better security and protection with respect to virtualization security, data security, privacy protection, and related areas.

(2) A risk analysis method based on an uncertain random fault tree is proposed for the cloud data security protection framework. Built on uncertainty theory and chance theory, the method describes how to construct and analyze a fault tree composed of the logical relations among bottom events. If the failure rate of a bottom event can be obtained from historical data, it is expressed as a random variable; if no statistical data exist and the rate can only be obtained from the subjective judgment of experts, it is expressed as an uncertain variable. The chance that an event occurs is then an uncertain random variable. A hybrid simulation algorithm is constructed to calculate the chance that the top event occurs.
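As an illustration of the kind of calculation contribution (2) describes, the following is a minimal sketch and not the dissertation's algorithm. It assumes Boolean bottom events that are mutually independent, the min-based formula that uncertainty theory gives for Boolean functions of independent uncertain variables, and a chance value computed as the probability-weighted average of uncertain measures. The function names and the example tree (`top_event`) are hypothetical.

```python
import itertools
import random


def uncertain_measure(structure, random_states, beliefs):
    """Uncertain measure that the top event occurs, with the random bottom
    events fixed at random_states. beliefs[i] is the expert belief degree
    that uncertain bottom event i occurs (assumed independent)."""
    sup_occurs, sup_not = 0.0, 0.0
    for states in itertools.product((0, 1), repeat=len(beliefs)):
        # Belief degree of a joint state: minimum over the bottom events.
        nu = min((b if s else 1.0 - b for s, b in zip(states, beliefs)),
                 default=1.0)
        if structure(random_states, states):
            sup_occurs = max(sup_occurs, nu)
        else:
            sup_not = max(sup_not, nu)
    return sup_occurs if sup_occurs < 0.5 else 1.0 - sup_not


def chance_exact(structure, probs, beliefs):
    """Chance of the top event by enumerating all random bottom-event states."""
    total = 0.0
    for states in itertools.product((0, 1), repeat=len(probs)):
        weight = 1.0
        for s, p in zip(states, probs):
            weight *= p if s else 1.0 - p
        total += weight * uncertain_measure(structure, states, beliefs)
    return total


def chance_simulated(structure, probs, beliefs, trials=20000, seed=0):
    """Hybrid estimate: Monte Carlo sampling for the random bottom events,
    exact uncertain measure for the uncertain ones."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        states = tuple(1 if rng.random() < p else 0 for p in probs)
        total += uncertain_measure(structure, states, beliefs)
    return total / trials


def top_event(r, u):
    # Hypothetical tree: top = r[0] OR (u[0] AND u[1]).
    return bool(r[0] or (u[0] and u[1]))


exact = chance_exact(top_event, [0.1], [0.3, 0.4])
estimate = chance_simulated(top_event, [0.1], [0.3, 0.4])
```

Here `r[0]` stands for a bottom event whose failure probability (0.1) comes from historical data, while `u[0]` and `u[1]` carry expert belief degrees (0.3 and 0.4). Enumeration is only practical for small trees; sampling the random part is what keeps larger trees tractable.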
The fault tree analysis method is then used to conduct a risk analysis of the cloud data security protection framework.

(3) A credible nearest-neighbor query method for uncertain networks is proposed. The study proposes four algorithms: the calculating method of credible distance (CMCD), the calculation of reachable path length (CMFP), the calculation of desired reachable path length (CMDLFP), and the k-nearest-neighbor query under the credible condition (QMCCK). It models the uncertain network as an uncertain weighted graph; defines the sample graph of the uncertain graph, the index of the sample graph, the basic network, the reachable path length, and the desired reachable path length; and, based on uncertainty theory, provides an effective credible nearest-neighbor query algorithm under uncertain conditions. A nearest-neighbor query on the uncertain network can be equivalently transformed into a nearest-neighbor query on the basic network. The credible nearest-neighbor query algorithm can therefore solve the nearest-neighbor query problem in an uncertain network environment from the perspective of uncertainty theory.

(4) A Top-k query method for uncertain data based on uncertainty theory is proposed. It models the tuples of an uncertain data set as an uncertain network; equivalently transforms the Top-k query over ordered tuples into the uncertain measure relations of the corresponding edges in the sample graph; and classifies sample graphs according to the ranking positions covered by their edges. The algorithm avoids computing the uncertain measure of the ranking for every tuple in the sample graph, which improves the efficiency of Top-k query computation on uncertain data.
Moreover, it equivalently transforms a Top-k query based on a parameterized ranking function over uncertain data into limited queries with different Top-k values, and the system is implemented using the Spark Map-Reduce programming framework.
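The reduction mentioned in contribution (3), from a nearest-neighbor query on an uncertain network to one on a deterministic network, can be illustrated with a small sketch. It is not the CMCD or QMCCK algorithm from the dissertation. It assumes that each edge weight is a linear uncertain variable given by a (low, high) pair, that the deterministic network is obtained by fixing every edge at its inverse uncertainty distribution value for a confidence level `alpha`, and that the query is then answered with Dijkstra's algorithm. All names and the example edges are hypothetical.

```python
import heapq


def alpha_weight(bounds, alpha):
    """Inverse uncertainty distribution of a linear uncertain variable
    on (low, high), evaluated at confidence level alpha."""
    low, high = bounds
    return (1.0 - alpha) * low + alpha * high


def deterministic_network(uncertain_edges, alpha):
    """Undirected deterministic graph with every edge fixed at alpha."""
    graph = {}
    for (u, v), bounds in uncertain_edges.items():
        w = alpha_weight(bounds, alpha)
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    return graph


def k_nearest(graph, source, k):
    """Dijkstra's algorithm, stopped once k nodes other than the source
    have been settled. Returns (node, distance) pairs, nearest first."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    done = set()
    result = []
    while heap and len(result) < k:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u != source:
            result.append((u, d))
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return result


# Hypothetical uncertain network: (low, high) bounds per edge.
EDGES = {
    ("s", "a"): (1.0, 3.0),
    ("s", "b"): (2.0, 4.0),
    ("a", "c"): (1.0, 2.0),
    ("b", "c"): (1.0, 5.0),
}
neighbors = k_nearest(deterministic_network(EDGES, alpha=0.5), "s", k=2)
```

Raising `alpha` makes every edge weight more conservative, so the same query can be repeated at different confidence levels to see how stable the neighbor set is.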
Keywords/Search Tags: Uncertainty theory, Cloud data, Uncertain networks, Sample graph index, Trusted distance