Research On The Technology Of Label Cube

Posted on: 2013-05-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P X Ban
Full Text: PDF
GTID: 1228330392455399
Subject: Computer software and theory
Abstract/Summary:
Online analytical processing (OLAP) and skyline queries are two important technologies for analyzing massive data in support of decision-making. Data security and fast response are two important requirements for both. Cube technology is one way to achieve fast response: a cube is the collection of all possible results of a certain type of query, so a precomputed and materialized cube can answer a query in real time without further computation. A security database is one way to address the security problem: it uses labels to identify the sensitivity of users and data, and it provides a comprehensive security protection mechanism. It is therefore of great significance to study cube security, together with key issues such as computation, update, and storage, in a label security database environment.

The concept of a label skyline cube is proposed to address fast response to skyline queries. In a label security database, all users and tuples are marked with security labels, and a user can read a tuple only when his or her label dominates the tuple's label. As a result, the set of data available to a user, i.e. the set of objects to be processed, differs from label to label, and so do the processing results. This is a significant difference from the conventional environment. A label skyline cube is the collection of the skyline query results of all users with different labels, and it can respond directly to queries from users with different labels. In addition to inspecting data from a multidimensional perspective, a label security database also requires inspecting data from a multi-user point of view, since the data cubes of different users also differ. To this end, the label data cube, which is the collection of all data cubes corresponding to users with different labels, is proposed. Based on the label skyline cube and label data cube concepts, a label cube model is further proposed. On the one hand, it can generate a cube protected by the security database; on the other hand, it can establish the security label mapping between the tuples of the source table and the cube. It can therefore reasonably reflect the security settings of the source table in the cube without disclosing the confidential information of the source table.

To improve the computational efficiency of the label skyline cube, an efficient algorithm called SLSC is proposed. Existing skyline cube algorithms do not consider label features, and efficiency is unsatisfactory if algorithms designed for a single skyline are used to compute each skyline in the cube independently. The SLSC algorithm is based on a sharing strategy: it uses the containment relationship between point sets derived from label domination and shares computation through iteration. Theoretical analysis and experimental results show that the algorithm improves efficiency, especially as the number of label levels increases.

To improve the update efficiency of the label skyline cube, the incremental update algorithms BUI and BUD are proposed. A label skyline cube contains a large amount of data, so updating it is complicated, and recomputing the entire cube is costly when only a small amount of source data changes. Based on the characteristics of the label skyline cube, BUI and BUD traverse the affected labels bottom-up and incrementally update the label skyline cube from the existing cube. Theoretical analysis and experimental results show that the algorithms significantly improve update efficiency.

Efficient storage is a critical issue for the label data cube. To effectively reduce its storage size, a compact storage scheme is proposed. Because real-life data is often sparse, the containment relationship between point sets derived from the domination relationship between labels leads to redundancy among the data in the cubes of different labels. In particular, there may be multiple tuples that are identical except for their labels, and whose labels form a domination chain. In this case, the compact storage scheme retains only one tuple and deletes the rest, thereby reducing the storage size of the label cube. Meanwhile, owing to the rational design of the deletion table and of the compact processing sequence, recovery is easy and secure.
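
To make the label skyline cube concept from the abstract concrete, the following is a minimal Python sketch, not the thesis's SLSC algorithm. It assumes a simplified label model in which labels are totally ordered integer levels and a user label dominates a tuple label when it is greater than or equal to it; it also assumes that smaller attribute values are better. All function names (`label_dominates`, `point_dominates`, `skyline`, `label_skyline_cube`) and the sample data are hypothetical. The sketch computes only the full-space skyline for each label, independently and naively, which is the kind of per-label recomputation the abstract describes as inefficient; dimension subspaces, which skyline cubes in the literature usually also cover, are left out for brevity.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]
LabeledTuple = Tuple[int, Point]  # (security level, attribute values)


def label_dominates(user_level: int, tuple_level: int) -> bool:
    """Assumed label model: totally ordered levels, higher level reads lower."""
    return user_level >= tuple_level


def point_dominates(p: Point, q: Point) -> bool:
    """p dominates q if p is no worse in every dimension and strictly
    better in at least one (assumption: smaller values are better)."""
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def skyline(points: Sequence[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep the points no other point dominates."""
    return [p for p in points if not any(point_dominates(q, p) for q in points)]


def label_skyline_cube(
    data: Sequence[LabeledTuple], levels: Sequence[int]
) -> Dict[int, List[Point]]:
    """Map each user level to the skyline of the tuples that level can read."""
    cube: Dict[int, List[Point]] = {}
    for level in levels:
        visible = [pt for lvl, pt in data if label_dominates(level, lvl)]
        cube[level] = skyline(visible)
    return cube


# Hypothetical sample data: (level, (attribute_1, attribute_2)).
example_data: List[LabeledTuple] = [
    (0, (1.0, 4.0)),
    (0, (3.0, 3.0)),
    (1, (2.0, 2.0)),
    (2, (1.0, 1.0)),
]

for level, result in label_skyline_cube(example_data, [0, 1, 2]).items():
    print(level, result)
```

With this sample data, a level-0 user sees the skyline `(1.0, 4.0)`, `(3.0, 3.0)`, a level-1 user sees `(1.0, 4.0)`, `(2.0, 2.0)`, and a level-2 user sees only `(1.0, 1.0)`, which illustrates why the query result depends on the user's label. Because the visible set of a lower level is contained in that of any dominating level, much of this work is repeated across levels; that containment is what the abstract says SLSC exploits.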
Keywords/Search Tags: online analytical processing, data cube, skyline query, security database, label cube