Key Set Validation Of Generalized Entity Integrity Over Incomplete Relations

Posted on: 2020-05-13 | Degree: Master | Type: Thesis
Country: China | Candidate: Z X Zhang
GTID: 2428330599956794 | Subject: Software engineering
Abstract/Summary:
With the advent of the big data era, the volume of data that computers must process has grown geometrically. Now that artificial intelligence is prevalent, algorithms depend on the underlying data, so data management and storage play an increasingly important role. As the main data management software, database systems provide basic and essential support for the development of computer information technology. Database researchers have proposed many measures that give data storage and management a theoretical foundation, notably the entity integrity rule proposed by Codd [19].

Codd's rule of entity integrity stipulates the existence of a primary key over every database table. That is, uniqueness and absence of null markers are enforced on the columns of the primary key. However, real databases often contain a large number of null values, and Codd's entity integrity rule is not suitable in this case. To address this problem, Thalheim proposed key sets [8]. Key sets stipulate a generalized entity integrity rule that can be achieved on data sets where primary keys do not exist. Indeed, a key set means that different pairs of rows can be distinguished by unique non-null values on potentially different elements of the key set. While primary keys are a core feature of SQL databases, key sets have received little research attention. This thesis focuses on the following three aspects.

(1) First, we focus on key set validation in SQL. One of our goals is to motivate the actual use of key sets in database systems. The use of key sets depends at least on the ability to identify those key sets that are meaningful in a given application domain, and to efficiently validate such key sets during the lifetime of the database. For this purpose, we analyze experimentally, for the first time, the performance of validating key sets in SQL.

(2) Second, we design an efficient algorithm to speed up key set validation and evaluate it experimentally. The novel algorithm validates a key set on an incomplete relation, and our experiments show that it performs much better than a brute-force algorithm. We also use the algorithm to investigate how the cardinality of a key set affects its satisfaction in real-world data sets.

(3) Finally, we investigate Armstrong relations for the implication of unary key sets by arbitrary key sets. We conduct experiments that provide insight into the time and size required to generate such Armstrong relations. Armstrong relations provide computational support for identifying key sets that are meaningful for a given application domain.
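
To make the notion of key set satisfaction concrete, the following is a minimal Python sketch of a brute-force check. It is not the thesis's algorithm or its SQL formulation. It assumes one reading of the definition in the abstract: a key set is satisfied when every pair of distinct rows is separated by at least one key in the set, on which both rows are non-null in every attribute and differ in at least one. The names `distinguishes` and `satisfies_key_set`, the use of `None` as the null marker, and the sample rows are all illustrative assumptions.

```python
from itertools import combinations


def distinguishes(key, row_a, row_b):
    """Return True if both rows are non-null on every attribute of `key`
    and differ on at least one of those attributes."""
    both_total = all(
        row_a[attr] is not None and row_b[attr] is not None for attr in key
    )
    return both_total and any(row_a[attr] != row_b[attr] for attr in key)


def satisfies_key_set(rows, key_set):
    """Brute-force check: every pair of distinct rows must be separated
    by some key in `key_set`. Cost grows quadratically with len(rows)."""
    for row_a, row_b in combinations(rows, 2):
        if not any(distinguishes(key, row_a, row_b) for key in key_set):
            return False
    return True


# Hypothetical incomplete relation; None plays the role of the null marker.
rows = [
    {"name": "Ann", "email": None, "phone": "111"},
    {"name": "Ann", "email": "a@x.org", "phone": "222"},
    {"name": "Bob", "email": "b@x.org", "phone": None},
]

# No single column can serve as a primary key here: `name` has a duplicate,
# while `email` and `phone` each contain a null marker.
print(satisfies_key_set(rows, [("name",), ("email",), ("phone",)]))  # True
print(satisfies_key_set(rows, [("email",), ("phone",)]))  # False
```

In the second call, the first and third rows share no key on which both are non-null, so the smaller key set is violated. A pairwise scan of this kind corresponds only loosely to the brute-force baseline mentioned in the abstract.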
Keywords/Search Tags: Primary Key, Key Set, Armstrong Relations, Incomplete Relations