Storing Large Scale Semi-structured And Unstructured Data On RDBMS

Posted on: 2014-12-13
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhang
Full Text: PDF
GTID: 2268330422964766
Subject: Computer technology
Abstract/Summary:
With the rapid development of information technology, a wide variety of non-relational data continues to emerge. Such data places new demands on database storage. On the one hand, the traditional relational model is not fully suited to it; on the other hand, the newer databases are of uneven quality and still maturing. Traditional relational databases, however, are mature and have a broad user base, so companies need a reasonable way to store non-relational data in them.

By analyzing the characteristics of non-relational KEY/VALUE data, together with the principles of column-oriented databases and the row-oriented storage of traditional relational databases, our system stores non-relational data in a relational database in a reasonable manner. We extract the information in the KEY/VALUE data, split the VALUE, and create a table for every property. The system also extracts and decomposes the information in incoming SQL statements, so that the storage layout is transparent to users. Because the VALUE often contains many string-typed attributes, it is better to replace each such string with an integer and store only the integer in the database. This has two advantages: first, duplicate strings are removed, which reduces the space the data occupies; second, the database can process string attribute columns faster. The replaced strings are stored in an external file, and a dictionary and an index file make the replacement fast.

Different data sets were imported into the storage system for a functional test and a performance test. The results show that the system correctly replaces the input information, and that queries on the converted data are faster than queries on the original data.
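The two ideas in the abstract, one table per property and integer codes in place of strings, can be illustrated with a minimal sketch. It is not the thesis implementation: the names `StringDictionary`, `store`, `lookup`, `find_keys`, and the `prop_` table prefix are hypothetical, SQLite stands in for the relational database, and the dictionary is kept in memory rather than in the external dictionary and index files the abstract mentions. Property names are assumed to be simple identifiers.

```python
import sqlite3


class StringDictionary:
    """Hypothetical in-memory dictionary: each distinct string gets one integer code."""

    def __init__(self):
        self.code_of = {}    # string -> integer code
        self.string_of = []  # integer code -> string

    def encode(self, text):
        code = self.code_of.get(text)
        if code is None:
            code = len(self.string_of)  # next unused code
            self.code_of[text] = code
            self.string_of.append(text)
        return code

    def decode(self, code):
        return self.string_of[code]


def store(conn, dictionary, string_props, records):
    """Split each VALUE into properties and keep one (k, v) table per property."""
    for key, value in records.items():
        for prop, item in value.items():
            table = f'"prop_{prop}"'
            conn.execute(
                f"CREATE TABLE IF NOT EXISTS {table} (k TEXT PRIMARY KEY, v INTEGER)"
            )
            if isinstance(item, str):
                string_props.add(prop)         # remember which columns are encoded
                item = dictionary.encode(item)  # store the integer, not the string
            conn.execute(f"INSERT OR REPLACE INTO {table} VALUES (?, ?)", (key, item))


def lookup(conn, dictionary, string_props, prop, key):
    """Return the original value of one property, decoding it if it was a string."""
    row = conn.execute(f'SELECT v FROM "prop_{prop}" WHERE k = ?', (key,)).fetchone()
    if row is None:
        return None
    return dictionary.decode(row[0]) if prop in string_props else row[0]


def find_keys(conn, dictionary, prop, text):
    """Find keys whose string property equals `text` by comparing integer codes."""
    code = dictionary.code_of.get(text)
    if code is None:  # string never stored, so no row can match
        return []
    rows = conn.execute(f'SELECT k FROM "prop_{prop}" WHERE v = ? ORDER BY k', (code,))
    return [k for (k,) in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    dictionary, string_props = StringDictionary(), set()
    store(conn, dictionary, string_props, {
        "u1": {"city": "Wuhan", "age": 30},
        "u2": {"city": "Wuhan", "age": 25},
    })
    print(find_keys(conn, dictionary, "city", "Wuhan"))
    print(lookup(conn, dictionary, string_props, "age", "u2"))
```

In this sketch the repeated string is stored once in the dictionary, and the equality query translates the literal to its code once and then compares integers, which is the kind of rewriting the system would have to hide from users.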
Keywords/Search Tags: Relational Database, Non-relational data, Column-Oriented Database, Dictionary, Data Storage Model, Index