
Scalability of commercial database management systems as RDF stores

Posted on: 2013-07-11
Degree: M.S.
Type: Thesis
University: Purdue University
Candidate: Atal, Nikita Shyamsunder
Full Text: PDF
GTID: 2458390008474025
Subject: Information Technology
Abstract/Summary:
With data on the Web growing exponentially, the World Wide Web Consortium has proposed semantic storage of data in order to maintain common formats for data integration. Researchers have extended the advantages of semantic storage to life sciences data. Choosing the right database for semantic data depends on several factors, such as the amount of data involved, the information to be extracted from it, the functionality the database must provide, and the available expertise and infrastructure. The author compared three RDF stores (Oracle 11g, Virtuoso, and TDB) for storage and retrieval of RDF data on basic commodity hardware. The data used was cancer proteomics data generated by a mass spectrometry instrument. Performance was compared on three factors only: data loading time, query response time, and query throughput. For all datasets, TDB gave the lowest data loading times, followed by Virtuoso; Oracle gave the highest. For most queries, Virtuoso performed best on smaller datasets. On bigger datasets, TDB performed better in most cases and was closely followed by Virtuoso. For all queries, Oracle gave the longest query response times. Thus, combining data loading times and query response times, TDB performed best, closely followed by Virtuoso, and Oracle showed the worst performance of the three databases.
Keywords/Search Tags: TDB performed, Database, Data loading times, Closely followed, Virtuoso, Information, Query response times, Oracle