
Using MapReduce for scalable and distributed processing of scientific XML data

Posted on: 2012-02-19
Degree: M.S
Type: Thesis
University: State University of New York at Binghamton
Candidate: Dede, Elif
Full Text: PDF
GTID: 2468390011964297
Subject: Computer Science
Abstract/Summary:
A seamless and intuitive search capability for the vast number of datasets generated by scientific experiments is critical to ensuring that domain-specific scientists can use such data effectively. There is a pressing need for an easy-to-use, scalable framework specialized for scientific data. This thesis applies the MapReduce model to XML metadata indexing for scientific datasets. We present an indexing structure that scales well for large-scale MapReduce processing. We present performance results using two MapReduce implementations, Apache Hadoop and LEMO-MR, to emphasize the flexibility and adaptability of our framework in different MapReduce environments.
Keywords/Search Tags: MapReduce, Scientific