Compatible Study Of Hadoop For Efficient Analyzing And Processing Of Big Data

Posted on: 2021-04-09  Degree: Master  Type: Thesis
Country: China  Candidate: Raza Muhammad Umair  Full Text: PDF
GTID: 2428330602970933  Subject: Big Data Technology
Abstract/Summary:
With the adoption of computers, the volume of data began to grow and has kept increasing. The first question was where to store this data; once that was resolved, storage costs rose. Thanks to advancing technology, however, these costs have since decreased. Big data can be described as a collection of data sets so large and complex that it is difficult to process with traditional database management tools. Processing large data sets with traditional methods is time-consuming, so the Hadoop framework, which is faster than older methods, is used. The main objective is not the storage of data but the efficient, less time-consuming processing of data that is continuously produced. Data is classified into three types: structured, unstructured, and semi-structured. Hadoop offers several frameworks for processing these huge data sets. This research focuses on three of them, Pig, Hive, and Impala, to analyze structured datasets efficiently and in less time. For this purpose, we compare these frameworks by applying them to two different datasets and checking their data processing efficiency. We perform the same task on Hive, Pig, and Impala to obtain the results. The results demonstrate that Impala is more efficient than Hive and Pig, as it takes less time to perform the task. Impala is one of the principal Cloudera modules and a Hadoop-compatible SQL query engine.
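The comparison described above amounts to running the same task on each engine and recording how long it takes. The following is a minimal sketch of such a timing harness, not the thesis's actual procedure: the function names (`time_task`, `compare_engines`, `fastest`, `shell_task`), the use of the median over repeated runs, and the script file names in the usage note are all assumptions introduced for illustration.

```python
import subprocess
import time
from statistics import median
from typing import Callable, Dict, List


def time_task(task: Callable[[], object], repeats: int = 3) -> float:
    """Return the median wall-clock time in seconds over `repeats` runs of `task`."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        samples.append(time.perf_counter() - start)
    return median(samples)


def compare_engines(tasks: Dict[str, Callable[[], object]],
                    repeats: int = 3) -> Dict[str, float]:
    """Time the same logical task on each engine; keys are engine names."""
    return {name: time_task(task, repeats) for name, task in tasks.items()}


def fastest(timings: Dict[str, float]) -> str:
    """Return the name of the engine with the smallest measured time."""
    return min(timings, key=timings.get)


def shell_task(command: List[str]) -> Callable[[], object]:
    """Wrap a command line as a callable (hypothetical helper).

    The command is only executed when the returned callable is invoked.
    """
    return lambda: subprocess.run(command, check=True)
```

On a cluster where the three command-line clients are installed, the harness could be called as `compare_engines({"hive": shell_task(["hive", "-f", "task.hql"]), "pig": shell_task(["pig", "-f", "task.pig"]), "impala": shell_task(["impala-shell", "-f", "task.sql"])})`, where the three script files are hypothetical and would each have to express the same task. Wall-clock time measured this way includes client startup overhead, so it is only a rough proxy for query execution time.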
Keywords/Search Tags:Apache Hadoop, Hive, Pig, Impala, Big Data Technology