Research on an Agricultural Big Data Processing System Based on Hadoop

Posted on: 2018-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: J L Du
Full Text: PDF
GTID: 2348330515460431
Subject: Agricultural informatization
Abstract/Summary:
China is a vast country with complex and diverse ecological types and a rich variety of crop species. Its agricultural data are therefore both highly varied and huge in volume. Because of the limitations of traditional agriculture, however, these data have long been underused. As agricultural modernization advances and the level of agriculture improves, all kinds of agricultural data are being analyzed and used to guide agricultural production. With Internet technology now widely used in agriculture, the amount of agricultural data is growing exponentially, and traditional data processing methods can no longer meet the demand. Agricultural data have gradually come to exhibit the basic characteristics of big data, becoming agricultural big data. Given the characteristics of agriculture, agricultural big data are large, multidimensional, and dynamic, so how to process them effectively is an important problem for the development of the agricultural big data industry.

The rapid development of big data technology offers a way to address the problems that agriculture faces with big data. The most popular big data processing platform is undoubtedly Apache Hadoop, an open-source platform for running large-scale distributed computations on clusters. The MapReduce model, a parallel programming model first proposed by Google, has been widely used and has gradually become synonymous with big data. It is the core computing model of Hadoop. The Map function and the Reduce function are the core of the MapReduce model. Both operate on <key, value> pairs to process complex data, and the resulting tasks are distributed to the compute nodes, so that much of the complexity of distributed parallel processing is handled by the framework.

This thesis analyzes the characteristics of big data, examines the advantages and disadvantages of existing agricultural big data processing systems, and improves on them according to the characteristics of agricultural big data. An agricultural big data processing system based on the Hadoop platform is designed. Classical data mining is briefly introduced, and the parallelization of the corresponding algorithms on the MapReduce architecture is analyzed. The CART algorithm is adapted for parallel execution on the MapReduce architecture and optimized accordingly. Finally, data are run through the system to verify the feasibility of the system and the advantages of the improved algorithm.
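The <key, value> flow described above can be illustrated with a minimal single-process sketch. This is not the system or the improved CART algorithm from the thesis, and it does not use the Hadoop API. It only imitates the Map, shuffle, and Reduce stages in plain Python. The records, the field layout, and the function names (`map_record`, `shuffle`, `reduce_mean`, `run_job`) are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical records: (crop, region, yield in t/ha). The values are made up.
RECORDS = [
    ("rice", "south", 6.0),
    ("wheat", "north", 5.0),
    ("rice", "south", 7.0),
    ("maize", "north", 6.0),
    ("wheat", "north", 4.0),
]


def map_record(record):
    """Map: emit one <key, value> pair per record, keyed by crop."""
    crop, _region, yield_t_ha = record
    # The value carries (sum, count) so partial results can be combined.
    yield crop, (yield_t_ha, 1)


def shuffle(pairs):
    """Group values by key, as a MapReduce framework does between stages."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_mean(key, values):
    """Reduce: combine all values for one key into a mean yield."""
    total = sum(amount for amount, _ in values)
    count = sum(n for _, n in values)
    return key, total / count


def run_job(records):
    """Run Map, shuffle, and Reduce in sequence within one process."""
    mapped = chain.from_iterable(map_record(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_mean(key, values) for key, values in grouped.items())


result = run_job(RECORDS)
print(result)
```

On a real cluster the Map calls would run on different nodes and the framework would perform the grouping step across the network. Emitting (sum, count) rather than a bare value is a common design choice because partial sums from different nodes can be merged before the final division.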
Keywords/Search Tags: Agricultural big data, big data, Hadoop, MapReduce