Trade in agricultural products is the main source of income for our country's farmers and accounts for an important share of residents' living expenses. Unstable agricultural prices not only reduce farmers' income and hinder agricultural development, but also adversely affect the daily lives of urban and rural residents. If agricultural product prices and their changes can be predicted in a timely and effective manner, problems can be prevented early and losses reduced, which helps ensure the healthy and orderly development of the agricultural industry, safeguard the livelihood of urban and rural residents, and stabilize the national economy. With the rapid development of information technology, agricultural product websites have appeared one after another and the volume of agricultural product price information has exploded, but these data have not been fully utilized.

This thesis studies two typical big data processing platforms, the Hadoop and Spark frameworks, and uses the components they provide for data storage, analysis, and application to build a price forecasting platform for agricultural products. For price forecasting, it examines commonly used time series forecasting methods, taking the exponential smoothing method and the ARIMA model as typical algorithms, and explores how to implement them in the Spark framework, so as to build both models and test them experimentally. Starting from functional requirements, and based on the business process, development technology, and data flow, the thesis designs the platform comprehensively and defines three core functional modules: data center, price forecast, and price analysis. Price data for agricultural products are collected through two channels, online and offline: online data are obtained by a web crawler, and offline data are collected through an upload and reporting interface. Guided by ETL technology, data governance is implemented on the Spark framework. Hive is used to build a data warehouse that stores and analyzes agricultural product prices and provides data services to the upper layer. With Spark as the computing engine, the price prediction algorithms are implemented in Scala, the models are evaluated with error metrics, and the results are integrated into the platform as a Web API, providing application services in the same way as the other functions.

The work of this thesis has value for both theoretical research and application. In particular, the proposed Spark-based cloud platform for agricultural product price forecasting takes full advantage of the data processing capability of big data platforms, increases the potential value obtained from the data, and has practical significance for promoting the application of agricultural product price forecasting.
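
As a rough illustration of the exponential smoothing method and the error evaluation mentioned above, the following is a minimal sketch in plain Python. It is not the thesis's Scala/Spark implementation. The function names `ses_forecasts` and `rmse`, the smoothing factor `alpha = 0.5`, and the sample prices are all assumptions made up for this example.

```python
from math import sqrt


def ses_forecasts(prices, alpha):
    """Simple exponential smoothing (illustrative sketch, not the thesis code).

    Returns (one_step, next_forecast), where one_step[i] is the forecast
    for prices[i + 1] made from prices[0..i], and next_forecast is the
    forecast for the first period after the series ends.
    """
    if not prices:
        raise ValueError("prices must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")

    level = prices[0]  # initialise the smoothed level with the first observation
    one_step = []
    for observed in prices[1:]:
        one_step.append(level)  # forecast made before seeing this observation
        level = alpha * observed + (1 - alpha) * level  # update the level
    return one_step, level


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


if __name__ == "__main__":
    # Hypothetical daily prices, invented for illustration only.
    sample_prices = [10.0, 12.0, 11.0]
    fitted, next_price = ses_forecasts(sample_prices, alpha=0.5)
    print("one-step forecasts:", fitted)
    print("next-period forecast:", next_price)
    print("RMSE:", rmse(sample_prices[1:], fitted))
```

In this sketch a larger `alpha` weights recent prices more heavily. A platform like the one described would presumably choose `alpha` by minimizing an error metric such as RMSE on historical data, though the abstract does not say how its parameters are selected.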