Rice Origin Verification Platform Based On Parallel Random Forest Algorithms

Posted on: 2020-09-17
Degree: Master
Type: Thesis
Country: China
Candidate: H Cui
GTID: 2428330599962861
Subject: Computer application technology
Abstract/Summary:
In recent years, research on authenticating the geographical origin of rice has focused mainly on two approaches: rice traceability systems based on process tracing, and rice origin confirmation based on source identification. In a traceability system, if enterprises supply the basic data for the entire system, human factors may interfere with the traceability results, producing the phenomenon of "true barcode but false origin". In contrast, although origin confirmation based on machine learning algorithms achieves fairly good identification results, traditional machine learning algorithms have fairly high time complexity, and the origin confirmation model classifies inefficiently when processing large-scale data. This paper therefore studies the parallelization of machine learning algorithms to enable fast analysis and modeling of large-scale data, and builds a rice origin confirmation platform on the parallelized origin confirmation model to confirm rice origin efficiently, accurately and conveniently, further improving the traceability system for rice with geographical indication.

This paper collects 433 rice samples from four major rice-producing areas: Meihe, Liuhe, Huinan and Yanbian. The mineral element content data are preprocessed and used as the basic data for modeling. Using Hadoop distributed cluster technology, a random forest model, a support vector machine model and an artificial neural network model are each constructed on the MapReduce parallel framework. After the three models are evaluated and compared, the parallel random forest model, which has the best classification performance, is selected as the core of the geographical indication rice origin confirmation platform. The main contents of this paper are as follows:

(1) Among the three parallelized origin confirmation models based on MapReduce, the parallelized support vector machine and the artificial neural network reach accuracy rates of 93.56% and 87.63% respectively, while the parallelized random forest model classifies best, with an accuracy rate of 97.55%.

(2) With data set sizes of 214, 314 and 433 samples, the model accuracy rates are 97.55%, 97.85% and 98.32% respectively. The results show that the accuracy of the parallelized random forest model constructed in this paper increases with the size of the data set, and that the model meets the basic requirements.

(3) This paper designs and implements the rice origin confirmation platform using the browser/server (B/S) architecture and the SSH development framework, with Java as the development language. The platform mainly implements three functional modules: user login, origin confirmation and result display. On the platform, government inspectors verify their identity through the login module, enter mineral element content information in the origin confirmation module, and review the origin confirmation result in the result display module.

(4) The platform performance evaluation shows that the parallelized random forest model achieves a favorable speedup ratio relative to the traditional serial random forest algorithm and shows a greater performance advantage when processing large-scale data.
Keywords/Search Tags: MapReduce, parallelization, random forest, artificial neural network, support vector machine, speedup ratio
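
The abstract does not give implementation details, so the following is only a minimal sketch of one common way to fit a random forest in a map/reduce style, not the thesis's actual Hadoop code. Each map task trains a few trees on bootstrap samples, the reduce step merges them into one forest, and prediction is a majority vote. The helper names (`train_stump`, `map_task`, `reduce_forest`, `predict`, `speedup`), the one-split "stump" trees, and the toy rows are all assumptions made for illustration; the numeric values are synthetic and are not the thesis's mineral element measurements. The `speedup` function uses the standard definition, serial run time divided by parallel run time.

```python
import random
from collections import Counter

def train_stump(rows, rng):
    """Fit a one-split tree on a bootstrap sample using one random feature."""
    sample = [rng.choice(rows) for _ in rows]      # bootstrap: draw with replacement
    feature = rng.randrange(len(rows[0][0]))       # random feature for this tree
    majority = Counter(y for _, y in sample).most_common(1)[0][0]
    best = None
    # Try every observed value as a threshold and keep the split with fewest errors.
    for threshold in sorted({x[feature] for x, _ in sample}):
        left = Counter(y for x, y in sample if x[feature] <= threshold)
        right = Counter(y for x, y in sample if x[feature] > threshold)
        left_label = left.most_common(1)[0][0]
        right_label = right.most_common(1)[0][0] if right else majority
        errors = sum(n for lab, n in left.items() if lab != left_label)
        errors += sum(n for lab, n in right.items() if lab != right_label)
        if best is None or errors < best[0]:
            best = (errors, (feature, threshold, left_label, right_label))
    return best[1]

def map_task(task):
    """Map phase: one task trains its share of trees independently."""
    rows, n_trees, seed = task
    rng = random.Random(seed)
    return [train_stump(rows, rng) for _ in range(n_trees)]

def reduce_forest(tree_lists):
    """Reduce phase: merge the trees from all map tasks into one forest."""
    return [tree for trees in tree_lists for tree in trees]

def predict(forest, x):
    """Classify x by majority vote over all trees in the forest."""
    votes = Counter(
        left if x[feature] <= threshold else right
        for feature, threshold, left, right in forest
    )
    return votes.most_common(1)[0][0]

def speedup(t_serial, t_parallel):
    """Speedup ratio: serial run time divided by parallel run time."""
    return t_serial / t_parallel

# Synthetic toy rows: (two made-up feature values, origin label).
ROWS = [((1.0 + 0.1 * i, 2.0 + 0.1 * i), "Meihe") for i in range(10)] + \
       [((5.0 + 0.1 * i, 6.0 + 0.1 * i), "Liuhe") for i in range(10)]

# Three map tasks with five trees each; the built-in map() runs them serially
# here, whereas a Hadoop cluster would run them on separate nodes.
tasks = [(ROWS, 5, seed) for seed in range(3)]
forest = reduce_forest(map(map_task, tasks))
```

In this sketch every map task sees the full toy data set; on a real cluster each task would more likely work on its own data partition. Because the tasks share no state, the training work scales out across nodes, which is what a speedup ratio against the serial algorithm measures.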