Application Research Of Data Registration Intermediate Database Based On Agricultural Big Data

Posted on: 2022-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: R He
Full Text: PDF
GTID: 2518306473494384
Subject: Agricultural engineering and information technology
Abstract/Summary:
With the rapid development of information technology and continuous breakthroughs in big data technology, computing and storage capabilities have greatly improved. Big data spans many domains and involves large volumes of data, which makes data management difficult and time-consuming. To address these problems, the Chinese government has promulgated a series of supportive policies, such as the "Big Data Development Action Plan"; the policies related to agricultural big data include "Agricultural and Rural Big Data Development" and "Rural Revitalization", among others. At present, different platforms manage agricultural big data under different storage rules, and manual data processing is still in use, which makes data collection difficult and management inefficient. Agricultural big data not only has the general big data characteristics of scale, speed, diversity, value and authenticity, but also faces difficult data collection and long time spans, which make its management even more difficult and time-consuming. Improving the management efficiency of agricultural big data has therefore become an urgent problem.

This thesis, based on agricultural big data, uses a distributed storage architecture and the data registration center concept from Data-Oriented Architecture (DOA) to establish a data registration intermediate database, and designs a client for the data registration system. According to the characteristics of agricultural big data, unified registration rules are formulated for unstructured data, and the TF-IDF (term frequency-inverse document frequency) algorithm, the naive Bayes algorithm and the k-nearest neighbor (KNN) algorithm are used to classify text data. For structured data, the rule "data registration principle, one database and one standard" is formulated. The TF-IDF algorithm replaces manual registration: it automatically filters the many data attribute fields and selects the top-N fields as data registration fields, which improves the efficiency of data registration and enables efficient management of agricultural big data.

The main innovations of this thesis are as follows:

(1) Registration rules are proposed for unstructured and structured agricultural big data. The rule for unstructured data is the "unified registration rule". Its registration structure template is divided into four modules: original data content, mapping relationship, registration information and permission management. The rule for structured data is "unified registration principle, one database and one standard". The "unified registration principle" comprises eight principles, and "one database and one standard" is based on the structure template of the structured data registry and the actual database table structure.

(2) A method is proposed for registering unstructured and structured agricultural big data algorithmically. For unstructured data, taking text data as an example, the text content is first preprocessed: the Jieba library segments the text into words, useless words are removed, a dictionary is built, and the words are one-hot encoded. The TF-IDF algorithm then computes the TF and IDF values, each word is mapped to a vector through word2vec, and the naive Bayes and KNN algorithms classify the text. The classification results are used to register the unstructured data. For structured data, the TF-IDF algorithm computes the TF and IDF values, and the top-N fields are selected as mobile fields. These are combined with the fixed fields to register the structured data, which improves the efficiency of registering data information.

Testing and verification of the agricultural big data registration system show that algorithm-assisted registration greatly improves registration efficiency. In the test of registering unstructured data, registration without an algorithm takes more than 720 times as long as registration with the naive Bayes algorithm and more than 750 times as long as registration with the KNN algorithm. The classification accuracy of the KNN algorithm is higher than that of the naive Bayes algorithm, so KNN better achieves the goal of classifying data by database subject. When registering structured data, registration without an algorithm takes more than 30 times as long as registration with the TF-IDF algorithm.
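
As an illustration of the kind of pipeline described above, the following is a minimal sketch of TF-IDF weighting, top-N term selection and k-nearest neighbor classification. It is not the thesis's implementation. It assumes the documents are already tokenized (the thesis uses Jieba for Chinese word segmentation), it omits the naive Bayes and word2vec steps, and the toy corpus, the labels, the smoothed IDF formula, the cosine similarity measure and all function names are hypothetical choices made for this example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return ({term: weight} per tokenized document, idf table)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption of this sketch, not taken from the thesis).
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in df.items()}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in counts.items()})
    return vectors, idf


def top_n_terms(vector, n):
    """Pick the n highest-weighted terms; ties are broken alphabetically."""
    ranked = sorted(vector.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(query_tokens, train_vectors, labels, idf, k=3):
    """Label a tokenized query by majority vote of its k nearest documents."""
    # Terms never seen in training have no IDF value and are ignored.
    counts = Counter(t for t in query_tokens if t in idf)
    total = len(query_tokens) or 1
    query = {t: (c / total) * idf[t] for t, c in counts.items()}
    ranked = sorted(
        range(len(train_vectors)),
        key=lambda i: cosine(query, train_vectors[i]),
        reverse=True,
    )
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus: pre-tokenized documents with subject labels.
train_docs = [
    ["wheat", "yield", "soil", "fertilizer"],
    ["rice", "yield", "irrigation", "soil"],
    ["pig", "feed", "vaccine", "farm"],
    ["cattle", "feed", "milk", "farm"],
]
train_labels = ["crop", "crop", "livestock", "livestock"]

train_vectors, idf_table = tf_idf(train_docs)
crop_guess = knn_classify(["soil", "irrigation", "wheat"],
                          train_vectors, train_labels, idf_table)
livestock_guess = knn_classify(["milk", "feed"],
                               train_vectors, train_labels, idf_table)
first_doc_top2 = top_n_terms(train_vectors[0], 2)
print(crop_guess, livestock_guess, first_doc_top2)
```

The same `top_n_terms` ranking could, under similar assumptions, be applied to tokens drawn from attribute fields to shortlist candidate registration fields, although the abstract does not specify how the thesis computes TF-IDF over structured data.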
Keywords/Search Tags:Agricultural Big Data, Data Registration, Registration Rules, Big Data Management