Design And Implementation Of Data Analysis And Modeling Platform Based On R Language Analysis

Posted on: 2020-07-11; Degree: Master; Type: Thesis
Country: China; Candidate: J Q Wang
GTID: 2428330572483983; Subject: Software engineering
Abstract/Summary:
With the deep integration of computer technology, the Internet of Things, cloud computing and other technologies, the government has decided to strongly promote the development of industry toward digitalization and intelligence. Internet of Things technology enables more data to be collected, and cloud computing helps process and analyze the collected data more efficiently. Two questions form the basis of the whole process of industrial data analysis: how to import historical data from various data sources into the analysis system, and how data of various structures are analyzed within it. How to better generalize and promote the processes of data processing, analysis and presentation, so as to achieve a higher utilization rate, is the key to the development of data analysis. This requires data analysis and modeling tools that can handle these large volumes of data.

R is an open source language for statistical computing that offers comprehensive algorithms and is easy to learn and use. It allows data analysts and data miners to focus on the algorithms themselves rather than on the cumbersome syntax of a programming language. Because R is open source, it also has good community support. Using R for data analysis not only has a low barrier to entry but also makes it convenient to extend and reuse analysis algorithms. By building a data analysis and modeling platform based on R, this thesis hopes to find reusable data analysis models suitable for enterprises in various industrial manufacturing fields, helping them analyze existing historical data and supporting enterprise decision-making.

The work in this dissertation focuses on the following aspects.

Firstly, the process of data analysis is abstracted; the algorithms and other data called during analysis are modeled as business data, and the relevant data tables are established. Drawing on the storage and computing capacity of a Hadoop cluster, an analysis and modeling platform comprising interfaces and services is designed and implemented on the Java-based SSH framework.

Secondly, a method for calling R services from Java is established, so that Java handles the business logic while R performs the analysis and display of data. Because different data sources must be loaded into R before analysis, this thesis also introduces methods of importing data with different R packages.

Thirdly, for the convenience of users, the system provides a set of mature and common data analysis algorithms, such as correlation calculation, Bayesian algorithms, association rule mining, decision trees, neural networks and clustering. Users with some development experience can write and upload their own algorithms.

The system provides good interactivity by optimizing the front end and back end, and its algorithm functions are extensible. It meets the needs of enterprise managers while giving users with development skills richer and freer ways to extend the algorithms. With the data analysis, presentation, storage and computing capabilities that the platform provides, enterprises can reduce their costs and resource consumption.
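The abstract does not say which mechanism the platform uses to call R from Java. As a rough illustration only, the sketch below assumes the simplest option: launching the standard `Rscript` executable as a subprocess and capturing its output. The class name `RScriptCaller`, its methods, and the script and data file names are hypothetical and are not taken from the thesis.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not from the thesis): the Java side handles the
// business logic and delegates the analysis itself to an R script.
class RScriptCaller {

    // Builds the command line: Rscript --vanilla <script> <args...>
    // --vanilla starts R without loading or saving the user's workspace.
    static List<String> buildCommand(String scriptPath, List<String> args) {
        List<String> command = new ArrayList<>();
        command.add("Rscript");
        command.add("--vanilla");
        command.add(scriptPath);
        command.addAll(args);
        return command;
    }

    // Runs the script and returns everything it printed (stdout and stderr).
    // Assumes Rscript is installed and available on the PATH.
    static String run(String scriptPath, List<String> args)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(buildCommand(scriptPath, args));
        builder.redirectErrorStream(true); // merge stderr into stdout
        Process process = builder.start();

        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append(System.lineSeparator());
            }
        }

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Rscript exited with code " + exitCode + ": " + output);
        }
        return output.toString();
    }
}
```

A caller might invoke `RScriptCaller.run("analysis.R", List.of("data.csv"))`, where both file names are placeholders. A long-running service would more likely keep an R session alive through a bridge library than start a new process per request; the subprocess form is shown only because it needs nothing beyond the JDK and an R installation.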
Keywords/Search Tags: Big Data platform, Hadoop, HDFS, Spark, R