
Research On Big Data Cleaning Algorithm Based On Edge Computing In Industrial Internet Of Things

Posted on: 2022-06-13
Degree: Master
Type: Thesis
Country: China
Candidate: M Chen
GTID: 2518306494979889
Subject: Electronics and Communications Engineering
Abstract/Summary:
The Internet of Things (IoT) is an emerging technology that brings intelligence to many industries, such as smart industry, smart transportation, smart communities, and remote monitoring, so that limited resources can be allocated and used more rationally, improving efficiency and returns. The Industrial Internet of Things (IIoT), as the largest and most important part of IoT technology, acts as a bridge between information and physical entities in intelligent manufacturing. An industrial boiler system schedules and allocates resources through real-time monitoring and control of physical devices at the perception layer, reducing resource consumption and improving manufacturing efficiency. Such a system connects a large number of digital devices, including sensors, control equipment, manufacturing machines, and terminal equipment, which generate a large amount of data. Because sensors operate in harsh industrial environments, the collected data is not fully trustworthy, which seriously affects the judgment and feedback of the cloud. Data cleaning methods that rely on sensor nodes alone are not sufficient to process data at this scale. Combining Mobile Edge Computing (MEC) with data cleaning methods can address these problems. MEC moves some of the work of sensor nodes, aggregation nodes, and cloud services onto mobile edge servers. This reduces the workload of cloud servers, improving cloud performance, and also reduces data transmission delay, because edge servers are closer to the industrial perception-layer equipment. The main subject of this thesis is therefore an MEC-based big data cleaning solution for the industrial IoT.

First, the volume of data in the industrial boiler system is so large that computation and analysis consume substantial resources, and its dimensionality is so high that analysts run into the "curse of dimensionality", making effective analysis difficult in practice. This thesis therefore proposes a Kernel Principal Component Analysis (KPCA) method to select valuable data and embed the high-dimensional mixed data into a continuous space. The result is a continuous, dimension-reduced representation of the physical data, suited to anomaly detection algorithms that operate in a low-dimensional continuous domain.

Second, building on this dimensionality reduction, this thesis proposes an Isolation Forest (iForest) outlier detection algorithm. Industrial data inevitably contains abnormal information such as noise, erroneous values, and missing data, and such abnormal data affects the results of analysis. For the server to process data fast enough to meet the space and time requirements of the industrial field, algorithms with low space and time complexity are required. An outlier detection algorithm based on the overall data is therefore proposed. Comparison with the K-Nearest Neighbor (KNN) and Support Vector Machine (SVM) algorithms shows that the isolation forest algorithm has higher accuracy and efficiency for data cleaning in the industrial boiler system.

Finally, Jupyter Notebook is used to simulate the iForest algorithm proposed in this thesis. The simulation results show that the KPCA algorithm can reduce the dimensionality of industrial data and select effective attributes. The iForest anomaly detection algorithm based on the KPCA method achieves high accuracy and a high detection rate, and its execution efficiency is also greatly improved.
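
The isolation idea behind the outlier detection step can be illustrated with a minimal pure-Python sketch. It is not the thesis's implementation: it assumes the data has already been reduced to a few continuous dimensions (the KPCA step is omitted), and the synthetic readings, the function names, and the parameters (100 trees, subsample size 64) are illustrative assumptions. Each tree splits a random subsample on a random feature at a random threshold; points that are isolated after few splits receive scores closer to 1.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup
    over n points; used to normalise path lengths."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # Euler-Mascheroni constant
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively partition points with random axis-aligned splits."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # cannot split on a constant feature
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(point, node, depth=0):
    """Number of splits needed to reach a leaf, plus a correction for
    the points left unsplit in that leaf."""
    if "split" not in node:
        return depth + c_factor(node["size"])
    if point[node["dim"]] < node["split"]:
        return path_length(point, node["left"], depth + 1)
    return path_length(point, node["right"], depth + 1)


def build_forest(data, n_trees=100, sample_size=64, seed=0):
    """Build n_trees isolation trees; data must hold at least 2 points."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    forest = [
        build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return forest, sample_size


def anomaly_score(point, forest, sample_size):
    """Score in (0, 1]; shorter average paths give higher scores."""
    mean_path = sum(path_length(point, t) for t in forest) / len(forest)
    return 2.0 ** (-mean_path / c_factor(sample_size))


# Hypothetical, already dimension-reduced sensor readings (2 features),
# with one injected abnormal reading at the end.
data_rng = random.Random(42)
readings = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
readings.append((8.0, 8.0))

forest, psi = build_forest(readings)
scores = [anomaly_score(p, forest, psi) for p in readings]
print("score of injected reading:", round(scores[-1], 3))
print("median score:", round(sorted(scores)[len(scores) // 2], 3))
```

In a cleaning pipeline, readings whose score exceeds a chosen threshold would be flagged for removal or repair; the threshold is a tuning choice and is not specified in the abstract. Because each tree only sees a fixed-size subsample and has bounded depth, training cost does not grow with the square of the data size, which is consistent with the low time and space complexity the abstract calls for.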
Keywords/Search Tags: Industrial Internet of Things, edge computing, data dimensionality reduction, outlier detection