Air Quality Prediction Based On NGBoost

Posted on:2022-03-03

Degree:Master

Type:Thesis

Country:China

Candidate:S H Wang

Full Text:PDF

GTID:2491306536496724

Subject:Computer technology

Abstract/Summary:

As cities grow and living standards improve, pollutant emissions keep increasing, environmental pollution problems have multiplied, and air quality has become a topic of public concern. Accurate prediction of the air quality index (AQI) is a key prerequisite for addressing air pollution. However, the AQI varies non-linearly and depends on many factors. Previous research on air quality prediction usually leaves redundant features untreated and rarely considers how the data itself affects the prediction model. This thesis starts from the data itself and takes the correlation and redundancy of features into account. The main work is as follows. First, meteorological data and air quality data are joined according to the longitude and latitude of the monitoring stations to form a complete data set. Correlation and redundancy across all features are then combined into a new feature-screening index, and a feature extraction algorithm based on embedded redundancy is proposed. Second, to address the uncertainty of AQI prediction, data classification is combined with NGBoost to improve it, and a data classification method based on the AQI distribution map is proposed. This method builds a separate NGBoost model for each class of labeled data, predicts each sample with the NGBoost model of the class it belongs to, and aggregates the results. Third, a parallel air quality prediction model is built on Spark, and a parallel algorithm for it is designed. Because the NGBoost models are independent of one another, they can be computed in parallel on distributed nodes. Furthermore, NGBoost uses decision trees as its base model, which further improves parallelism, alleviates the weakening of NGBoost's generalization ability on large amounts of data, and improves computational efficiency. Finally, experiments on pseudo-distributed nodes verify the effectiveness of the proposed model and algorithm, and the results are compared with those of other algorithms and analyzed.
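The abstract does not give the formula for the feature-screening index, so the following is only a minimal sketch of the general idea of weighing relevance against redundancy. It assumes absolute Pearson correlation as the measure and a greedy "relevance minus mean redundancy" score in the style of mRMR, which is a stand-in and not the thesis's own algorithm. The function names, feature names, and all numbers are made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_features(features, target, k):
    """Greedily pick up to k feature names from a dict of name -> column.

    Assumed score (not taken from the thesis):
    |corr(feature, target)| minus the mean |corr(feature, already selected)|.
    """
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    selected = []
    remaining = sorted(features)  # sorted for a stable iteration order

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            abs(pearson(features[name], features[s])) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: pm25_dup is an exact rescaling of pm25 (fully redundant),
# while wind is uncorrelated with pm25 but still carries signal about the target.
aqi = [50, 85, 110, 125, 130]
features = {
    "pm25": [10, 20, 30, 40, 50],
    "pm25_dup": [15, 30, 45, 60, 75],
    "wind": [5, 2, 1, 2, 5],
}

selected = select_features(features, aqi, 2)
print(selected)
```

On this toy data the first pick is one of the two PM columns, and the second pick is `wind`: the other PM column is penalized for being redundant even though it is more strongly correlated with the target than `wind` is.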

Keywords/Search Tags:

NGBoost, Remove redundancy, Spark, Parallel computing

Related items

1	Research And Implementation On Parallel Computing Technology Of Casting Processes Numerical Simulation
2	Design Of 3SPS+C Parallel Mill With Drive Redundancy
3	Chemical Computing Application And It's Realization Method Of Large-scale Parallel Computation On PC Cluster-based System With MPICH Technology
4	Development Of Numerical Modeling For Gas-Liquid Two-Phase Flows Based On Unstructured Grids And Parallel Computing
5	Implementation Of Parallel Computing Method For Numerical Simulation Of Additive Manufacturing Process
6	CPU+GPU Heterogeneous Parallel Computing Of Dendritic Growth Of Steel
7	The Research Of Parallel Computing And Visualization Of Surface Dust Evolution
8	The Study For Parallel Computing About The Growth Mechanism Of Ice Crystals In The Process Of Freeze Concentration Of Liquid Food Base On The Phase-Field Model
9	Research On Prediction Model Of Superheat In Aluminum Electrolysis Based On Multi-granularity
10	Study Of Parallel Computing In Groundwater Contaminant Transport Simulation