With the steady growth of air travel demand, the number of flights continues to increase and air traffic is becoming ever busier. The result is frequent flight delays, which harm airline operations and airport scheduling and lower passenger satisfaction. An accurate and effective flight delay prediction model can therefore reduce the adverse effects of delays and provide countermeasures for airports and passengers, which is of great significance to aviation order and passenger travel. To address the problem of flight delays, this thesis uses a Python web crawler to obtain weather information. After feature scoring, the weather data are merged with the individual flight records to construct a 128-dimensional initial feature set of factors affecting flight delays. Considering the influence of feature relevance and importance on model accuracy, this thesis establishes the FCG_XGBoost feature selection method based on feature relevance grouping, selects 39 salient features, and designs comparative experiments to verify the effectiveness of the FCG_XGBoost algorithm. Next, based on the 39 selected features, six types of flight delay prediction models, including LightGBM, are constructed, and binary classification evaluation metrics are used to visualize the experimental results along multiple dimensions. To improve prediction accuracy, a stacking strategy is then used to construct a fusion model. The experimental results show that the fusion model predicts better than any single model. To further improve the accuracy of the fusion model, this thesis considers both the accuracy of the base models and the fusion strategy. To address the low accuracy of the base models, this thesis uses a Bayesian optimization algorithm to tune the hyperparameters of ensemble learning models such as LightGBM, and uses random search and grid search for comparative experiments. The results show that the models tuned by Bayesian optimization perform better. To address the fusion strategy, this thesis establishes an improved stacking strategy based on the traditional stacking strategy and builds the final weighted fusion model. Its AUC reaches 0.97 and its accuracy is 96.6%. Compared with XGBoost, the best of the six single models, the accuracy increases by nearly 4%, and the prediction accuracy for delayed flights increases by 3.2%. The experimental results show that the fusion model based on the improved stacking strategy is suitable for flight delay prediction. This thesis considers multiple influencing factors, establishes a flight delay prediction model, and verifies the feasibility of the model on real data. It also implements a visualization system for the flight delay prediction models that displays the prediction results graphically. In summary, the flight delay prediction model studied in this thesis has practical value for flight delay prediction.
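
The idea of a weighted fusion of base-model outputs can be illustrated with a minimal sketch. The abstract does not specify how the improved stacking strategy assigns weights, so the scheme below (weights proportional to each base model's validation accuracy) is an assumption made only for illustration, as are the function names and all sample numbers.

```python
# Minimal sketch of a weighted fusion of base-model delay probabilities.
# ASSUMPTION: weights are proportional to validation accuracy; the thesis
# abstract does not state its actual weighting rule.

def fusion_weights(val_accuracies):
    """Normalize validation accuracies into weights that sum to 1."""
    total = sum(val_accuracies)
    return [acc / total for acc in val_accuracies]


def weighted_fusion(prob_rows, weights):
    """Combine the per-model delay probabilities of each flight into one score."""
    return [sum(w * p for w, p in zip(weights, row)) for row in prob_rows]


def predict_delay(prob_rows, weights, threshold=0.5):
    """Label a flight as delayed (1) when its fused score reaches the threshold."""
    return [int(score >= threshold) for score in weighted_fusion(prob_rows, weights)]


# Hypothetical validation accuracies of three base models.
val_acc = [0.90, 0.92, 0.88]
weights = fusion_weights(val_acc)

# Each row holds the delay probabilities predicted by the three base models
# for one flight (made-up values).
probs = [
    [0.80, 0.70, 0.90],
    [0.20, 0.40, 0.10],
    [0.55, 0.45, 0.60],
]
labels = predict_delay(probs, weights)
```

In a stacking setup, a meta-learner trained on out-of-fold base-model predictions would typically take the place of these fixed weights; the fixed-weight form is shown here only because it is the simplest to follow.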