In recent years, big data analysis in sports has become an essential focus of sports science, and data mining algorithms based on historical data have become crucial tools for research on sports events. Because sports are competitive contests, the outcomes of those contests are of paramount importance. In the NBA, accurately predicting match outcomes supports effective team and player management as well as the selection of tactics. Previous research in sports has largely been limited to traditional statistical methods or machine learning algorithms for simple match outcome prediction, with minimal use of ensemble algorithms. There has also been a lack of multidimensional correlation analysis between match outcome prediction models and technical indicators. Consequently, to enhance the performance, interpretability, and practical application value of prediction models, and to thoroughly explore the technical indicators embedded in NBA match data from various perspectives, this paper applies the Stacking model fusion approach to the sports domain and combines it with the LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations) interpretation methods to construct interpretable machine learning models. The work and achievements of this paper are as follows: (1) Data cleaning and preprocessing: The NBA regular season dataset from 2017 to 2020 was collected from the official website using Python web scraping techniques. Considering the characteristics of basketball datasets, individual player performance data were transformed into team-level performance data, and the relevant feature engineering was carried out. (2) Building the game result prediction model: The dataset was divided into a training set and a test set. Cross-validation and grid search were used for hyperparameter tuning to construct single models. After comparing the
predictive performance of the different models, Support Vector Machine and XGBoost were selected as base classifiers, with a logistic regression model as the meta-classifier of the Stacking model. Experimental results demonstrated the superiority of the Stacking model in predicting NBA game outcomes on this dataset: its accuracy on the test set was approximately 0.2% to 2% higher than that of the individual models. (3) Exploring the complex relationship between feature indicators and match outcomes from multiple perspectives and integrating an interpretive framework to enhance the interpretability of the prediction model: First, by integrating the LIME method, the model’s prediction decisions are explained from a local perspective, which gives insight into specific match scenarios and gives the prediction model practical application value. Second, by incorporating the SHAP method, the overall impact of the technical indicator features on the model’s predictions, as well as the individual and interaction effects of defensive rebounds, are analyzed from a global perspective.
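
To make the attribution idea behind the SHAP analysis concrete, the following is a minimal sketch that computes exact Shapley values for one game against a single reference game. It does not use the SHAP library and is not the paper's model: the scoring function `toy_model`, its coefficients, the feature names (`drb`, `fg_pct`, `tov`), and both sets of feature values are hypothetical, chosen only so that defensive rebounds have both an individual effect and an interaction effect.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values of `model` at `instance` relative to `baseline`.

    A coalition's value is the model output when the features in the
    coalition take the instance's values and all others stay at baseline.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        x = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return model(x)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of `name` when it joins coalition s.
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_model(x):
    # Hypothetical score (an assumption, not a fitted model): additive terms
    # plus an interaction between defensive rebounds and field-goal percentage.
    return (0.04 * x["drb"] + 3.0 * x["fg_pct"] - 0.05 * x["tov"]
            + 0.02 * x["drb"] * x["fg_pct"])


# Hypothetical reference game and game to be explained.
baseline = {"drb": 34.0, "fg_pct": 0.46, "tov": 14.0}
game = {"drb": 40.0, "fg_pct": 0.50, "tov": 11.0}

phi = shapley_values(toy_model, game, baseline)
for feature, contribution in phi.items():
    print(f"{feature}: {contribution:+.4f}")
```

The contributions sum to the difference between the model output for the explained game and for the reference game, which is the additivity property that SHAP relies on. Exact enumeration visits every coalition and is practical only for a handful of features; with the full set of technical indicators an approximation such as those provided by the SHAP library would be needed.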