| Box office revenue is the criterion for evaluating the success of a movie. A box office prediction model for domestic films makes it possible to obtain forecasts in advance and to explore the factors that influence box office, which helps improve the economic efficiency of domestic films and supports the creation of excellent films. Based on 208 domestic films released from 2016 to 2019, this paper selects 23 variables under nine influencing factors to establish a movie box office prediction index system, with a focus on introducing the theme movie variable. The paper then uses multiple linear regression, random forest, support vector machine, and extreme gradient boosting (XGBoost) to predict box office from data released before a film's premiere. Finally, it compares the prediction accuracy of the four models to select the optimal one, interprets the nine influencing factors based on the regression analysis, and makes targeted suggestions. The empirical results support three main conclusions. (1) The main creators (directors and starring actors), brand effect, technical effect (IMAX), film genre (comedy), theme movie, publicity, and the macro market environment can boost the box office of domestic movies, whereas the release date and the size of the distribution company are no longer main influencing factors. (2) It is feasible and effective to incorporate the theme movie variable into the box office forecast index system. (3) XGBoost and random forest are more suitable for building box office prediction models than linear regression and support vector machines. The optimal model is XGBoost, whose prediction accuracy on the test set reaches 85.1%, which gives it some reference value for pre-release box office prediction in practice. |
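
The model-selection step summarized in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation. The abstract does not define "prediction accuracy", so the sketch assumes it is one minus the mean absolute percentage error. The function names `prediction_accuracy` and `select_best_model` are hypothetical, and all numbers are invented toy values unrelated to the paper's data or results.

```python
from statistics import mean


def prediction_accuracy(actual, predicted):
    """Assumed metric: 1 - mean absolute percentage error (MAPE).

    The paper's exact definition is not given in the abstract.
    """
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 1.0 - mean(errors)


def select_best_model(actual, predictions_by_model):
    """Score each model on the same test set and return the top scorer."""
    scores = {
        name: prediction_accuracy(actual, preds)
        for name, preds in predictions_by_model.items()
    }
    best_name = max(scores, key=scores.get)
    return best_name, scores


# Toy test-set box office values (invented for illustration only).
actual = [100.0, 200.0, 400.0, 800.0]

# Toy predictions from four already-fitted models (invented as well).
predictions_by_model = {
    "linear_regression": [150.0, 150.0, 300.0, 1000.0],
    "random_forest": [110.0, 180.0, 440.0, 720.0],
    "svm": [60.0, 260.0, 300.0, 1000.0],
    "xgboost": [105.0, 190.0, 420.0, 760.0],
}

best_name, scores = select_best_model(actual, predictions_by_model)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
print("selected:", best_name)
```

Scoring every model on the same held-out test set keeps the comparison fair. In practice the predictions would come from models fitted on pre-release features rather than from hard-coded lists.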