In recent years, with the rapid development of China’s film industry, China has become the world’s second-largest film market after the United States. However, the Chinese film market is approaching saturation and industry growth is slowing. At the same time, films often require investments of hundreds of millions, and the uncertain market environment brings huge risks to film production. The success or failure of a single film can bring huge profits or losses to a company. It is therefore necessary and valuable to construct a box office revenue prediction system for films in the early stage of production. As a first step, this paper focuses on the Chinese film market. Artificial intelligence is increasingly used to improve the operations of companies and industries. In the film industry, it is mainly used to support film production, investors’ investment decisions, and the operation and management of cinemas. Accurate box office forecasts have a strongly positive effect on investment allocation in the operational management of the film industry and can help management overcome the challenge of resource allocation. In the existing literature, the box office prediction problem centers on two aspects: factor selection and model selection. For factor selection, the point in time at which the prediction is made is critical to how the results can be applied. By timing, box office forecasts fall into three stages: pre-production prediction, pre-release prediction and post-release prediction. Pre-production prediction has the greatest value for operational management decision-making, so we target the early prediction of films. The proposed features are all derived from data on the intrinsic attributes of the film itself and do not include any social-platform word-of-mouth or post-release data. In processing these features, we choose methods suited to the characteristics and properties of each feature factor and take both static and dynamic features into account. For model construction, previous research on box office prediction has mainly used four types of models: traditional statistical learning models, probability models, time series models and machine learning models. On this basis, this paper builds a stacking fusion model to obtain better prediction performance. Through an analysis of model performance and feature contribution, and by combining the XGBoost, Random Forest, LightGBM and KNN algorithms with stacking fusion theory, a stacking model for film box office prediction is established. The purpose of this paper is to predict the box office level of domestic films in the early stage of film production and to provide reference information for the investment and operation of film and television enterprises. The empirical results show that the model predicts well: its 1-Away accuracy is 86.46% and its Bingo accuracy is 69.16%, both better than those of all box office prediction systems built on a single machine learning model. Among the feature factors used, star influence involves the most features and has the strongest predictive power, because stars not only have an important impact on a film’s word-of-mouth and publicity but also correlate strongly with many latent factors. Other factors, including release date, release area and genre, have similar predictive power, while sequel factors are limited by the small number of sequel films and have weak predictive power.
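The two accuracy figures reported above can be illustrated with a short sketch. The abstract does not define Bingo or 1-Away, so the code assumes the convention common in box office classification work: revenue is discretized into ordered classes encoded as consecutive integers, Bingo counts exact class matches, and 1-Away counts predictions within one class of the true one. The function names and the sample labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def _check_inputs(y_true: Sequence[int], y_pred: Sequence[int]) -> None:
    """Reject empty or mismatched label sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")


def bingo_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of films whose predicted class equals the true class (assumed definition)."""
    _check_inputs(y_true, y_pred)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


def one_away_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of films predicted within one ordered class of the true class (assumed definition)."""
    _check_inputs(y_true, y_pred)
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= 1)
    return hits / len(y_true)


# Hypothetical labels: box office classes 0 (lowest) to 5 (highest).
y_true = [0, 1, 2, 3, 4, 5, 2, 3]
y_pred = [0, 2, 2, 5, 4, 4, 2, 1]

print(f"Bingo:  {bingo_accuracy(y_true, y_pred):.2%}")
print(f"1-Away: {one_away_accuracy(y_true, y_pred):.2%}")
```

Under these definitions 1-Away is never lower than Bingo, since every exact match is also within one class; this is consistent with the reported 86.46% and 69.16%.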