| The retail industry has experienced three rapid transformations: the department store, which appeared in 1852; the chain store, which appeared in 1859; and the supermarket, which appeared in 1930. These major changes have reshaped social production and lifestyles. For the retail industry, 2018 is both a turbulent and a challenging year. Artificial intelligence has pushed the retail industry to the crest of the wave. At the same time, new retail has disrupted conventional fields, improved efficiency, and reduced operating costs, which has had a great impact on the traditional retail industry. In the big data environment, the traditional retail industry therefore needs big data technology to cope with the impact of new retail. The era of big data also benefits the development of the retail industry: with the help of big data and AI technology, retailers can achieve more refined and intelligent management, so the traditional retail industry should begin to pay attention to the information that data carries. The data used in this paper are the historical sales data and related information for 99 departments in 45 Wal-Mart stores from 2010/2/5 to 2012/10/26. Because the source data does not include the true values for the test data set (that is, from October 2012 to October 2013), we cannot use the test data in modeling; instead, we split the train data, using 80% as the training set and 20% as the test set. Given this data set, we forecast weekly sales from historical weekly sales and then identify the best model for the data. The main models used in this paper are exponential smoothing, ARIMA, extremely randomized trees (extreme random forest), and XGBoost. The time series models built are: (1) a time series linear model with seasonal dummy variables; (2) STL decomposition + exponential smoothing; (3) STL decomposition + ARIMA; (4) an ARIMA model with a seasonal term. When modeling with exponential smoothing and ARIMA, the data
is treated as multiple sets of time series data, which must be preprocessed before modeling. Holidays are important information in the data. In addition to analyzing the sales fluctuations caused by holidays, the paper uses singular value decomposition (SVD) to extract the information shared across departments and to ignore sales fluctuations that appear in only a single department, thereby reducing and denoising the data. The choice of dimension is very important here: the test results show that SVD performs best when the dimension is 10 to 15. Comparing the forecasting metrics shows that the weighted average of the forecasts from the above models is better than any single model, and that the best single model is STL decomposition + exponential smoothing. When building models with machine learning algorithms, the data is not treated as a time series. Instead, the stores, features, and train data sets are combined so that factors related to sales, such as store size, the presence of holidays, and temperature, are taken into account in the model. Most importantly, the month is used as a modeling factor. Comparing the predictions of the machine learning algorithms with the weighted-average time series model shows that XGBoost with tuned parameters gives the best results when predicting store sales. Previous literature has proposed department-level models for the Wal-Mart sales data. After transforming the data appropriately, we built an XGBoost model to compare against the department-level model and found that XGBoost also outperforms the results reported in that literature. Through this data mining of Wal-Mart’s sales data, we found that the XGBoost model predicts best. At the same time, better prediction results can be obtained if the
parameters of the model are further optimized or XGBoost is combined with other algorithms. We believe that as research on retail sales forecasting deepens, better algorithms for this problem will emerge. Such a solution would not only address Wal-Mart’s own supply turnover problem but also serve as a reference for domestic supermarket chains. |
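
The evaluation setup described above (an 80/20 split of the train data, followed by an exponential smoothing forecast) can be illustrated with a minimal sketch. It is not the paper's implementation: the STL decomposition step is omitted, the smoothing factor `alpha=0.5` is an arbitrary assumption, the weekly sales values are made up, and the helper names `chronological_split`, `simple_exponential_smoothing`, and `mean_absolute_error` are hypothetical. Mean absolute error is used here only as an example metric.

```python
from statistics import mean


def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series into train and test parts without shuffling."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def simple_exponential_smoothing(series, alpha):
    """Return the final smoothed level, used as a flat forecast for future weeks."""
    level = series[0]
    for value in series[1:]:
        # New level is a weighted mix of the latest observation and the old level.
        level = alpha * value + (1 - alpha) * level
    return level


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Made-up weekly sales for one hypothetical store-department pair.
weekly_sales = [100.0, 110.0, 105.0, 120.0, 115.0, 125.0, 130.0, 128.0, 135.0, 140.0]

train, test = chronological_split(weekly_sales)          # first 80% / last 20%
flat_forecast = simple_exponential_smoothing(train, alpha=0.5)  # alpha is an assumption
forecasts = [flat_forecast] * len(test)
mae = mean_absolute_error(test, forecasts)
print(f"forecast={flat_forecast:.2f}, MAE on held-out weeks={mae:.2f}")
```

The split is chronological rather than random because shuffling weekly sales would leak future observations into the training set. In practice the same loop would run once per store-department series, and the forecasts of several models could then be averaged with weights, as the abstract describes.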