| In recent years, the continued advance of economic globalization and the development of digital technology have broadened enterprises' sales channels, increasing the diversity and uncertainty of demand. Facing demand for multi-channel, multi-category products, enterprises must constantly adjust the inventory strategy for each product, respond to customer needs in a timely manner, and improve service levels. At the same time, highly responsive service brings high inventory costs. In the past, the large volume of business data generated by enterprise operations was difficult to process; enterprises could not effectively analyze market trends from these data, nor detect changes in demand and react in time, which could lead to high inventory costs. The rapid development of computer technology offers a solution to this problem. Machine learning extracts information from data and continuously learns and improves its performance, enabling timely responses to changes in the environment. As a result, machine learning techniques can help businesses make inventory decisions. Inventory control is a sequential decision problem, and deep reinforcement learning, a branch of machine learning, is often used to solve such problems. Deep reinforcement learning models the supply chain as a Markov decision process, treating supply chain members as agents, the supply chain environment as the state space of the model, and the ordering decisions made by each member as the action space. Through continuous interaction between the agent and the environment, the agent eventually learns a decision policy that achieves the desired objective. This paper reviews the relevant domestic and international literature on multi-dimensional inventory classification, static inventory control models, and dynamic inventory control models, and constructs a multi-demand-type inventory
control model based on APIOBPCS (Automatic Pipeline, Inventory and Order Based Production Control System) and a deep Q-network. With the goal of minimizing the total inventory cost of the supply chain, the model is simulated and analyzed using both simulated and actual data. The results show that the multi-demand inventory control model based on a deep Q-network controls the inventory cost of products of four demand types well, and a case analysis on actual data shows that the inventory strategy derived by the model also effectively reduces the total inventory cost of the supply chain. In addition, the model can continuously improve its performance through learning, which can support decision-makers in real-world enterprise applications. The innovations of this paper are as follows: (1) It considers the different demand types of inventory items and distinguishes them using a multi-dimensional inventory classification method, which makes the approach more applicable to actual data and allows the data generated by enterprise operations to be described effectively. (2) To mitigate the instability of the supply chain and of consumer demand in the big-data environment, it models the supply chain inventory control problem by combining reinforcement learning with deep learning, which addresses the "curse of dimensionality" that arises when traditional reinforcement learning is applied to supply chain decision-making. (3) Starting from actual data, data-driven inventory control can improve enterprises' data utilization and data mining capability, and provides theoretical support for managers in formulating inventory strategies and implementation methods. |
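
To make the Markov decision process formulation described above concrete, the following is a minimal sketch and not the model built in this paper. It assumes a single product at a single supply chain member, takes the inventory level as the state, the order quantity as the action, and the negative of holding plus shortage cost as the reward. All names and parameter values (`MAX_INVENTORY`, `HOLDING_COST`, `step`, `train`, and so on) are hypothetical, and a tabular Q-learning update stands in for the deep Q-network, which would replace the table with a neural network.

```python
import random

# Hypothetical toy parameters, chosen for illustration only (not from the paper).
MAX_INVENTORY = 10               # storage capacity; states are 0..MAX_INVENTORY
MAX_ORDER = 5                    # actions are order quantities 0..MAX_ORDER
HOLDING_COST = 1.0               # cost per unit left in stock at the end of a period
SHORTAGE_COST = 4.0              # cost per unit of unmet demand
DEMAND_VALUES = [0, 1, 2, 3, 4]  # equally likely demand levels


def step(inventory, order, demand):
    """Simulate one period: receive the order, serve demand, incur costs."""
    available = min(inventory + order, MAX_INVENTORY)
    sold = min(available, demand)
    next_inventory = available - sold
    cost = HOLDING_COST * next_inventory + SHORTAGE_COST * (demand - sold)
    return next_inventory, -cost  # reward is the negative inventory cost


def train(episodes=2000, horizon=30, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    # q[state][action]; a deep Q-network would approximate this table.
    q = [[0.0] * (MAX_ORDER + 1) for _ in range(MAX_INVENTORY + 1)]
    for _ in range(episodes):
        inventory = rng.randint(0, MAX_INVENTORY)
        for _ in range(horizon):
            if rng.random() < epsilon:
                order = rng.randint(0, MAX_ORDER)  # explore
            else:
                order = max(range(MAX_ORDER + 1), key=lambda a: q[inventory][a])
            demand = rng.choice(DEMAND_VALUES)
            next_inventory, reward = step(inventory, order, demand)
            # Q-learning target: immediate reward plus discounted best next value.
            target = reward + gamma * max(q[next_inventory])
            q[inventory][order] += alpha * (target - q[inventory][order])
            inventory = next_inventory
    return q


def greedy_policy(q):
    """Order quantity with the highest learned value for each inventory level."""
    return [max(range(MAX_ORDER + 1), key=lambda a: q[s][a])
            for s in range(MAX_INVENTORY + 1)]


if __name__ == "__main__":
    policy = greedy_policy(train())
    for level, order in enumerate(policy):
        print(f"inventory={level:2d} -> order={order}")
```

Running the script prints one order quantity per inventory level. The table is workable here only because the toy state space has eleven levels; with several products, demand types, and supply chain members, the number of state-action pairs grows too quickly for a table, which is the motivation for using a function approximator such as a deep Q-network.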