In recent years, machine learning has been widely adopted in the biopharmaceutical industry. Culture medium is a cornerstone of biopharmaceutical production, so machine-learning-based methods for optimizing culture medium formulations have both scientific significance and practical application prospects. To improve the speed and effectiveness of culture medium formulation optimization, this thesis applies machine learning to real culture medium formulation data sets and establishes a complete optimization solution that combines several prediction models with optimization methods. The main work of this thesis is as follows:

1. Research on five prediction models for culture medium formulations. To address the low accuracy and narrow application scenarios of existing models, the prediction models proposed in this thesis apply to various types of culture medium formulations, and the highest R-Square achieved on the related tasks is 0.85. SVR serves as the benchmark model and protein expression amount as the target task. The CatBoost regression model, used mainly for small-sample data sets, improves R-Square by up to 7.2%. The transfer learning model, which transfers a data-rich source task to a data-sparse target task, improves R-Square by up to 33.5%. The Transformer regression model, suited to scenarios with large amounts of unevenly distributed data, improves R-Square by up to 4.2%. In addition, this thesis proposes an improved PLE model based on a residual structure and Focal L1 Loss. It is designed for multi-task learning scenarios and effectively balances the relationship between the prediction tasks. The R-Square values of its three tasks, namely cell viability, the integral of viable cell concentration, and protein expression amount, improve by 2.2%, 5.9%, and 3.8%, respectively.

2. Research on optimization methods for culture medium formulations. To address the long running time and insufficient global search ability of existing optimization methods, this thesis proposes an improved lightning search algorithm based on chaotic sequences and the conjugate gradient method for single-objective optimization, and a multi-objective particle swarm optimization method for multi-objective optimization. In addition, this thesis combines the two optimization methods with the five prediction models to optimize culture medium formulations on different data sets. Experimental results show that the average error between the measured and predicted evaluation indexes of the optimized formulations is within 10%. The three evaluation indexes, namely cell viability, the integral of viable cell concentration, and protein expression amount, increase by 8.7%, 23.2%, and 16.8%, respectively, compared with the control culture medium formulation.

The experiments show that the five prediction models and two optimization methods proposed in this thesis achieve good results in their respective task scenarios and contribute to culture medium formulation optimization and to the subsequent development of the biopharmaceutical industry.
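The two evaluation quantities quoted in this abstract, R-Square and the average error between measured and predicted indexes, can be illustrated with a minimal sketch. The function names `r_squared` and `mean_relative_error` and the toy values are hypothetical and chosen only for illustration. The abstract does not state the exact error definition the thesis uses, so the mean relative error below is an assumption.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_relative_error(measured, predicted):
    """Average of |measured - predicted| / |measured| (assumed definition)."""
    errors = [abs(m - p) / abs(m) for m, p in zip(measured, predicted)]
    return sum(errors) / len(errors)


# Toy values for illustration only, not data from the thesis.
measured_titer = [100.0, 200.0]
predicted_titer = [90.0, 220.0]
print(r_squared(measured_titer, predicted_titer))
print(mean_relative_error(measured_titer, predicted_titer))
```

Under this assumed definition, a formulation would meet the "within 10%" criterion when `mean_relative_error` returns at most 0.10.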