New materials are the cornerstone and forerunner of high technology, yet developing a material with traditional methods takes 20-30 years on average. Advances in big data, artificial intelligence, and high-performance computing have made data-driven research and development of new materials a reality. The data-driven approach is considered the fourth paradigm of materials research, and it can greatly shorten the development cycle and reduce the cost of materials. This thesis focuses on the application of machine learning and multi-objective optimization methods in materials data science. The main work and contributions are as follows:

1. To improve prediction accuracy and avoid overfitting on high-dimensional, small-sample materials data, we propose a feature selection method based on reinforcement learning (FSRL). We apply this method to a classification task on amorphous alloy materials and to a regression task on the mechanical properties of aluminum matrix composites.
(1) To address the combinatorial explosion caused by traversing all feature subsets in wrapper feature selection, the FSRL algorithm abstracts the wrapper feature selection process into an interaction between the machine learning model and the "environment". The reinforcement learning agent selects the feature to add to the current feature subset so as to maximize the currently computed reward. In addition, to address the low direct correlation between the features and the prediction target, we propose a method that constructs new features by symbolic transformation, and we apply the FSRL method to the constructed features for feature selection. Finally, the two methods are applied to amorphous alloy materials and aluminum matrix composites, respectively.
(2) In the data experiments on amorphous alloy materials, the classification
prediction accuracy of the k-nearest neighbors, decision tree, support vector machine, and random forest models improves after feature selection with the FSRL algorithm, with a maximum increase of 2.8%. In the data experiments on aluminum matrix composites, the feature construction method based on symbolic transformation is applied to the prediction of two mechanical properties, elongation and tensile strength. The prediction accuracy increases from 80.8% and 96.5% to 83.6% and 97.2%, respectively. After the FSRL algorithm is used to select features, the prediction accuracy reaches 86.2% (with 3 features selected) and 98.7% (with 6 features selected). These results verify the effectiveness of the method.

2. The material reverse design process is essentially a multi-objective optimization problem with constraints and preferences. In this thesis, we propose an improved Non-dominated Sorting Genetic Algorithm (Ⅰ-NSGA-Ⅱ). The algorithm is applied to the multi-objective optimization of concrete materials, negative expansion materials, and thermal barrier coatings to guide experimental design.
(1) First, the parameters to be optimized in the material reverse design process are usually subject to constraints. Common constraints are search step size constraints and component ratio constraints. The traditional Non-dominated Sorting Genetic Algorithm (NSGA-Ⅱ) usually obtains solutions that only approximately satisfy the constraints. For search step size constraints, we propose a linear-mapping encoding method. This method maps the search domains of the parameters onto search domains with the same step size, which turns the search into an unconstrained multi-objective optimization problem. For component ratio constraints, we propose a combined encoding method. This method limits the length of the chromosomes and the search step size of the genes in each chromosome. In this way, the generated offspring solutions all strictly satisfy the
constraints after decoding.
(2) Second, materials researchers often have expectations and preferences regarding material properties when designing new materials. In this thesis, we propose an NSGA-Ⅱ algorithm with preference (NSGA-ⅡP). It introduces the similarity between feasible solutions and preference vectors into the crowding distance calculation, so that the solution set gathers in the desired direction. Building on the two evaluation indicators generational distance (GD) and inverted generational distance (IGD), we further propose the mean cosine similarity (MCS) and the points in the sector (PIS) to evaluate the performance of the NSGA-Ⅱ.
(3) Finally, data experiments on concrete materials and negative expansion materials verify the effectiveness of the Ⅰ-NSGA-Ⅱ algorithm on multi-objective optimization problems with preferences and with component ratio constraints, respectively. A data experiment on thermal barrier ceramic coating materials verifies that the Ⅰ-NSGA-Ⅱ algorithm can solve multi-objective optimization problems with constraints and preferences at the same time. Compared with the weighted particle swarm algorithm, the Ⅰ-NSGA-Ⅱ algorithm requires no weight setting, runs efficiently, and obtains multiple solutions in a single run. Compared with the NSGA-Ⅱ algorithm, the Ⅰ-NSGA-Ⅱ algorithm obtains more preferred solutions with a smaller population.

3. Platform integration and engineering application of machine learning methods. Combining the FSRL algorithm, the Ⅰ-NSGA-Ⅱ algorithm, and common machine learning algorithms, we have built an integrated materials genome engineering platform that covers the whole process of material reverse design. The platform uses Vue.js, Flask, and other technologies to build the front-end and back-end. Through a micro-service architecture, the front-end passes parameters to the back-end, and the back-end passes them on to the corresponding computing server. In this
way, the back-end calls to the different services are unified. At the same time, the front-end, back-end, and computing server keep the service execution state synchronized through a task state query interface. The platform-based integration of material reverse design lowers the software barrier for materials researchers and facilitates data-driven research and development of new materials.
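
To illustrate the linear-mapping encoding for search step size constraints described in contribution 2(1), the following is a minimal Python sketch, not the thesis implementation. It assumes each parameter is given by a lower bound, an upper bound, and a step size, and that the optimizer works on integer genes with a unit step; the names `Variable`, `encode`, and `decode` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    """One design parameter with its own range and step (hypothetical type)."""
    low: float
    high: float
    step: float

    @property
    def n_levels(self) -> int:
        # Number of admissible grid values: low, low + step, ..., high.
        return int(round((self.high - self.low) / self.step)) + 1


def decode(genes, variables):
    """Map integer genes (uniform unit step) back to physical values."""
    values = []
    for gene, var in zip(genes, variables):
        # Clamp so that every gene decodes to a value inside the range.
        k = min(max(int(gene), 0), var.n_levels - 1)
        # Rounding only removes floating-point noise such as 0.15000000000000002.
        values.append(round(var.low + k * var.step, 10))
    return values


def encode(values, variables):
    """Map physical values that lie on the grid to integer genes."""
    return [int(round((x - var.low) / var.step))
            for x, var in zip(values, variables)]
```

Under these assumptions, every integer vector produced by crossover or mutation decodes to a point that lies exactly on each parameter's grid, so the step size constraint no longer needs to be checked during the search.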