| Thermoelectric conversion technology is promising for waste-heat recovery and solid-state refrigeration because the Seebeck and Peltier effects enable direct conversion between heat and electricity. High-performance thermoelectric (TE) materials offer great potential for the development of energy storage and solid-state refrigeration devices. Traditional strategies for designing TE materials rely on the first three paradigms of science, i.e., empirical evidence, scientific theory, and computational science, and have yielded many important experimental and theoretical contributions. With the emergence of the fourth paradigm of scientific research, data-driven science, and the introduction of materials informatics, machine learning has gained attention in the thermoelectric community because it can construct statistical relationships among data and make predictions. This work built supervised, unsupervised, and semi-supervised models based on experimental and computational data to design new thermoelectric materials and investigate their thermoelectric performance. Supervised learning was used to establish a regression model for predicting the thermoelectric quality factor in order to accelerate the design of 1-2-2-type Zintl thermoelectric materials. Based on the three-band model, an advanced quality factor was derived that comprehensively considers carrier mobility, density-of-states effective mass, lattice thermal conductivity, band gap, and inter-band offset. The quality factor is proportional to the optimal zT value when the carrier concentration is optimized, which has been confirmed for materials with multiple bands. We collected 161 distinct compounds with quality factors from published papers on 1-2-2-type Zintl thermoelectric materials and generated 159 statistical descriptors from elemental parameters. Four algorithms were used to build models, and the gradient boosting decision tree algorithm performed best. The R² was improved from 
0.666 to 0.800 through feature engineering involving feature generation, Pearson correlation coefficient analysis, and recursive feature elimination, and the feature importances were subsequently evaluated. Finally, a series of AMg2(Bi,Sb)2-based compositions with quality factors greater than 7.5 were recommended, which are expected to reach high zT values once the carrier concentration is optimized. An iterative unsupervised machine learning strategy was proposed to discover and design a series of promising half-Heusler thermoelectric materials. A total of 456 half-Heusler compounds and 484 features describing thermoelectric performance were identified from the Materials Project database, among which 20 materials have been experimentally reported to be excellent TE materials. The unsupervised clustering technique grouped the 456 half-Heusler compounds into clusters based on the 484 generated features, and the model was iterated according to the clustering results of the 20 experimentally reported TE materials. Finally, 20 new potential thermoelectric materials were obtained. We experimentally optimized the thermoelectric performance of p-type and n-type ScNiSb and obtained single-phase samples of each. Peak zT values of ~0.5 at 925 K in p-type Sc0.7Y0.3NiSb0.97Sn0.03 and ~0.3 at 778 K in n-type Sc0.65Y0.3Ti0.05NiSb were achieved experimentally. Positive and unlabeled (PU) semi-supervised learning was designed to explore new thermoelectric materials. A total of 4610 inorganic materials and 231 features were obtained from the Materials Project. Reported thermoelectric materials, which received positive labels, were identified by matching the 4610 formulas against the titles of research papers retrieved from the Web of Science using the predefined keyword "thermoelectric". In this way, 958 thermoelectric materials were identified, and the remaining 3652 materials were treated as unlabeled samples. Bagging-based positive and unlabeled learning was used to build the model, and 64 potential thermoelectric 
materials were identified. The transport properties were obtained from high-throughput first-principles calculations, and the majority of the materials have a maximum power factor greater than 40 μW cm⁻¹ K⁻². Following the Cahill-Watson-Pohl model, the minimum lattice thermal conductivities were estimated and used to obtain the zT values. Eight n-type and 22 p-type materials have maximum theoretical zT values greater than 2. Specifically, a series of AX2 binary compounds, Al2Zn(Te,Se)4 and Zn(Ga(Te,Se)2)2 ternary compounds, and GaCuGeSe4 quaternary compounds deserve further investigation. |
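
The bagging-based positive and unlabeled learning step mentioned above can be illustrated with a minimal sketch. The abstract does not state which base learner, bag size, or number of rounds was used, so everything here is an assumption: the function name `pu_bagging_scores`, the nearest-centroid stand-in classifier, the bag size, and the toy two-feature data are all hypothetical and are not taken from the Materials Project. The idea shown is only the general scheme: repeatedly treat a bootstrap sample of unlabeled points as pseudo-negatives, fit a classifier against the known positives, and average the out-of-bag predictions for each unlabeled point.

```python
import random


def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pu_bagging_scores(positives, unlabeled, n_rounds=200, seed=0):
    """Average out-of-bag 'positive' vote for every unlabeled sample.

    Base learner (an assumption, chosen for brevity): nearest centroid.
    In each round, a bootstrap sample of the unlabeled set is treated as
    pseudo-negative; unlabeled samples left out of the bag are voted on.
    """
    rng = random.Random(seed)
    bag_size = min(len(positives), len(unlabeled))  # assumed bag size
    pos_center = centroid(positives)
    votes = [0.0] * len(unlabeled)
    counts = [0] * len(unlabeled)

    for _ in range(n_rounds):
        # Bootstrap (sampling with replacement) from the unlabeled pool.
        bag = [rng.randrange(len(unlabeled)) for _ in range(bag_size)]
        in_bag = set(bag)
        neg_center = centroid([unlabeled[i] for i in bag])

        # Score only the out-of-bag unlabeled samples.
        for i, x in enumerate(unlabeled):
            if i in in_bag:
                continue
            counts[i] += 1
            if sq_dist(x, pos_center) < sq_dist(x, neg_center):
                votes[i] += 1.0

    return [v / c if c else 0.0 for v, c in zip(votes, counts)]


# Synthetic toy data: two made-up features per "material".
positives = [[1.0, 1.1], [0.9, 1.0], [1.1, 0.9], [1.0, 0.8]]
unlabeled = [
    [1.0, 1.0], [0.95, 1.05],                  # hidden positives
    [-1.0, -1.0], [-0.9, -1.1], [-1.1, -0.9],  # likely negatives
    [-1.0, -1.2], [-0.8, -1.0], [-1.2, -1.0],
]

scores = pu_bagging_scores(positives, unlabeled)
ranked = sorted(range(len(unlabeled)), key=lambda i: scores[i], reverse=True)
print("unlabeled indices ranked by score:", ranked)
```

In a real screening workflow, the highest-scoring unlabeled samples would be the candidates passed on to further calculation. A tree-based or kernel base learner would be a more typical choice than the nearest-centroid stand-in used here.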