| Perovskite solar cells have attracted wide attention because of their high photovoltaic conversion efficiency and low manufacturing cost. The bandgap of the halide hybrid perovskite absorber is a key factor affecting solar cell efficiency, so rapid and accurate bandgap tuning methods are important for further improving the efficiency of perovskite solar cells. Meanwhile, the toxicity and stability issues of lead-containing hybrid perovskites hinder their large-scale commercial application. Traditional trial-and-error experiments and first-principles calculations for bandgap tuning and new-material screening are time-consuming and costly. Predicting and discovering lead-free halide perovskite materials in a data-driven way is therefore of great significance for the long-term development of perovskite solar cells. Accordingly, this study proposes machine learning methods for rapid bandgap tuning of halide hybrid perovskites and for rapid screening of lead-free halide double perovskite materials with stable structure, suitable bandgap, and high ductility. The specific research results are as follows: (1) An interpretable machine learning strategy was constructed to achieve rapid bandgap prediction and target-composition screening for halide hybrid perovskites. Combining feature engineering, the gradient boosting regression tree (GBRT) algorithm, and a symbolic regression algorithm (GASR), a high-precision bandgap prediction model for hybrid perovskites, GBRT-P, was established. On the independent test set, the root-mean-square error (RMSE) was 0.059 eV and the coefficient of determination R² was 0.99. Correlation analysis and GBRT feature-importance ranking revealed the physical features with a significant impact on the bandgap, including the electronegativity difference between the B-site and X-site, the electron affinity difference, and the octahedral
factor. The GASR algorithm was also used to mine a quantitative formula for the bandgap, which was=-2+0.881-, with a test-set accuracy of RMSE = 0.084 eV. Using GBRT-P as the screening model, candidate compositions were screened in two target ranges: the ideal bandgap range with the highest theoretical conversion efficiency (1.3~1.4 eV) and the wide-bandgap range (1.7~2.1 eV) with potential application in tandem perovskite solar cells. New compositions with the target bandgaps were then successfully prepared experimentally, including MA0.23FA0.02Cs0.75Pb0.59Sn0.41Br0.24I2.76 (predicted value 1.34 eV, experimental value 1.39 eV) and MA0.2FA0.32Cs0.48Pb0.82Sn0.18Br1.83I1.17 (predicted value 1.72 eV, experimental value 1.74 eV), which demonstrated the reliability of the interpretable machine learning strategy. (2) A rapid multi-property screening method for lead-free halide double perovskite materials was realized by combining transfer learning with first-principles density functional theory (DFT) calculations. The parameters of a deep neural network model trained with formation energy as the target variable were used as the initial parameters of the energy above hull (Ehull) and bandgap models for transfer learning, which improved the accuracy of those models. To address the sparse, small-data problem for the bulk modulus (B) and shear modulus (G), transfer learning was likewise used to construct models from small data sets; the independent test set accuracies were R² = 0.78 and R² = 0.82, respectively. Finally, based on the G/B criterion for ductility, multi-property screening with the above models identified 54 candidate perovskite materials with stable structure, suitable bandgap, and high ductility from a huge composition space. Cs2CuIrF6, which had the highest predicted ductility, was selected for DFT verification. The Ehull calculation showed that it was relatively unstable, but its Perdew-Burke-Ernzerhof (PBE) functional bandgap was 1.06 eV and its G/B
was 0.26, indicating a suitable bandgap and high ductility. |
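
The multi-property screening step described in result (2) can be illustrated with a minimal Python sketch. It is not the study's implementation: the abstract does not state the numerical cutoffs it used, so the Ehull limit and the bandgap window below are assumptions chosen only for illustration, and the G/B limit of 0.57 is the commonly quoted form of Pugh's criterion (a lower G/B suggests a more ductile material). The `Candidate` class, the `screen` function, and the placeholder entries are hypothetical; in practice the property values would come from the trained Ehull, bandgap, B, and G models.

```python
from dataclasses import dataclass
from typing import List

# Assumed thresholds, NOT taken from the source abstract.
EHULL_MAX_EV_PER_ATOM = 0.05    # assumed stability cutoff
BANDGAP_WINDOW_EV = (0.9, 1.6)  # assumed "suitable bandgap" window
G_OVER_B_MAX = 0.57             # Pugh's criterion: G/B below ~0.57 suggests ductility


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container for model-predicted properties of one material."""
    name: str
    ehull: float          # energy above hull, eV/atom
    bandgap: float        # eV
    shear_modulus: float  # G, GPa
    bulk_modulus: float   # B, GPa

    @property
    def g_over_b(self) -> float:
        return self.shear_modulus / self.bulk_modulus


def passes_screen(c: Candidate) -> bool:
    """True if the candidate meets all three criteria at once."""
    low, high = BANDGAP_WINDOW_EV
    stable = c.ehull <= EHULL_MAX_EV_PER_ATOM
    suitable_gap = low <= c.bandgap <= high
    ductile = c.g_over_b < G_OVER_B_MAX
    return stable and suitable_gap and ductile


def screen(candidates: List[Candidate]) -> List[Candidate]:
    """Keep passing candidates, most ductile (lowest G/B) first."""
    return sorted(filter(passes_screen, candidates), key=lambda c: c.g_over_b)


# Placeholder entries with made-up values, for illustration only.
demo = [
    Candidate("placeholder_A", ehull=0.01, bandgap=1.2, shear_modulus=10.0, bulk_modulus=40.0),
    Candidate("placeholder_B", ehull=0.20, bandgap=1.2, shear_modulus=10.0, bulk_modulus=40.0),
    Candidate("placeholder_C", ehull=0.02, bandgap=2.5, shear_modulus=10.0, bulk_modulus=40.0),
    Candidate("placeholder_D", ehull=0.00, bandgap=1.0, shear_modulus=30.0, bulk_modulus=40.0),
    Candidate("placeholder_E", ehull=0.03, bandgap=1.4, shear_modulus=20.0, bulk_modulus=50.0),
]

selected = screen(demo)
for c in selected:
    print(f"{c.name}: Ehull={c.ehull} eV/atom, Eg={c.bandgap} eV, G/B={c.g_over_b:.2f}")
```

Sorting by G/B mirrors the abstract's choice of the candidate with the highest predicted ductility for DFT verification. As the Cs2CuIrF6 result shows, a candidate that passes a model-based screen can still turn out to be unstable once Ehull is computed directly, so a filter of this kind narrows the search rather than replacing verification.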