Feature Extraction And Analysis Of Thermoelectric Materials Based On Machine Learning

Posted on: 2023-12-17
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L H Chen
Full Text: PDF
GTID: 1521306911994899
Subject: Electronic Science and Technology
Abstract/Summary:
Machine-learning-based big data analysis of materials can directly predict the physical and chemical properties that determine the performance of thermoelectric materials, which helps address the problems of limited computing resources and scarce experimental data. With the rapid development of machine learning and the advent of the big data era, building materials genome data and materials databases has become an inevitable trend. Materials science in a big data environment can greatly accelerate research on new thermoelectric materials. However, opening the black box between material properties and material features remains a major challenge for data science researchers: it requires classifying and reconstructing thermoelectric material data more rigorously and mining, from huge and complex thermoelectric datasets, the detailed microscopic information on electron arrangement, atomic species, molecular structure, and energy distribution contained in the material feature vector groups.

To further expand the scope of machine learning in computational materials science, it is necessary to construct various types of feature vector groups from multiple dimensions and to combine ensemble machine learning methods with the thermoelectric properties of materials, so that the Seebeck coefficient, electrical conductivity, and power factor can be predicted efficiently. Feature selection and feature extraction are performed on the multi-dimensional material features by combining the weight factors returned by the machine learning methods. A high-order composite machine learning system is constructed to carry out correlation analysis between material features in different material spaces and to reveal the hidden connections between the internal features of materials. The main results are as follows:

1. Calculate the Seebeck coefficient, electrical conductivity, and power factor of materials with n-type
and p-type carrier concentrations between 1×10^16 cm^-3 and 1×10^20 cm^-3 at temperatures from 100 K to 1300 K, using first-principles density functional theory software such as VASP, to accumulate initial thermoelectric performance data. At the same time, the energy band structure data obtained after parameter optimization and static relaxation are used as the initial energy distribution data of the material.

2. Element feature information, atomic composition information, and energy band structure information are used as the three main dimensions of feature extraction. The element features consist of element ratio information and periodic table position information. The element ratios directly reflect the elemental composition of the material, while the periodic table positions reflect the strength of the metallic and non-metallic character of the main group. The feature vector group built from the atomic composition information covers three aspects: the physical properties of the atoms themselves, weighted information on the extranuclear electrons, and information on the crystal structure formed by the atoms. The band structure information summarizes the characteristics that affect thermoelectric performance in terms of the position, complexity, and effective mass of the energy bands in the band structure diagram. Constructing material features along these three dimensions allows the relationships between material features to be analyzed comprehensively.

3. Compare and analyze the advantages and disadvantages of supervised, unsupervised, semi-supervised, and reinforcement learning in the computational analysis of materials. The ensemble machine learning method XGBoost, a boosted-tree method that returns feature weights usable for feature calibration, is
selected as the main machine learning method for calibrating the features of the thermoelectric material dataset. The Seebeck coefficient, electrical conductivity, and power factor are trained and predicted using the element feature information, atomic composition information, and energy distribution information as feature vector groups. The material features are calibrated with the feature weight factors returned by XGBoost, and a higher-level analysis is performed on the new feature vector group obtained after feature extraction.

4. Construct a multi-channel machine learning system (MMLS) to explore the relationships between the internal properties of thermoelectric materials in different dimensions. By predicting the thermoelectric properties of the material, the composite machine learning model extracts the data features of each dimension hierarchically, selects the most important features of each dimension, and passes them to the top-level analysis mode of the high-level machine learning system for discussion. In a big data environment, materials at the same temperature and carrier concentration can perform completely differently depending on their type and internal structure. A case-by-case mode should therefore be added to the top-level analysis mode so that the machine learning system remains applicable in a variety of situations. After multiple rounds of feature extraction, the maximum interlayer distance in the atomic information feature group has the highest weight factor, while the curvature of the energy band structure and the positions of its extreme points also have relatively high weight factors. Judging from where the band structure features fall on the energy band diagram, for the Fm3m and P4/nmm space groups the relationship between the energy band structure characteristics and the
maximum interlayer spacing shares commonalities, such as the difference between the position of the top of the first band and that of the second band. There are also differences, including the position of the top of the first valence band, the number of extreme points, and the degeneracy of the valence band. Among these, the band gap and the valence band degeneracy occupy similarly dominant positions, as reflected by the feature weights in the example, which verifies the accuracy of the MMLS results in analyzing the relationship between the energy band structure features and the intrinsic features of the material.

This thesis uses machine learning as the analysis method, the thermoelectric properties of materials as the bridge, and multi-dimensional, manually constructed features as the cornerstone for understanding the relationships between features. Through machine-learning-based feature extraction and analysis of thermoelectric materials, the physical relationships between different dimensions within a material can be explored in depth, and more advanced machine learning models can be built in the context of big data.
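
The weight-based selection step that the abstract describes, where each feature dimension passes its most important features up to a top-level analysis, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the feature names, the weight values, and the helpers `power_factor` and `select_top_features` are all hypothetical, and in practice the weights would come from a fitted boosted-tree model rather than being typed in by hand. The power factor uses the standard definition PF = S^2 * sigma.

```python
from typing import Dict, List


def power_factor(seebeck_v_per_k: float, conductivity_s_per_m: float) -> float:
    """Standard thermoelectric power factor PF = S^2 * sigma, in W m^-1 K^-2."""
    return seebeck_v_per_k ** 2 * conductivity_s_per_m


def select_top_features(weights: Dict[str, float], k: int) -> List[str]:
    """Return the k feature names with the largest weight factors."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical weight factors for three feature channels.
# The numbers are illustrative only and are not results from the thesis.
channel_weights = {
    "element": {"element_ratio": 0.6, "periodic_table_position": 0.4},
    "atomic": {
        "max_interlayer_distance": 0.5,
        "atomic_mass": 0.2,
        "valence_electrons": 0.3,
    },
    "band": {"band_gap": 0.4, "valence_band_degeneracy": 0.35, "band_curvature": 0.25},
}

# Keep the strongest feature of each channel for the top-level analysis.
top_level_features = {
    channel: select_top_features(weights, k=1)
    for channel, weights in channel_weights.items()
}

# Made-up inputs: S = 200 microvolts per kelvin, sigma = 1e5 S/m.
example_pf = power_factor(200e-6, 1e5)
```

With a trained model, the hand-written dictionaries would be replaced by the importances the model reports. For example, the scikit-learn style XGBoost regressor exposes a `feature_importances_` attribute after fitting. Whether the dissertation used that attribute or another importance measure is not stated in the abstract.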
Keywords/Search Tags: machine learning, thermoelectric performance, feature weight, correlation analysis