Research On Identification And Classification Of Variable Sources Based On Machine Learning

Posted on: 2022-11-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: T T Xu
GTID: 1480306755993279
Subject: Astronomy
Abstract/Summary:
The identification and analysis of variable sources is an active topic in astrophysical research. It matters not only for understanding stellar physics but also for studying the overall structure and evolution of the Milky Way. The Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST), a large scientific facility in China, has accumulated a massive spectral data set comprising low-resolution and some medium-resolution spectra. However, these data carry little information about source variability, which limits LAMOST-based studies of variable sources. Although variable source identification based on LAMOST data has made good progress in recent years, clear deficiencies remain in the accuracy of the assigned variable types and in the number of variable sources identified. Over the past decade, large international time-domain survey programs have brought new opportunities for identifying and analyzing LAMOST variable sources, and cross-matched analysis of data from multiple surveys has become an important way to do so. The Zwicky Transient Facility (ZTF), a new-generation time-domain survey, has accumulated billions of light curves since it began official operation, and its large field of view and faint limiting magnitude are advantages for identifying and classifying variable sources. In addition, the high-precision photometry provided by the Gaia space telescope supplies important parameters for variable star studies and, combined with light curves, can effectively improve the classification accuracy of variable stars.

The core objective of this dissertation is the identification and classification of LAMOST variable sources. The main idea is to combine observations from the LAMOST, ZTF, and Gaia surveys. The research focuses on three parts: the construction of the variable source identification model, the identification and analysis of LAMOST stellar variable source candidates, and the classification of variable sources based on machine learning.

(1) Construction and evaluation of the identification model for variable sources: A reliable sample set is first constructed from variable sources in Kepler and non-variable sources in SDSS, and the corresponding light curves are obtained by cross-matching with ZTF. Then, 10 reliable variability parameters are selected for statistical modeling based on the light curve data, and the resulting variability parameter models are comprehensively evaluated and analyzed. Finally, the optimal variability parameter model is obtained for the subsequent identification and analysis of variable sources. The bias of the constructed sample set, the nature and correlation of the selected variability parameters, and the advantages of the identification method are also discussed in detail.

(2) Identification and analysis of LAMOST stellar variable source candidates: First, the low-resolution data in LAMOST DR6 are cross-matched with the ZTF DR2 light curve data under certain conditions to obtain the light curves of the high-quality LAMOST sources. Then the LAMOST stellar variable candidates are identified with the optimal variability parameter model obtained in part (1). Finally, a catalog of LAMOST variable sources is obtained, including 631,769 variable source candidates with a probability greater than 95%. The catalog is then verified by cross-matching with other variable source catalogs: a pairwise cross-comparison with the Gaia catalog and other published variable source catalogs gives correct rates ranging from 50% to 100%. Among the 123,756 cross-matched sources, our variable source catalog identifies 85,669, a correct rate of 69%. These results indicate that the variable source catalog presented in this study is credible. An analysis of the types recovered through this catalog cross-matching shows that the identification model can effectively identify the main types of variable stars.

(3) Classification of variable sources based on machine learning: A reliable sample set containing 10 subcategories of pulsating variable stars, rotating variable stars, and eclipsing binaries is first constructed from the periodic variable source catalog published from ZTF data. Features are then extracted from the light curve data, and the sample is cross-matched with Gaia photometric parameters to obtain additional key information that characterizes the variable sources. Two classification models, Random Forest (RF) and XGBoost, are the main algorithms applied to the variable source data. A comparison with other models and related studies shows that RF and XGBoost perform better in variable source classification and can substantially improve the classification accuracy for variable source subclasses. The RF and XGBoost models are also applied to predict the classes of the sources in the LAMOST stellar variable source catalog. Finally, a catalog containing 294,108 variable sources with plausible type labels is obtained at a prediction probability greater than 0.391 for subsequent studies.

In summary, a reliable identification model for variable sources is constructed and applied to the LAMOST survey, a credible catalog of stellar variable source candidates is obtained, and different machine learning algorithms are applied to the classification of variable sources. The proposed identification method is easy to compute and is a general method applicable to different variable source types, which will benefit the identification and analysis of variable sources from more large time-domain astronomical facilities in the future. The final LAMOST stellar variable source candidate catalog also provides data support for subsequent analysis and for mining special objects among variable sources. In addition, as astronomical data continue to grow, the feature extraction method and machine learning classification models used in this dissertation provide a reliable approach to identifying the categories of variable sources.
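
As a rough illustration of what a light-curve variability parameter is, here is a minimal sketch in Python. The abstract does not name the ten parameters it selects, so the three indices below (reduced chi-square about the weighted mean, the von Neumann ratio, and the median absolute deviation) are assumptions chosen because they are widely used, not the dissertation's actual set. The function names and the synthetic light curves are hypothetical.

```python
import math
import random
import statistics


def weighted_mean(mags, errs):
    """Inverse-variance weighted mean magnitude."""
    weights = [1.0 / e ** 2 for e in errs]
    return sum(w * m for w, m in zip(weights, mags)) / sum(weights)


def reduced_chi2(mags, errs):
    """Chi-square about the weighted mean, per degree of freedom.
    Close to 1 for a constant source with correctly estimated errors."""
    mean = weighted_mean(mags, errs)
    chi2 = sum(((m - mean) / e) ** 2 for m, e in zip(mags, errs))
    return chi2 / (len(mags) - 1)


def von_neumann_ratio(mags):
    """Mean squared successive difference divided by the sample variance.
    Near 2 for uncorrelated noise, much smaller for smooth variations."""
    n = len(mags)
    variance = statistics.variance(mags)
    delta2 = sum((mags[i + 1] - mags[i]) ** 2 for i in range(n - 1)) / (n - 1)
    return delta2 / variance


def median_abs_dev(mags):
    """Median absolute deviation: a scatter measure robust to outliers."""
    med = statistics.median(mags)
    return statistics.median([abs(m - med) for m in mags])


# Synthetic light curves (illustrative values only, not survey data):
# 200 epochs at a 0.5-day cadence with 0.02 mag Gaussian noise.
rng = random.Random(0)
times = [0.5 * i for i in range(200)]
errs = [0.02] * len(times)
const_mags = [15.0 + rng.gauss(0.0, 0.02) for _ in times]
var_mags = [
    15.0 + 0.3 * math.sin(2.0 * math.pi * t / 20.0) + rng.gauss(0.0, 0.02)
    for t in times
]

for label, mags in (("constant", const_mags), ("variable", var_mags)):
    print(
        label,
        round(reduced_chi2(mags, errs), 2),
        round(von_neumann_ratio(mags), 2),
        round(median_abs_dev(mags), 3),
    )
```

In this toy setup the sinusoidal source separates from the constant one on all three indices. A statistical model over several such parameters, as the abstract describes, would combine them into a single variability probability.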
Keywords/Search Tags:Variable Source, Variability Parameter, Statistical Modeling, Light Curve, Identification, Feature Extraction, Machine Learning, Classification