Identifying the chemical carriers of the aromatic infrared bands (AIBs) is crucial for understanding the evolution of the interstellar medium (ISM). After nearly forty years of research, polycyclic aromatic hydrocarbons (PAHs) are widely regarded as the primary carriers of AIBs. However, because PAH chemical structures are so diverse, the relationship between structure and spectral bands remains poorly understood, which makes it difficult to identify the specific carriers of AIBs. The rapid development of artificial intelligence and big data technologies in recent years offers new approaches to this challenge. In this thesis, a dataset was constructed from the NASA infrared (IR) database and self-computed theoretical spectra, with molecular fingerprints introduced to describe chemical structures. A data-driven machine learning (ML) model was trained to investigate the relationship between PAH structures and spectral bands in depth. Through ML feature importance analysis, this study traced the sources of 171 spectral bands across the IR region, identified the most likely molecular structure fragments corresponding to each band, and explained the correlations between some AIBs that had long gone unexplained. Rather than treating specific bands in isolation, as previous studies did, this work achieves source tracing for AIBs across the entire IR region, substantially improving the previously fragmentary understanding of the relationship between PAH structures and spectra. Specifically, the research content of this thesis is as follows:

1) The IR emission features of interstellar PAHs are influenced by their chemical structures, which often change as the ISM evolves. However, previous studies of AIBs mostly neglected this chemical evolution, so the PAH molecular configurations they considered were overly idealized. To address this issue, this thesis first
investigated the formation process of PAH molecules in the ISM through molecular dynamics simulations, in order to study the impact of chemical evolution on their IR emission features. The results indicate that interstellar PAHs formed through dynamic evolution have complex structures, with many molecules containing non-planar and non-condensed configurations. Additionally, we observed that fullerene molecules formed different chemical structures under varying hydrogen concentrations. Quantum chemical calculations showed that these “non-idealized” interstellar PAHs possess unique IR emission features. Based on these findings, PAHs with similar structures were added to the dataset used in the subsequent ML research, so that it covers the chemical structures present in interstellar space more comprehensively.

2) Furthermore, this thesis combines the IR spectral database developed by NASA Ames Research Center with theoretical spectra calculated with chemical evolution taken into account, forming a dataset of 14,124 spectra in total. We use extended-connectivity molecular fingerprints (ECFPs) as chemical structure descriptors and train a random forest (RF) model to trace the structural origin of each IR spectral band within the range of 2.761 to 1,172.745 microns. Comparison with the literature confirmed the accuracy of these results. The tracing results are consolidated in an appendix table, which serves as a reference tool for evaluating potential carriers of AIBs. The thesis also demonstrates how to use the information in the appendix table to trace certain spectral bands, such as the characteristic IR bands of nitrogen-containing PAHs and superhydrogenated PAHs. These results enhance our understanding of the relationship between PAH molecular structures and IR emission bands and provide a robust reference tool for AIB observations.

3) Finally, this thesis also demonstrates the use of feature importance analysis based on the RF algorithm to
study the physical correlation between different emission features by searching for common molecular fragments that give rise to different IR spectral bands. This thesis proposes a method for quantifying band correlation by measuring the similarity between the feature importance arrays of different bands. In this way, we obtain a comprehensive picture of the correlations between spectral bands in the mid- and far-IR ranges and explain the puzzling band correlations that have been observed.

In conclusion, this thesis introduces ML algorithms and ECFPs as a novel approach to studying the IR emission features of interstellar PAHs. This method overcomes the limitations of relying solely on quantum chemical calculations and significantly enhances our understanding of the relationship between PAH structures and IR spectral bands. The study provides a powerful analytical tool for tracing the origins of AIBs and highlights the potential of artificial intelligence in astronomical spectroscopic analysis.
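
The band-correlation idea in item 3) can be illustrated with a minimal sketch. It assumes each band is represented by an array of per-fragment feature importances (one value per fingerprint bit) and uses cosine similarity as the similarity measure. The choice of cosine similarity, the function names, the band labels, and the toy numbers are all assumptions made for illustration and are not taken from the thesis.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature importance arrays."""
    if len(a) != len(b):
        raise ValueError("importance arrays must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A band with no informative fragments is treated as uncorrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def band_correlation_matrix(importances):
    """Pairwise similarity for a dict mapping band label -> importance array."""
    bands = sorted(importances)
    return {
        (p, q): cosine_similarity(importances[p], importances[q])
        for p in bands
        for q in bands
    }


# Hypothetical importances over four fingerprint bits (illustrative values only).
toy_importances = {
    "band_A": [0.6, 0.3, 0.1, 0.0],
    "band_B": [0.5, 0.4, 0.1, 0.0],
    "band_C": [0.0, 0.0, 0.2, 0.8],
}

corr = band_correlation_matrix(toy_importances)
for (p, q), value in sorted(corr.items()):
    if p < q:
        print(f"{p} vs {q}: {value:.3f}")
```

In this toy input, band_A and band_B draw their importance from the same fragments and therefore score close to 1, while band_C depends on different fragments and scores close to 0 against both. Under this assumed measure, a high score would suggest that two bands share common structural carriers.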