| Quantitative Structure-Activity Relationship (QSAR) analysis, which investigates the quantitative relationship between molecular structural parameters and biological activities or other dependent properties, is one of the most important computational methods and a common technique in drug design. In recent years, QSAR has given great impetus to the development of synthetic organic chemistry, medicinal chemistry and drug design, and it has proved to be a powerful tool for correlating molecular structures with their physicochemical properties and bioactivities and for seeking reasonable interpretations. As an emerging branch of QSAR, Quantitative Structure-Spectrum Relationship (QSSR) analysis refers to the theoretical simulation, by QSAR methods, of spectral data obtained from instrumental analysis. However, owing to their complexity and diversity, spectral data are not simply linearly related to structure and are therefore difficult to predict and simulate correctly. In this context, this thesis investigates several kinds of spectral behavior of organic compounds and biomolecules using new molecular structural characterization methods. Based on the 2D information of molecular structure, the atomic electronegativity interaction vector (AEIV), the molecular electronegativity interaction vector (MEIV) and the molecular electronegativity interaction vector with hybridization (MEHIV) are applied and extended. Several modeling methods, including multiple linear regression (MLR), stepwise multiple linear regression (SMR), genetic algorithm (GA), partial least squares (PLS) regression and support vector machine (SVM), are used to establish QSSR models for organic compounds, and most of the resulting models are of comparable or superior quality to those reported in the literature. 
The main contents are as follows: ① Based on two-dimensional molecular topological structures, both the atomic electronegativity interaction vector (AEIV) and the atomic hybridization state index (AHSI) were developed to express the chemical microenvironment and the atomic hybridization state. By applying AEIV and AHSI to characterize a large number of equivalent carbon atoms in 42 acridone alkaloids and 24 quinolinones, multiple linear regression models were constructed to simulate the 13C nuclear magnetic resonance chemical shifts. The correlation coefficients of model estimation (R) and of leave-one-out cross-validation (RCV) are 0.957, 0.956 and 0.983, 0.981, respectively. When these descriptors are applied to characterize 375 equivalent resonant carbon atoms of 35 naphthalene derivatives, the correlation coefficient of model estimation (R), the correlation coefficient of leave-one-out cross-validation (RCV) and the root mean square error (RMS) are 0.951, 0.949 and 5.365, respectively. Strict statistical diagnostics confirm that the model is stable and predictive. ② Taking into account the effects of different hybridization states on atomic electronegativity, a novel electrotopological descriptor, the molecular electronegativity interaction vector with hybridization (MEHIV), was developed to describe the atomic hybridization state in different molecular environments. Five quantitative models based on MEHIV characterization and multiple linear regression were successfully established to predict the reduced ion mobility constants (K0) of alkanes, aromatic hydrocarbons, fatty alcohols, fatty aldehydes and ketones, and carboxylic esters. The correlation coefficients R were 0.915, 0.926, 0.978, 0.978 and 0.990, respectively, and the standard deviations SD were 0.044, 0.053, 0.042, 0.034 and 0.030, respectively. 
These results suggest that MEHIV is an excellent topological index descriptor with advantages such as a straightforward physicochemical meaning, high characterization ability, convenient extensibility and ease of use. ③ To further explore the applicability of MEHIV, it was also used to characterize the molecular structures of 190 doubly protonated peptides, which were correlated with their ion mobility spectrometry collision cross sections. A quantitative model was successfully developed by GA and PLS regression. The PLS regression model was subjected to rigorous internal and external validation, indicating that it is robust and predictive, with statistics on the training and test sets of R2=0.957, RMS=8.378, RCV2=0.954, Rpred2=0.978 and RMSpred=10.298. The results show that MEHIV correlates well with collision cross section, and that the relationship is mainly linear with a minor nonlinear component. ④ A novel electrotopological descriptor, the molecular electronegativity interaction vector (MEIV), was employed to study the retention behaviors of 100 polycyclic aromatic hydrocarbons (PAHs), 62 polychlorinated naphthalenes (PCNs), 117 nitrogen-containing polycyclic aromatic compounds (N-PACs) and 90 sulfur compounds, yielding good quantitative structure-retention relationship (QSRR) models with both R and RCV above 0.98. After thorough testing of estimation stability and generalization ability by both internal and external validation, MEIV is deemed adaptable to diverse molecular systems. ⑤ Considering recent developments in QSAR model validation and modeling methods, MEHIV was also used to characterize the molecular structures of 72 peptides and correlate them with their HPLC retention times (RT). The data set of peptides was divided into a training set of 52 samples and a test set of 20 samples. 
Stepwise multiple linear regression (SMR), genetic algorithm (GA), partial least squares (PLS) regression and support vector machine (SVM) were used to correlate molecular structure with retention time for comparative analysis. The good results show that MEHIV expresses the structures of the peptides well. The relationship between the retention behavior of the peptides and the MEHIV descriptors is found to be mainly linear, with a minor nonlinear component. |
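
The statistics quoted throughout the abstract (R, RCV from leave-one-out cross-validation, and RMS) can be illustrated with a minimal sketch. It assumes an ordinary least-squares MLR with an intercept, takes RCV to be the Pearson correlation between observed values and leave-one-out predictions, and takes RMS to be the root mean square fitting error; these definitions are assumptions, since the abstract does not spell them out. The descriptor matrix here is synthetic random data standing in for AEIV/MEHIV-style descriptors, and all function names are hypothetical.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least-squares fit with an intercept; returns the coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Predict responses for descriptor rows X using fitted coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def loo_predictions(X, y):
    """Leave-one-out: refit without sample i, then predict sample i."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        coef = fit_mlr(X[mask], y[mask])
        preds[i] = predict_mlr(coef, X[i:i + 1])[0]
    return preds


def mlr_statistics(X, y):
    """Return R (fit), R_CV (leave-one-out) and RMS (fitting error)."""
    fitted = predict_mlr(fit_mlr(X, y), X)
    cv = loo_predictions(X, y)
    return {
        "R": float(np.corrcoef(y, fitted)[0, 1]),
        "R_CV": float(np.corrcoef(y, cv)[0, 1]),
        "RMS": float(np.sqrt(np.mean((y - fitted) ** 2))),
    }


# Synthetic stand-in data (NOT from the thesis): 60 samples, 4 descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = X @ np.array([1.5, -2.0, 0.7, 0.3]) + 5.0 + rng.normal(scale=0.5, size=60)

stats = mlr_statistics(X, y)
print(stats)
```

An RCV close to R, as reported for the models above, indicates that no single sample dominates the fit; a large gap between the two would suggest overfitting or influential outliers.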