| In recent years, people have become increasingly concerned about the quality of tobacco products, and practical quality control now demands analysis of the chemical constituents that determine the intrinsic quality of tobacco. Analytical researchers are no longer confined to one or a few chemical constituents: they must analyze a large number of samples for many constituents using a variety of instruments. Conventional chemical analysis of this kind is complicated and time-consuming, and it cannot meet the needs of product development and quality control. A rapid and efficient method for determining chemical constituents is therefore needed. In this study, near infrared spectroscopy was applied to the rapid quantitative analysis of four constituents. Our research group collected more than 400 tobacco samples from several provinces of the country. The spectra were measured and the corresponding contents of reducing sugar, total sugar, total nitrogen and nicotine were determined, establishing a near infrared database of tobacco. On this basis, near infrared quantitative analysis models were established. However, because the constituents correlate with the near infrared spectra to different degrees, the accuracy of the analytical method differs between constituents, and an in-depth study is required to establish the most effective analytical model by optimizing the modeling conditions. To obtain a robust, low-error and applicable model, our research addressed the following aspects: 1) The influence of spectral pretreatment and the number of samples on model predictive performance. A comparison of various approaches showed that the first and second derivatives are the best spectral pretreatments. 
The number of samples should be as large as possible, although an excess of samples may increase model error. 2) Comparison of local and global models. The predictive results of local models are more reliable than those of global models. When some characteristics of the samples are known, for example that the principal component space of redried tobacco samples is narrower than that of dried tobacco samples, or that the two occupy clearly separable regions of the principal component space, establishing a local model gives better predictive results. 3) Ensemble modelling. Through ensemble modelling, the standard deviation of each predicted result can be calculated, providing a useful way to evaluate the accuracy of the quantitative analysis models. 4) The relation between outlier detection and model performance. A bootstrap-based method for outlier detection was applied to obtain a robust model. 5) Model updating. By expanding the principal component space of the samples, the updated model achieves a lower predictive error than the original model. 6) Model transfer. Three model transfer approaches were compared, and they showed no significant differences when applied to the tobacco data set, because the spectra measured on the master and slave instruments for the same sample do not differ greatly. |
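
As an illustration of aspects 1) and 3), the following is a minimal sketch of first-derivative pretreatment followed by ensemble modelling, in which the spread of the sub-model predictions gives a per-sample standard deviation. It rests on several assumptions that are not taken from the study: the data are synthetic (not the tobacco database), the derivative is a simple first difference, the sub-models are trained on bootstrap resamples, and a closed-form ridge regression stands in for whatever calibration method the study actually used. All function names are hypothetical.

```python
import numpy as np

def first_derivative(spectra):
    """First-difference derivative along the wavelength axis (one row per spectrum)."""
    return np.diff(spectra, n=1, axis=1)

def fit_ridge(X, y, alpha=1e-3):
    """Closed-form ridge regression with intercept (assumed stand-in calibration model)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return coef, y_mean - x_mean @ coef

def ensemble_predict(X_train, y_train, X_new, n_models=50, seed=0):
    """Fit sub-models on bootstrap resamples; return mean and std of their predictions."""
    rng = np.random.default_rng(seed)
    n = X_train.shape[0]
    preds = np.empty((n_models, X_new.shape[0]))
    for m in range(n_models):
        idx = rng.integers(0, n, size=n)  # sample rows with replacement
        coef, intercept = fit_ridge(X_train[idx], y_train[idx])
        preds[m] = X_new @ coef + intercept
    return preds.mean(axis=0), preds.std(axis=0, ddof=1)

# Synthetic spectra: one absorption band scaled by the constituent content,
# plus a constant baseline offset per spectrum and a little noise.
rng = np.random.default_rng(42)
n_samples, n_wavelengths = 200, 40
wavelengths = np.linspace(0.0, 1.0, n_wavelengths)
content = rng.uniform(1.0, 5.0, size=n_samples)
band = np.exp(-((wavelengths - 0.5) ** 2) / 0.01)
baseline = rng.normal(0.0, 1.0, size=(n_samples, 1))
noise = rng.normal(0.0, 0.01, size=(n_samples, n_wavelengths))
spectra = content[:, None] * band + baseline + noise

# The first difference removes the constant baseline before calibration.
X = first_derivative(spectra)
mean_pred, std_pred = ensemble_predict(X[:160], content[:160], X[160:])
rmsep = float(np.sqrt(np.mean((mean_pred - content[160:]) ** 2)))
print("RMSEP:", rmsep)
print("largest per-sample prediction std:", float(std_pred.max()))
```

In this sketch a large standard deviation for a given sample indicates that the sub-models disagree about it, which is the sense in which ensemble modelling can flag less reliable predictions.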