| Tobacco leaf is the basic raw material of cigarette production, and its intrinsic attributes affect the quality of cigarette products. Classifying leaf by origin, leaf position, and grade plays a key role in tobacco purchasing and quality management. In cigarette production, it is important to bring out the distinctive features of the tobacco through its smoke flavor characteristics. As a green, non-destructive technology, near-infrared (NIR) spectral analysis has developed rapidly and has been widely used in the tobacco industry in recent years. Because NIR spectra have overlapping peaks, a large number of wavelength points, and a low signal-to-noise ratio, extracting their intrinsic characteristics requires chemometric methods. Identifying the producing area of tobacco therefore depends on several key techniques, including spectral dimensionality reduction, selection and optimization of wavelength variables, and pattern recognition, which in turn support the evaluation of tobacco leaf quality and the improvement of cultivation practices. To achieve these research goals, this paper does the following: (1) Based on a review of the literature, it surveys the current state of dimensionality reduction algorithms and pattern recognition algorithms for NIR spectra, and introduces the research significance and main contents of the paper. (2) It briefly introduces chemometric methods and molecular spectroscopy techniques, describes the principles of diffuse reflection and the selection of the spectral region, outlines the process and methods of qualitative identification by NIR spectroscopy, and describes the implementation of the pattern recognition methods and the evaluation of NIR models. (3) To address variable optimization for tobacco NIR spectra, it studies the problem from two angles: dimensionality reduction based on variable transformation and dimensionality reduction based on variable screening. The experimental results show that both approaches can extract the classification information of the samples. The SPA-LDA algorithm captures the classification information and correctly classifies all test samples using only two wavenumbers. (4) To address the classification of tobacco origins, it uses an improved KNN algorithm to classify unknown tobacco samples, overcoming a limitation of the traditional KNN algorithm in classification. The results indicate that the improved KNN algorithm increases classification accuracy. |
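
The abstract does not state how the KNN algorithm was improved. As a purely illustrative sketch, the code below contrasts plain majority-vote KNN with inverse-distance-weighted voting, which is one common variant. The function name `knn_predict`, the weighting scheme, and the toy two-feature samples (standing in for absorbances at two selected wavenumbers) are all assumptions made for illustration, not the paper's method or data.

```python
import math
from collections import defaultdict


def knn_predict(train, query, k=3, weighted=False):
    """Classify `query` from `train`, a list of (features, label) pairs.

    weighted=False: plain majority vote among the k nearest neighbours.
    weighted=True: each neighbour votes with weight 1 / (distance + eps),
    an assumed illustrative variant, not the paper's algorithm.
    """
    eps = 1e-9  # avoids division by zero when a neighbour matches exactly
    # Keep the k training samples closest to the query (Euclidean distance).
    neighbours = sorted(
        (math.dist(features, query), label) for features, label in train
    )[:k]
    votes = defaultdict(float)
    for distance, label in neighbours:
        votes[label] += 1.0 / (distance + eps) if weighted else 1.0
    # Return the label with the largest total vote.
    return max(votes, key=votes.get)


# Hypothetical toy data: two features per sample, two origins "A" and "B".
train = [
    ((0.1, 0.0), "A"),
    ((1.0, 0.0), "B"),
    ((0.0, 1.1), "B"),
    ((3.0, 3.0), "A"),
]
query = (0.0, 0.0)

plain_label = knn_predict(train, query, k=3)
weighted_label = knn_predict(train, query, k=3, weighted=True)
```

In this toy case the two farther "B" neighbours outvote the single very close "A" neighbour under plain voting, while distance weighting lets the close neighbour dominate. It shows only why a modified voting rule can change the predicted origin.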