| Near-infrared (NIR) spectra carry abundant sample information. They are highly stable and easy to collect; in particular, NIR diffuse reflectance analysis requires no chemical pretreatment. NIR spectroscopy is therefore regarded as a rapid, non-destructive and non-polluting analytical method. Owing to the wide application of computer technology and the continuous development of chemometrics, NIR analysis is becoming increasingly popular in many fields. However, an excellent NIR model requires a reasonable standard, a stable and accurate spectrometer, accurate measurement of sample component concentrations and experienced scientists. These requirements are difficult for ordinary organizations to meet. Standard NIR spectral information centers are therefore necessary to promote the application of NIR analytical results. In this work, apples were selected as the target of interest, and spectral matching algorithms (SMA) for apple NIR spectra were studied. A new spectral matching algorithm with full spectra (SMA-FS) based on the Jaccard similarity coefficient (JSC) was established. This is the first exploration of a spectral matching method based on the shape of the curve. In addition, an automatic peak detection algorithm was studied. A prototype apple NIR SDBS was developed to test the newly explored algorithms. The main contents and results are summarized as follows: (1) An automatic peak detection algorithm was proposed and its parameters were optimized. Commonly used spectral smoothing algorithms usually cause large deformation in peak bands and large bias in peak parameters, and cannot meet the demands of automatic peak recognition. Thus, a new spectral smoothing algorithm based on weighted spectral data points was proposed. 
In this algorithm, the center data point of a sliding window of fixed width was weighted according to the fluctuation frequency within the window. All weights were normalized to the range 0 to 1, where large weights indicate low noise levels and small weights indicate high noise levels. The optimal weight threshold (WT) and window width (WW) were explored. The results showed that the noise content of the smoothed spectra did not change significantly when WT was larger than 0.5, and that the peak bands were best preserved when WW equaled 21. A peak width threshold (Tpw) and a peak shape threshold (Tps) were adopted to filter out pseudo peaks (PP) that were flat or narrow. In total, 20 levels of Tpw from 3 to 41 were tested, and the results indicated that all narrow PP were eliminated when Tpw reached 29. Following this, a Tps of 0.005 was used to filter out flat PP. The effect of resolution on peak detection was also studied. Seven levels of resolution from 2 to 128 cm⁻¹ were tested, and the results showed that a resolution in the range of 32 to 64 cm⁻¹ was ideal for peak detection. It was concluded that apple NIR spectral peaks could be detected automatically at a resolution of 32 cm⁻¹ or 64 cm⁻¹, with WW = 21, WT = 0.7, Tpw = 29 and Tps = 0.005. The recognition rate was 100% for the peak at 5150 cm⁻¹ and 99.50% for the peak at 6900 cm⁻¹. (2) SMA-P, spectral matching based on peak information, was studied for apple NIR spectra. The number of peaks, peak positions, peak areas and peak shapes were used as the spectral matching indexes. The ability of SMA-P to distinguish between different spectra using peak information was validated. In total, 400 samples from 4 classes (100 in each) were selected as the reference group, and 5 samples were randomly selected from each class as the target group. The results indicated that different spectra could be distinguished from each other by their peak width and peak area. Based on this conclusion, classification tests were carried out with peak width and peak area. 
The classification accuracies were 47.25% and 55.00%. It was therefore concluded that SMA-P could not be applied to sample classification in the apple NIR SDBS. (3) SMA-FS for apple NIR spectra was studied. Conventional SMA-FS methods, including absolute distance (AD), square derivative (SD), Euclidean distance (ED), correlation coefficient (CC) and spectral angle (SA), were used to distinguish different samples. The test data introduced in (2) were also used here, and the results indicated that all five methods could distinguish different spectra accurately. The classification accuracies were 65.50%, 66.00%, 73.00%, 64.75% and 62.75%, which were much higher than those of SMA-P. However, these algorithms were still unable to meet the requirements of the SDBS because their accuracies were not high enough. This might be because the conventional SMA-FS methods relied too much on absolute spectral intensity. Hence, a new SMA-FS based on the JSC (SMA-JSC) was proposed to match spectra by the shape of the spectral curves. In this method, the monotonicities of the spectral curves within the same bands were compared to measure the similarity among different spectra. The experimental results showed that the newly proposed algorithm could distinguish different spectra accurately, with a calibration accuracy of 94.50% and a validation accuracy of 95.00%. Further verification was carried out with another data set containing 300 samples from 3 classes (100 in each); the calibration accuracy was 93.67% and the validation accuracy was 93.33%. The combined data from these two batches were also tested, giving a calibration accuracy of 94.14% and a validation accuracy of 94.29%. A comparison was made between SMA-JSC and discriminant analysis (DA). The average classification accuracies between two varieties reached 98.60% (raw spectra), 95.90% (first derivative) and 96.30% (second derivative). 
However, as the number of classes increased, the accuracies declined rapidly, and the classification accuracy over all seven classes of samples dropped to 88.00%, 56.40% and 58.40%. It was concluded that SMA-JSC was highly accurate and stable, and far superior to the conventional SMA-FS methods and the DA method. SMA-JSC is the best choice for the classification task in the apple NIR SDBS. (4) Standards for spectra that can be uploaded to the apple NIR SDBS were drawn up. These standards were based on previous research results, professional knowledge and our own conclusions, and mainly involve four factors: sample pretreatment, spectrometer, spectrometer configuration and experimental environment. This work provides a foundation for the reliability and reference value of the information stored in the apple NIR SDBS. Finally, an apple NIR SDBS prototype system was developed based on the above research. |
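
The shape-based matching idea in (3) can be illustrated with a minimal sketch. The abstract does not give the exact SMA-JSC formulation, so everything below is an assumption made for illustration: monotonicity is encoded as the sign of the difference between adjacent data points, the Jaccard similarity coefficient is computed on sets of (interval, direction) pairs, and classification picks the label of the most similar reference spectrum. The names `monotonicity_set`, `jaccard_similarity` and `classify` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple


def monotonicity_set(spectrum: Sequence[float], tol: float = 0.0) -> Set[Tuple[int, int]]:
    """Encode a spectrum as a set of (interval index, direction) pairs.

    Direction is +1 for a rising interval and -1 for a falling one.
    Intervals whose change is within `tol` are treated as flat and skipped.
    (Assumed encoding; the source does not specify one.)
    """
    features: Set[Tuple[int, int]] = set()
    for i in range(len(spectrum) - 1):
        diff = spectrum[i + 1] - spectrum[i]
        if diff > tol:
            features.add((i, 1))
        elif diff < -tol:
            features.add((i, -1))
    return features


def jaccard_similarity(a: Sequence[float], b: Sequence[float], tol: float = 0.0) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two monotonicity sets."""
    set_a = monotonicity_set(a, tol)
    set_b = monotonicity_set(b, tol)
    union = set_a | set_b
    if not union:
        return 1.0  # both spectra are entirely flat: treat as identical shape
    return len(set_a & set_b) / len(union)


def classify(target: Sequence[float],
             references: Dict[str, List[Sequence[float]]]) -> Optional[str]:
    """Return the label of the reference spectrum most similar to `target`."""
    best_label: Optional[str] = None
    best_score = -1.0
    for label, spectra in references.items():
        for ref in spectra:
            score = jaccard_similarity(target, ref)
            if score > best_score:
                best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    peak = [1.0, 2.0, 3.0, 2.0, 1.0]
    scaled_peak = [10.0, 20.0, 30.0, 20.0, 10.0]  # same shape, different intensity
    valley = [3.0, 2.0, 1.0, 2.0, 3.0]
    print(jaccard_similarity(peak, scaled_peak))
    print(jaccard_similarity(peak, valley))
    print(classify([2.0, 4.0, 6.0, 4.0, 2.0], {"peak": [peak], "valley": [valley]}))
```

Because only the direction of change is compared, two spectra that differ in offset or scale but share the same shape receive the same score under this encoding, which is the property that distinguishes a shape-based measure from the intensity-based distances listed in (3). Real spectra would likely need smoothing or a non-zero `tol` before this comparison is meaningful.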