Research On Fine Grained Image Recognition Method Based On Visual Transformer And Data Optimization

Posted on: 2024-05-29
Degree: Master
Type: Thesis
Country: China
Candidate: M Jia
Full Text: PDF
GTID: 2568307160476574
Subject: Agricultural Information Engineering
Abstract/Summary:
Fine-grained image recognition is a popular research direction in computer vision. Fine-grained images of fruits in natural scenes are often characterized by small inter-category differences, blur, target occlusion, similar foreground and background, and samples that are difficult to distinguish. Solving such fine-grained recognition problems can greatly promote the development of intelligent agricultural recognition and is of great significance for upgrading and transforming the smart orchard industry. From the perspective of data optimization, this thesis makes the following contributions.

Two grape datasets were established. Images of 15 grape varieties were photographed manually in natural scenes to build the Vitis-15 grape dataset. Because grapes are soft and fragile, a grape image acquisition device was designed, and a glucose dataset was established by imaging red grapes with the experimental device and measuring their glucose content.

Fine-grained image recognition based on data optimization: A statistical analysis of CUB-200-2011 images on six competitive deep learning models found that the dataset contains difficult samples. To study the characteristics and distribution of these samples, this thesis proposes a quantitative evaluation and correlation analysis of difficult samples based on data optimization indicators: the mean least square error score, the forgetting score, the memory score, and the dichotomous data difficulty (DDD) score. Two data screening methods, direct deletion and balance optimization, are then used to optimize the difficult samples. The best optimization method is compared with other methods and typical models, the performance of the data optimization metrics is evaluated, and the categories of hard samples under each data optimization method are analyzed. The DDD data optimization method is found to have the best generalization and stability. Improving the classification accuracy of difficult samples improves recognition on the dataset as a whole. On the optimized dataset, the accuracy of the visual Transformer model reaches 91.283%, which is 0.683% higher than on the unoptimized dataset, and the cost of computing resources is reduced.

Grape image classification and recognition based on data optimization: Because of manual shooting errors, the Vitis-15 dataset contains images that are blurred, that show occluded grapes, or that were shot against the light. In addition, the small feature differences among grape images make it difficult for the model to classify and recognize them. In this thesis, data optimization is combined with the visual Transformer model to classify and recognize grape images. Three data optimization methods, the mean least square error score, the forgetting score, and the dichotomous data difficulty score, are applied to grape classification. The visual Transformer model reaches a recognition accuracy of 97.68% on the optimized dataset, which is 0.5% higher than on the original dataset. Dataset storage is also reduced by 256.5 MB.

Grape sugar content identification based on data optimization: Because the sugar content of the collected grapes is affected by the seasons, its distribution is unbalanced; that is, samples are spread unevenly across the glucose-level intervals, which makes glucose-level detection challenging for the model. In this thesis, the U-Net model is first used to segment the grape images, and the dataset is optimized and reconstructed according to its unbalanced distribution. A feature-normalized reweighted regression network and a multi-label balanced loss function are also proposed. Experiments show that the visual Transformer model achieves a correlation coefficient of 0.9599 on the test set and a mean square error of 0.3841 Brix, realizing efficient and economical non-destructive detection of grape sugar content.
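
As an illustration of how a forgetting score combined with direct deletion might work, here is a minimal Python sketch. It assumes the forgetting score follows the commonly used definition (the number of times a sample goes from correctly classified in one epoch to misclassified in the next); the abstract does not state the thesis's exact formula. The function names, the sample ids, and the deletion fraction are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Sequence


def forgetting_score(correct_per_epoch: Sequence[bool]) -> int:
    """Count correct -> incorrect transitions between consecutive epochs.

    Assumed definition; the thesis may compute its score differently.
    """
    return sum(
        1
        for prev, cur in zip(correct_per_epoch, correct_per_epoch[1:])
        if prev and not cur
    )


def drop_hardest(histories: Dict[str, List[bool]], fraction: float) -> List[str]:
    """Direct deletion: remove the given fraction of samples with the
    highest forgetting scores and return the ids that are kept."""
    # Sort ascending by score, so the hardest samples end up last.
    ranked = sorted(histories, key=lambda sid: forgetting_score(histories[sid]))
    n_drop = int(len(ranked) * fraction)
    return ranked[: len(ranked) - n_drop]


# Hypothetical per-epoch correctness records for three training images.
histories = {
    "img_a": [True, True, True],          # never forgotten
    "img_b": [True, False, True, False],  # forgotten twice
    "img_c": [False, True, False, True],  # forgotten once
}
kept = drop_hardest(histories, fraction=0.34)
```

With these made-up records, `img_b` has the highest score and is the one sample removed. A balance-style screening step would presumably also take class counts into account, which this sketch does not attempt.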
Keywords/Search Tags:Computer vision, Fine-grained image recognition, Deep learning, Vision Transformer, Data optimization