Coarse-fine Grained Categorization For Vegetable And Fruit Images

Posted on: 2017-03-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y S Feng
Full Text: PDF
GTID: 2308330485951818
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Smart homes are increasingly popular. They are designed to sense users' needs and to offer a better home-life experience without complex interfaces or technologies. Their intelligence comes mainly from a set of intelligent home appliances, which have recently attracted growing attention thanks to advances in computer vision, human-computer interaction, big data, and related fields. One promising application is the smart fridge, which automatically provides information about the food materials it contains, personalized cookbooks, and fine-grained food management for its users. For such applications, accurately recognizing food materials is fundamental, which involves determining both the categories and the counts of the food materials. Computer vision (object detection, object classification, object counting, etc.) provides a natural and well-suited solution to these real-world, domain-specific requirements. Unlike general-purpose research, this domain-specific application strongly demands a practical image dataset that matches the task.

With this application in mind, we introduce a novel hierarchical image dataset for the raw materials of food, named VegFru (i.e., vegetable and fruit). VegFru covers almost all popular vegetable and fruit categories, organized according to people's daily eating and cooking habits. The current version consists of 15 vegetable classes with 200 subclasses and 10 fruit classes with 92 subclasses. More specifically, VegFru contains 25 sub-trees composed of 292 categories, with more than 200 full-resolution images per category and more than 160,000 images in total: 91,117 images of vegetables and 69,817 of fruits.

We then introduce benchmarks and baseline experiments for coarse-fine grained image categorization of vegetables and fruits.

To compare with other benchmark datasets and to verify the difficulty of the proposed VegFru dataset, we consider two kinds of state-of-the-art features for learning our multi-class classifiers. The first baseline experiment extracts conventional hand-crafted features and classifies them with a linear-kernel SVM. We extract multi-scale SIFT and color moments (CM) from images, then encode these low-level representations into mid-level discriminative visual signatures with localized soft-assignment (LSA) coding and Fisher Vector (FV) coding, respectively. Finally, the coding coefficients are fed into a linear SVM for multi-class categorization. The second baseline experiment reports the mean accuracy for vegetables and fruits using CNN models with the AlexNet, CaffeNet, and GoogLeNet architectures.

In addition to the baseline results, we propose a semantic segmentation strategy based on top-down attention maps for coarse-fine grained visual categorization. We use attention information from a CNN to detect and segment object regions in images using only class-label annotations. The segmentations are used to train a CNN model, named SegNet, to obtain better-initialized weights that focus on the object foreground. We then fine-tune SegNet on the raw images to learn information complementary to what SegNet has learned from the discriminative regions. Finally, the resulting CNN model, named branch-training-net, is used to recognize the VegFru dataset, verifying that combining attention information with a CNN can improve the performance of coarse-fine grained image categorization.

To sum up, motivated by smart home applications, we construct a domain-specific hierarchical image dataset for the raw materials of food. For the task of coarse-fine grained image categorization on VegFru, we introduce benchmarks and baseline experiments using hand-crafted and CNN signatures, and then propose a novel model based on a segmentation strategy with top-down attention maps to improve performance. We believe that VegFru can serve as a useful resource not only for research in the computer vision community, but also for applying computer vision to the development of smart home systems. We also hope that VegFru will become a driving dataset that connects real-world domain applications to the computer vision community and, in turn, promotes the application of computer vision in smart home development.
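
As a rough illustration of the coding step in the hand-crafted baseline described above, the following NumPy sketch shows one common form of localized soft-assignment coding followed by max pooling. The function name `lsa_encode`, the neighborhood size `k`, the smoothing factor `beta`, and the use of max pooling are assumptions made for illustration; the abstract does not state the settings used in the thesis. Descriptors and codebook are taken as already computed (e.g. SIFT descriptors and a clustered codebook).

```python
import numpy as np

def lsa_encode(descriptors, codebook, k=5, beta=1.0):
    """Localized soft-assignment coding with max pooling (illustrative sketch).

    descriptors: (n, d) array of local features from one image.
    codebook:    (m, d) array of codewords, with k <= m.
    Returns an (m,) image-level signature.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    codebook = np.asarray(codebook, dtype=float)

    # Squared Euclidean distance from every descriptor to every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)

    # Keep only the k nearest codewords of each descriptor (the "localized" part).
    nearest = np.argsort(d2, axis=1)[:, :k]
    rows = np.arange(descriptors.shape[0])[:, None]
    local_d2 = d2[rows, nearest]

    # Soft-assignment weights over those k codewords; subtracting the row
    # minimum leaves the normalized weights unchanged but avoids underflow.
    weights = np.exp(-beta * (local_d2 - local_d2.min(axis=1, keepdims=True)))
    weights /= weights.sum(axis=1, keepdims=True)

    # Scatter the weights into full-length codes; all other entries stay zero.
    codes = np.zeros(d2.shape)
    codes[rows, nearest] = weights

    # Max pooling over descriptors gives one signature per image.
    return codes.max(axis=0)
```

The pooled signatures, one per image, would then serve as the input vectors of the linear SVM mentioned in the abstract.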
Keywords/Search Tags: smart home, coarse-fine grained image categorization, CNN, top-down attention map, object segmentation