Research On Key Algorithms For Video Based Smart Retail Cabinet

Posted on:2020-10-29

Degree:Master

Type:Thesis

Country:China

Candidate:Pubudu Ekanayake

GTID:2428330578466905

Subject:Computer technology

Abstract/Summary:

Smart retail stores are a hot topic that has attracted the interest of many technology companies. Companies such as Amazon, Orange, Deep Blue, and IBM have already started working on enhancing customers' buying experience with smart retail stores. Most available smart retail store systems are complicated, and small-scale retailers have difficulty affording them. For a small-scale smart retail store or cabinet, three key problems must be solved before deployment: how to obtain annotated image datasets, containing pictures of products taken from multiple viewpoints, for training a supervised deep learning model for product detection; how to design a lightweight deep learning model for product detection and recognition that meets the primary requirements on accuracy, speed, and capacity; and how to design a lightweight machine learning model that estimates a person's age and gender simultaneously from face images. After a detailed survey of the technologies used in smart retail stores and cabinets, this thesis provides a practical overall framework for a prototype smart retail cabinet system and gives practical solutions to the above three key problems.

For dataset annotation, this thesis provides a simple and efficient pipeline that involves several steps of human-machine collaboration. The first step is to prepare the initial image datasets by taking photos of the products on sale from different viewpoints. An image that contains a single product is labeled with that product's category label; an image may also contain multiple products of the same or different categories, in which case it is labeled with the combination of its products' category labels. The second step is to select a small number of images from the initial image datasets, for example 50 single-product images for each product category. Applying the pre-trained Mask-RCNN to each of these images generates cropped images, each of which carries the coordinates of its bounding box and its category label. Human experts then sort these cropped images into two subsets: one for correctly cropped images with the correct label, and another for cropped images that are hard negatives or background. After that, VGG16 is applied to each image of the two subsets, and the results are used to train a classification model, which is reused in the next step. The third step is to generate the annotated image datasets. By applying the pre-trained Mask-RCNN, VGG16, and the classification model trained in the second step, each image of the initial image datasets yields one or more correctly cropped images with the correct labels. Finally, interactive software with a friendly human-machine interface can be used to check each cropped image and its label and then generate high-quality image datasets. This pipeline efficiently cuts down the cost of labor.

Building on single-stage object detection, this thesis introduces a custom module that reduces the number of parameters of a CNN model while preserving the model's original accuracy. The main idea behind the module is to reduce the computational cost and the capacity needed to store the model weights while maintaining the accuracy of the original model. The module can be attached to almost any deep learning model with minor changes to the original model. It reduced the number of parameters of the YOLO model, the primary concern of this work at the moment, by 41.77%.

Further, a single lightweight model architecture was designed to estimate both age and gender simultaneously from face images. A face detection model is used to detect faces before age and gender are estimated. This lightweight model removes the need for two separate models, which reduces both the capacity of the model and the inference time. The estimated age and gender will be used to provide product recommendations in a future version.

The innovation points of this thesis are: (1) a naive yet effective pipeline and algorithms to get the first set of bounding box annotations for a custom image dataset; (2) a custom module that reduces the number of parameters of a model while maintaining the original accuracy; (3) a lightweight CNN model that estimates both age and gender simultaneously from face images.
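
The data flow of the annotation pipeline (propose boxes, extract features from each crop, keep only the crops a trained filter accepts) can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: the names `annotate`, `detect`, `embed`, and `is_good_crop` are hypothetical, the three callables stand in for Mask-RCNN, VGG16, and the trained classification model, and the toy versions below exist only so the sketch runs without a deep learning library. It also assumes the single-product case, where every kept crop inherits the image's label.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max exclusive
Image = List[List[int]]          # toy grayscale image: rows of pixel values


@dataclass
class Annotation:
    box: Box
    label: str


def crop(image: Image, box: Box) -> Image:
    """Return the pixels inside the box."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def annotate(
    image: Image,
    label: str,
    detect: Callable[[Image], Sequence[Box]],            # stand-in for Mask-RCNN
    embed: Callable[[Image], Sequence[float]],           # stand-in for VGG16
    is_good_crop: Callable[[Sequence[float]], bool],     # stand-in for the filter
) -> List[Annotation]:
    """Propose boxes, keep those the filter accepts, attach the image label."""
    kept = []
    for box in detect(image):
        features = embed(crop(image, box))
        if is_good_crop(features):
            kept.append(Annotation(box, label))
    return kept


# Toy stand-ins (assumptions for illustration only).
def toy_detect(image: Image) -> List[Box]:
    return [(0, 0, 2, 2), (2, 2, 4, 4)]


def toy_embed(patch: Image) -> List[float]:
    pixels = [p for row in patch for p in row]
    return [sum(pixels) / len(pixels)]  # one feature: mean brightness


def toy_filter(features: Sequence[float]) -> bool:
    return features[0] > 0.5  # treat dark patches as background


image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
annotations = annotate(image, "cola_can", toy_detect, toy_embed, toy_filter)
print(annotations)
```

In this toy run the bright top-left crop is kept and the dark bottom-right crop is rejected as background; a human reviewer would then check the surviving annotations, as the abstract describes.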
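
To make the idea of a parameter reduction percentage concrete, the sketch below counts the weights of one convolution layer before and after a substitution. The abstract does not say how the custom module is built, so a depthwise separable convolution is used here purely as a well-known stand-in; the layer sizes are hypothetical, and the resulting percentage applies to this single layer only and is unrelated to the 41.77% whole-model figure reported for YOLO.

```python
def conv_params(c_in: int, c_out: int, k: int, bias: bool = False) -> int:
    """Weights in a standard k x k convolution."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def separable_params(c_in: int, c_out: int, k: int, bias: bool = False) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 convolution."""
    depthwise = k * k * c_in + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


def reduction_percent(original: int, reduced: int) -> float:
    """Share of parameters removed, as a percentage of the original count."""
    return 100.0 * (original - reduced) / original


# Hypothetical layer: 256 input channels, 512 output channels, 3 x 3 kernel.
original = conv_params(256, 512, 3)
reduced = separable_params(256, 512, 3)
print(original, reduced, round(reduction_percent(original, reduced), 2))
```

A whole-model figure would be computed the same way, by summing the parameter counts of all layers before and after the change and passing the two totals to `reduction_percent`.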

Keywords/Search Tags:

smart retail cabinet, product detection, product identification, bounding box annotation, age estimation, gender estimation

Related items

1	Design And Implementation Of A Product Identification System For Smart Retail
2	System Development Of Speaker Gender Identification And Age Estimation
3	Cost Estimation System For Customer-oriented Product
4	Tubular Product Size Detection Technology
5	Research On Few-Shot Retail Product Recognition System
6	Set Membership Estimation Theory Method And Its Application
7	Research On The Estimation Method Of The Likelihood Product Space Spectrum Of Coprime Matrices
8	Quantitative Research On Influencing Factors Of B2C Online Retail Product Demand Under Fuzzy Environment
9	Research On One-shot Retail Product Recognition
10	Research On Distribution Network Design Optimization For Fresh Product Under The Background Of New Retail