Research And Implementation Of Few-Shot Fine-Grained Visual Recognition Methods

Posted on: 2021-05-31
Degree: Master
Type: Thesis
Country: China
Candidate: Q Z Chen
GTID: 2428330647951038
Subject: Computer Science and Technology
Abstract/Summary:
Fine-grained visual recognition has received increasing attention in recent years. On the one hand, the problem attracts academic interest because of its defining characteristics: small inter-class differences and large intra-class differences. On the other hand, in real business environments, retail and e-commerce place high demands on fine-grained product identification. Other disciplines, such as zoology and botany, also hope to use fine-grained visual classification to automatically identify taxonomically adjacent species (for example, species of the same family but different genera).

However, large-scale application of fine-grained visual recognition faces an inherent problem: the long-tail distribution. For example, when the general category of birds is subdivided into fine-grained categories such as the pied-billed grebe and the chestnut-breasted warbler, the number of available samples for each category drops sharply. Moreover, when a fine-grained class is a rare species or a newly discovered species, very few samples are available. Few-shot fine-grained visual recognition has therefore become a topic of practical significance at the intersection of the two fields.

First, focusing on fine-grained images, this thesis proposes a hard sample retraining method that is generally applicable to metric-based few-shot learning frameworks. To alleviate the problem that metric-based few-shot learning discards supervised information, it designs a sampling mechanism that uses the baseline model, with its parameters frozen, as a hard-sample filter and feeds the filtered hard samples to the training stage. It also designs a loss function for a rectification mechanism that corrects the misclassifications of the original baseline model, strengthening the supervision signal and thereby enlarging the distance between fine-grained classes. In this way, the retrained model achieves better fine-grained image classification accuracy without any modification to its structure.

Second, this thesis examines the feature combination operator, which has not been fully discussed in existing meta-learning-based few-shot learning methods, and points out its impact on fine-grained objects. Drawing on image descriptors from traditional pattern recognition, it designs a task-level feature combination operator to replace the default concatenation along the channel dimension. The proposed method achieves better performance while reducing computational cost.

Third, fine-grained problems also arise beyond images, in activity recognition. This thesis characterizes fine-grained activities along two aspects: semantic similarity and temporal reversion. It designs a metric-based few-shot activity recognition method that includes an attention mechanism called temporal attention masks and a pooling mechanism called temporal pooling, each targeting one of the two aspects. The thesis summarizes the issues that need attention in few-shot fine-grained activity recognition and proposes a baseline model for the task, as a preliminary exploration of this topic.

In summary, this thesis identifies existing problems in several stages of the few-shot fine-grained visual recognition task and proposes methods whose effectiveness is demonstrated by experiments.
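
The hard-sample filtering idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the baseline model, so this example assumes a nearest-prototype (class-mean) classifier over embeddings that have already been computed by a frozen network. The function names `prototypes`, `predict` and `filter_hard_samples`, the toy two-dimensional embeddings, and the Euclidean distance are all assumptions made for illustration, not the thesis's actual implementation. The rectification loss is not sketched because the abstract gives no formula for it.

```python
import math

def prototypes(support):
    """Mean embedding per class; `support` maps label -> list of embedding vectors."""
    protos = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        protos[label] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return protos

def predict(protos, x):
    """Label of the nearest prototype under Euclidean distance (assumed metric)."""
    return min(protos, key=lambda label: math.dist(protos[label], x))

def filter_hard_samples(support, queries):
    """Keep the queries that the frozen baseline misclassifies.

    `queries` is a list of (embedding, true_label) pairs. Nothing is trained
    here: the baseline only acts as a filter, as the abstract describes.
    """
    protos = prototypes(support)
    return [(x, y) for x, y in queries if predict(protos, x) != y]

# Toy episode with two fine-grained classes (hypothetical embeddings).
support = {
    "grebe": [[0.0, 0.0], [0.2, 0.0]],
    "warbler": [[1.0, 0.0], [1.2, 0.0]],
}
queries = [
    ([0.1, 0.1], "grebe"),    # close to its own prototype
    ([0.7, 0.0], "grebe"),    # closer to the warbler prototype: a hard sample
    ([1.1, 0.1], "warbler"),  # close to its own prototype
]

hard = filter_hard_samples(support, queries)
print(hard)
```

In a retraining loop, the pairs returned by `filter_hard_samples` would serve as the training input, so that the extra supervision concentrates on the examples the baseline confuses.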
Keywords/Search Tags: Few-Shot Learning, Fine-Grained, Metric Learning, Meta-Learning