
Research On Rank-based Pooling For Deep Convolutional Neural Networks

Posted on: 2018-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: Z L Shi
GTID: 2348330515464655
Subject: Computer Science and Technology
Abstract/Summary:
Pooling is a key mechanism in deep convolutional neural networks (CNNs) that helps to achieve translation invariance. Numerous studies, both empirical and theoretical, show that pooling consistently boosts the performance of CNNs. Conventional pooling methods operate on activation values. In this thesis, I propose rank-based pooling as an alternative. It is derived from the observation that the ranking list in a pooling region stays the same under changes of activation values that preserve their order, so a rank-based pooling operation may achieve more robust performance. In addition, a reasonable use of rank can avoid the scale problems encountered by value-based methods. The novel pooling mechanism can be regarded as an instance of weighted pooling, where a weighted sum of activations is used to generate the pooling output. Depending on the weighting strategy, it can be realized as rank-based average pooling (RAP), rank-based weighted pooling (RWP) or rank-based stochastic pooling (RSP). As another major contribution, I present a novel criterion for analyzing the discriminant ability of various pooling methods by introducing discriminant entropy.

In this thesis, the proposed rank-based pooling is evaluated on an image classification task and a crowd counting task. In the image classification task, experimental results on four image benchmarks (MNIST, CIFAR-10, CIFAR-100 and NORB) show that rank-based pooling outperforms the existing pooling methods in classification performance. I further demonstrate better performance on the CIFAR-10 and CIFAR-100 datasets by integrating RSP into Network-in-Network.

In the crowd counting task, motivated by problems such as camera perspective, uneven distribution of crowd density, background clutter and occlusions, a new crowd counting method is proposed based on a rank-based spatial pyramid pooling (RSPP) network. In the proposed method, the original image is divided into several sub-regions with the same scope of perspective, and multi-scale sub-image blocks are then taken from the different sub-regions. The rank-based spatial pyramid pooling network is used to estimate the number of pedestrians in each sub-image block. Summing the counts over all sub-image blocks gives the total number of people in the image. Experimental results on the UCSD benchmark show that the proposed method achieves high accuracy and good robustness compared with traditional methods.
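
The rank-based pooling idea described in the abstract, a weighted sum of activations whose weights depend on rank rather than on value, can be illustrated with a minimal sketch. The abstract does not give the weighting formulas, so everything concrete here is an assumption made for illustration: the function names, the geometric weights `alpha * (1 - alpha)**r`, the `top` cutoff used for RAP, and sampling by rank weight for RSP. It should not be read as the thesis's actual implementation.

```python
import random


def ranks_descending(region):
    """Return the indices of `region` ordered from largest to smallest activation."""
    return sorted(range(len(region)), key=lambda i: region[i], reverse=True)


def rank_weights(n, alpha=0.5):
    """Assumed weighting: geometric in the rank r (0 = largest), normalized to sum to 1.

    The weights depend only on n and alpha, never on the activation values.
    """
    raw = [alpha * (1 - alpha) ** r for r in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


def rank_average_pool(region, top=2):
    """RAP sketch (assumed form): average the `top` highest-ranked activations."""
    order = ranks_descending(region)[:top]
    return sum(region[i] for i in order) / len(order)


def rank_weighted_pool(region, alpha=0.5):
    """RWP sketch: weighted sum of activations, with weights assigned by rank."""
    order = ranks_descending(region)
    weights = rank_weights(len(region), alpha)
    return sum(w * region[i] for w, i in zip(weights, order))


def rank_stochastic_pool(region, alpha=0.5, rng=random):
    """RSP sketch: sample one activation with probability given by its rank weight."""
    order = ranks_descending(region)
    weights = rank_weights(len(region), alpha)
    chosen = rng.choices(order, weights=weights, k=1)[0]
    return region[chosen]


if __name__ == "__main__":
    region = [0.1, 0.9, 0.4, 0.3]          # a flattened 2x2 pooling region
    scaled = [10 * a for a in region]      # same ordering, different scale
    print(ranks_descending(region) == ranks_descending(scaled))
    print(rank_average_pool(region), rank_weighted_pool(region))
    print(rank_stochastic_pool(region))
```

Because the weights are computed from ranks alone, rescaling the activations in a region leaves the weights untouched, which is the scale-robustness property the abstract attributes to rank-based methods.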
Keywords: Pooling, Deep learning, Convolutional neural network, Image classification, Crowd counting