Font Size: a A A

Research On Some Problems Of Object Detection Based On Convolutional Neural Networks

Posted on:2018-09-17Degree:DoctorType:Dissertation
Country:ChinaCandidate:X D LiFull Text:PDF
GTID:1318330542477554Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Object detection is a hot topic in the field of computer vision,machine learning and artificial intelligence.It has been widely applied in intelligent video surveillance,robot environment awareness,large-scale image retrieval,etc.Nowadays,the object detection methodes based on convolutional neural networks(CNNs)are the main object detection methods,which use the learned features instead of the hand-crafted features,for improving the detection accuracy.Although the researches on the CNN-based object detection have yielded fruitful results,there are still some unsolved important problems.Firstly,the current detectors lack the scene self-adaptability,which leads to the fact that these detectors cannot adapt to view changes in the surveillance scenes.Hence,the detection accuracy declines rapidly.Secondly,the current detectors lack the mechanism of memory and prediction,which leads to the fact that these detectors have no ability of memory and prediction.Therefore,it is difficult to improve the detection accuracy.In order to solve above two problems,this dissertation carries out the researches from two aspects of the CNN-based object detection,including adaptive object detection methods for surveillance scenes and accurate object detection methods through memory and prediction.Furthermore,we conduct two studies for each research aspect.The main studies and contributions are summarized as follows:(1)In view of the lack of the scene self-adaptability,we propose a transfer method through network adjustment and structure optimization for vehicle detection.Our method makes use of feature-level transfer learning,network structure optimization and datalevel transfer learning to gradually adapt the CNN-based vehicle detectors to surveillance scenes.Firstly,according to the difference of vehicle features with different views,we adjust the network parameters that cannot be used in multiple views.Then,we combine the similar feature maps layer by layer according to the similarities between feature maps in the same layer.Lastly,we select vehicle samples from generic scenes,which have the same view with ones from surveillance scenes,to help us fine-tune the network parameters.The experimental results show that our method improves the detection accuracy and the detection speed through network adjustment and structure optimization respectively.Our method successfully transfers the CNN-based vehicle detectors that obtain good performance in the surveillance scenes.(2)In order to construct the object detectors with scene self-adaptability,we propose a construction method through transferring convolutional neural networks and learning context information for object detection.Our method utilizes two steps to construct the adaptive regression model,which can predict accurate object locations in the surveillance scenes.The first step is that we select and remain the useful kernels for each convolutional layer according to the activation values of different surveillance samples.The second step is that we integrate local context information and global context information into convolutional layers and pooling layers respectively.The experimental results show that our method improves the scene self-adaptability for feature extraction and the detection accuracy for object localization through transferring convolutional neural networks and learning context information respectively.Our method successfully constructs adaptive regression models,which achieve good performance for pedestrian detection and vehicle detection in the surveillance scenes.(3)In view of the lack of the mechanism of memory and prediction,we propose a design method based on human's mechanism of memory and prediction for object detection.Our method combines one CNN and one long short-term memory(LSTM)into an integrated framework to make our model have the abilities of memorizing sequence patterns and predicting object locations.Firstly,for simulating the process of memorizing the single object,we convert an image to an image sequence,and utilize the CNN and the LSTM to memorize and recognize the sequence pattern.Secondly,for simulating the process of predicting multiple object locations,we convert the LSTM to the recurrent CNN,which can recognize multiple sequence patterns of different locations so that our model can predict object locations in the detection image.The experimental results show that after introducing the mechanism of memory and prediction,our method can successfully memorize the sequence patterns and accurately predict the object locations.Our method achieves the best performance for pedestrian detection and vehicle detection in the surveillance scenes.(4)In order to make the pedestrian detectors simulate the memory process further,we propose a design method through exchanging sequence order and memorizing sequence patterns for pedestrian detection.Our method learns both of the sequence order and the sequence patterns so that our model can memory and recognize the sequence patterns of pedestrians rapidly and accurately.Firstly,for implementing the process of order exchange,we add the step of order exchange between the CNN and the LSTM,and make use of the exchange matrix to re-sort the feature sequence by degree of importance.Then,for learning the sequence order and memorizing the sequence patterns simultaneously,we employ the BP algorithm and the BPTT algorithm to train the exchange matrix and the LSTM respectively.Lastly,for improving detection speed,we apply our model in the region-based detection framework,which recognizes the sequence patterns for each region.The experimental results show that after introducing the order exchange,our method successfully learns the sequence order and accurately memorize the sequence patterns.Compared with the state-of-the-art pedestrian detection methods,our method achieves the comparable performance in term of accuracy and speed.
Keywords/Search Tags:Object Detection, Convolutional Neural Network(CNN), Long Short-term Memory(LSTM), Transfer Learning, Memory and Prediction
PDF Full Text Request
Related items