Font Size: a A A

Semantic Image Segmentation And Object Detection In Autonomous-Driving System

Posted on:2019-05-01Degree:MasterType:Thesis
Country:ChinaCandidate:J S GuoFull Text:PDF
GTID:2348330569487416Subject:Control Science and Engineering
Abstract/Summary:PDF Full Text Request
As artificial intelligence starts leading the trends in various industries,computer vision,the area in which technologies can be most directly applied to all walks of life,has become a key interest in scientific research.Meanwhile,the application of deep convolutional neural network in computer vision has pushed this area of research into a new era.In today's world,the principal application with regards to the combination of computer vision and deep learning is the perception of driving environment in autonomous driving.Environment perception includes high level semantic segmentation,multi-object detection,pedestrian skeleton recognition,lane detection,and multi-object tracking,etc.Among these,high level semantic segmentation,multi-object detection and pedestrian skeleton recognition are the top three challenging and crucial tasks.For the environment perception in autonomous driving,the ideal status is paralleling all these tasks simultaneously with polytype and structured outputs.The research area of this dissertation is the application of deep learning neural network in environment perception of autonomous driving,which is one of the major research areas of the industry.As such,this dissertation focuses on high level semantic segmentation,multi-object detection and pedestrian skeleton recognition.Meanwhile,this dissertation has carried out researches on multi-network fusion and neural network compression.To improve the prediction accuracy and visual performance of multi-object detection,this dissertation has proposed a novel structure,called dynamic residual network,by extending the traditional residual neural network to a new level.By adding filters to the bypass structure in the network,dynamic residual neural network is able to self-adapt to various road conditions and to optimize the final result.Among the processing,this dissertation gets training,validation and testing samples by data acquisition and data annotation.With respect to the human skeleton extraction,the algorithm that is based on RGB image is proposed.From bottom to top,the algorithm first detects all the keypoints of a human.Given the locations of the detected keypoints,the algorithm connects these keypoints according to part affinity fields,so that human skeleton can be extracted from the image.In autonomous driving,human skeleton extraction is crucial to pedestrian behavior prediction and police officer pose recognition.With regards to the pixel mis-classification,this dissertation discusses the possible reasons of the issue by observing the characteristic of such problematic outputs.Accordingly,a module named as “multi-scale pooling and concatenating” has been proposed,which is capable of extracting regional information of various pixel areas and integrating these regional information with other detailed information extracted by the original network.To improve the performance,different experiments have been conducted for extracting regional information.Moreover,the results of testing images are provided for demonstrating the effectiveness of the proposed module.Due to a series of problems,including the oversize of the model volume,high demand of computing capability and spatial-temporal synchronism of multi-task,a novel reinforcement network called Mix Net has been proposed.Thereinto,the root part realizes sharing former network to extract low level semantic information,which decreases the volume of the model and ease the demand of high computing capability efficiently.The branch part extracts precise high level semantic information for different tasks,which guarantees the performance of the model.One of the key benefits of doing this is to ensure the performance(measured in fps)when the algorithm is running in real time.Besides,decreasing the volume of the model will also be helpful when transplanting the algorithm to other mobile platforms.
Keywords/Search Tags:Deep Learning, Object Detection, Skeleton Recognition, Semantic Scene Parsing, Driving Environment Perception
PDF Full Text Request
Related items