
Heterogeneous Data Fusion Based Semantic Road Scene Understanding

Posted on: 2016-11-29; Degree: Doctor; Type: Dissertation
Country: China; Candidate: W Q Huang
GTID: 1108330482972519; Subject: Communication and Information System
Abstract/Summary:
Computer-vision-based semantic road scene understanding is an important supporting technology for autonomous driving systems and other applications of artificial intelligence. Three research topics in this area are both popular and difficult. First, to improve the scene understanding ability of autonomous vehicles, researchers focus on semantic road scene understanding that fuses camera and lidar data. However, the low resolution of lidar data makes it difficult to fuse these heterogeneous data and obtain pixel-resolution scene understanding results. Second, jointly solving two or more problems can improve scene understanding results, but joint problems are generally difficult to solve because of their modeling complexity and computational complexity. Third, the asymmetric and high-order temporal relations among variables in an image sequence make temporal semantic scene understanding a difficult problem.

To address these problems, this dissertation studies heterogeneous data fusion based semantic road scene understanding. By fusing heterogeneous data at different levels, solving problems jointly, and describing temporal relations among variables with a mixture graph model, we achieve pixel-resolution results with higher accuracy and higher temporal consistency.

The main content and contributions of this dissertation are the following four points.

First, this dissertation presents a heterogeneous data fusion based method for online object-level segmentation of road scenes. Current state-of-the-art semantic understanding methods fuse heterogeneous data at either the feature level or the decision level and cannot achieve pixel-resolution results. Our method fuses heterogeneous data at both the feature level and the decision level, and also achieves pixel-resolution results. At the feature level, heterogeneous data are fused through a depth upsampling method. At the decision level, two strategies are used for data fusion. On the one hand, online object detection and object-level segmentation are achieved by generating seeds of object hypotheses from 3D lidar points. On the other hand, based on these seeds, hard constraints are added to the graph model, which improves the segmentation result.

Second, this dissertation presents a method for jointly solving label variables with discrete values and label variables with continuous values. We use the alternating direction method (ADM) to solve the joint problem. Specifically, the variables with continuous values can be solved via a linear optimization method. This joint method is applied to joint road scene object-level segmentation and depth upsampling, which avoids the drawbacks in computation and precision of current state-of-the-art methods.

Third, this dissertation presents a heterogeneous data fusion based method for the joint problem of road scene object-level segmentation and semantic labeling. In the decision-level data fusion, we obtain object detection bounding boxes from 3D object hypotheses, which avoids searching over the whole image. Unlike the joint models constructed in current state-of-the-art methods, our model satisfies the submodularity constraint and can be solved by Graph Cuts.

Finally, this dissertation presents a mixture graph, which contains both simple edges and hyperedges. In this model, simple edges are used to describe the pairwise spatial relationships between label variables, and hyperedges are used to describe the high-order and asymmetric temporal relationships among label variables. The model is used to solve the temporal semantic road scene labeling problem. It has greater modeling ability than a simple graph and avoids solving the high-order potentials of a simple graph model. Compared with traditional simple graph based methods, the mixture model improves the accuracy and consistency of temporal semantic road scene labeling.
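
As a rough illustration of the mixture graph idea in the last contribution, the following Python sketch stores pairwise spatial edges and directed temporal hyperedges side by side and evaluates the energy of a labeling. The class name `MixtureGraph`, the `(frame, region)` node ids, the Potts penalty on simple edges, and the hyperedge penalty are all assumptions made for illustration. The abstract does not specify the dissertation's actual potentials or its inference method.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical node id: (frame index, region index within that frame).
Node = Tuple[int, int]


@dataclass
class MixtureGraph:
    """Toy graph holding simple (pairwise) edges and directed hyperedges."""
    # unary[node][label] = cost of giving `label` to `node`
    unary: Dict[Node, List[float]] = field(default_factory=dict)
    # Simple edges (a, b, weight): spatial neighbours in the same frame.
    edges: List[Tuple[Node, Node, float]] = field(default_factory=list)
    # Hyperedges (tail nodes, head node, weight): several nodes of an
    # earlier frame jointly constrain one node of a later frame. The
    # direction tail -> head is what makes the relation asymmetric.
    hyperedges: List[Tuple[Tuple[Node, ...], Node, float]] = field(default_factory=list)

    def energy(self, labels: Dict[Node, int]) -> float:
        # Unary data terms.
        total = sum(costs[labels[n]] for n, costs in self.unary.items())
        # Assumed Potts penalty: pay `w` when spatial neighbours disagree.
        total += sum(w for a, b, w in self.edges if labels[a] != labels[b])
        # Assumed temporal penalty: pay `w` when the head's label does not
        # appear among the labels of its tail nodes.
        for tail, head, w in self.hyperedges:
            if labels[head] not in {labels[t] for t in tail}:
                total += w
        return total


# Two regions in frame 0 and one region in frame 1, with two labels (0 and 1).
g = MixtureGraph(
    unary={(0, 0): [0.0, 2.0], (0, 1): [0.0, 2.0], (1, 0): [1.0, 0.8]},
    edges=[((0, 0), (0, 1), 0.5)],
    hyperedges=[(((0, 0), (0, 1)), (1, 0), 1.0)],
)

consistent = {(0, 0): 0, (0, 1): 0, (1, 0): 0}
flipped = {(0, 0): 0, (0, 1): 0, (1, 0): 1}

# The unary term alone slightly prefers label 1 for the frame-1 region, but
# the hyperedge makes the temporally consistent labeling cheaper overall.
assert g.energy(consistent) < g.energy(flipped)
```

In this toy setup the hyperedge is evaluated directly as a single term, which loosely mirrors the abstract's point that hyperedges avoid encoding high-order relations as potentials of a simple graph.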
Keywords/Search Tags: semantic road scene understanding, heterogeneous data fusion, joint modeling, Markov random field, probabilistic graphical model, hypergraph