Stereo vision is an important branch of computer vision, and 3D scene understanding is a key area within it. This dissertation focuses on scene understanding for railway video: it recovers the relative positions of objects in the scene, which allows the railway environment to be described more vividly. The technology also plays an important role in railway panoramas, 3D navigation, the maintenance of railway transportation safety, and related applications. Our contributions are as follows.

1. We construct the spatial layout of the railway scene from railway video. First, we automatically obtain the spatial layout of the video scene by analyzing the correspondence between 3D and 2D, the camera imaging principle, the relevant coordinate systems, and three-dimensional geometry. Second, following a construction process from lines to surfaces and from surfaces to volumes, we obtain the three-dimensional distribution of the scene. Finally, we apply the 3D scene layout to panorama generation: the four regions of the railway video are segmented automatically, and panorama images are then generated from the video.

2. We develop a depth estimation algorithm for railway video and build a 3D video environment. There are two common ways to obtain a depth map. One is based on hardware sensors; this approach is simple and accurate, but it is limited by power supply and wiring. The other uses semi-interactive software to generate the depth map; it is not restricted by location, but it requires user intervention. To obtain depth information for the video conveniently and quickly, we combine the two approaches on the basis of image feature matching. Experimental results show that the depth maps obtained in this dissertation are more accurate than those of the MRF method, and very close to those of the Depth Transfer method. However, Depth Transfer relies on a Kinect to acquire an RGBD prior data set. We overcome this limitation by using DMAG to collect the RGBD data set, freeing us from the constraints of Kinect capture sites and allowing depth values to be obtained for any image.
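The 3D-to-2D correspondence analyzed in contribution 1 can be illustrated with a minimal pinhole-projection sketch. The intrinsic parameters `f`, `cx`, and `cy` below are hypothetical placeholder values, not taken from the dissertation:

```python
# Minimal pinhole-camera sketch of the 3D-to-2D correspondence used
# when recovering scene layout from a calibrated video frame.
# f (focal length in pixels) and (cx, cy) (principal point) are
# illustrative values only.

def project(point3d, f=800.0, cx=320.0, cy=240.0):
    """Project a camera-frame 3D point (X, Y, Z) to pixel coordinates (u, v)."""
    X, Y, Z = point3d
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    u = f * X / Z + cx  # horizontal pixel coordinate
    v = f * Y / Z + cy  # vertical pixel coordinate
    return u, v

# A point 10 m ahead and 1 m to the right of the optical axis:
u, v = project((1.0, 0.0, 10.0))
```

Inverting this mapping for known scene constraints (e.g. the known gauge of the rails) is what lets the 2D video frame yield a 3D scene layout.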
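The stereo relation underlying the feature-matching-based depth estimation in contribution 2 can be sketched as the standard disparity-to-depth conversion Z = f * B / d. The focal length and baseline below are illustrative assumptions, not the dissertation's calibration:

```python
# Hedged sketch: converting per-pixel stereo disparities (in pixels)
# to metric depths via Z = f * B / d. focal_px and baseline_m are
# assumed example values, not parameters from the dissertation.

def disparity_to_depth(disparities, focal_px=700.0, baseline_m=0.12):
    """Return a depth (metres) for each disparity; zero disparity maps to infinity."""
    return [focal_px * baseline_m / d if d > 0 else float("inf")
            for d in disparities]

# Large disparities correspond to near objects, small ones to far objects:
depths = disparity_to_depth([70.0, 7.0, 0.0])
```

Matched feature pairs between frames supply the disparities; the resulting sparse depths can then be densified into a full depth map.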