Font Size: a A A

Research On Image Matching Based On Graph Cut

Posted on:2008-10-14Degree:MasterType:Thesis
Country:ChinaCandidate:L L LiuFull Text:PDF
GTID:2178360212496298Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
The main contents of this thesis are energy minimization based on graph cuts and its application to image matching.The goal of computer vision is to make the computer have the ability of gaining 3D information from 2D images. In recent years, as a powerful mathematic tool, graph method is introduced to the research of computer vision problems. Graph methods and the algorithms based on it have many applications, such as object tracking, image rectification, image segmentation, image restoration, stereo reconstruction, augmented reality, texture synthesis and so on. In reality, many correlative problems in computer vision can be concluded as the energy minimization problems. As for the energy minimization problems, people often use the traditional optimizing technology, such as gradient descent for consecutive variables and simulated annealing for discrete variables. But these methods can always be trapped into local minima or take very long time to converge. The energy minimization based on graph cuts theory is more robust, more real, and trustier than the traditional methods, and it can get a global minimum or a local minimum in very strong sense. The graph cut has very broad uses, such as image segmentation, image restoration, object occupancy, texture synthesis and so on.In the method of graph cut, we use a very special graph, namely the graph with two terminals. To minimize the energy functions, we need to construct the graph. In fact, if the energy function is in a certain form, we can construct a graph in our method. The construction of the graph is in a fixed form, so we needn't construct graphs for every problem especially, and it is very easy in practical applications. After we get the graph corresponding to the energy function, we only need to do some operation on the graph. The operation is using the max-flow/min-cut theory of Ford-Fulkerson. Namely, the max flow of the graph is equal to the capacity of the min cut, and the min cut problem of the graph can be computed via max flow.3D reconstruction is a very important issue in computer vision, and image matching is crucial to the reconstruction technology. We divide the matching algorithms into disperse matching and dense matching. Disperse matching can also be called characteristic matching, and it often uses the zero point, edge figure and line as the matching unit. It only matches the characteristic areas of the two images. It is very fast, and can serves for special cases. As the disparity is very sparse, the result cannot serve for the need of 3D reconstruction. The dense matching is much more precise, and the density of disparity is larger. So it can be used for 3D reconstruction. Its typical arithmetic is area correlative matching, and it uses the gray value or other character as the matching unit. We apply the graph cut into image matching, and put forward a new image matching arithmetic based on graph cut. This method is the dense disparity matching. It views the disparities as labels, and constructs the energy function. So the matching problem is converted to the energy minimization problem. By construction the graph, the energy is corresponded to the capacity of the cut. Finally we use the flow theory of the graph to get the minima of the energy, and gain the disparity of the image matching.Before matching the two images, we need to rectify them. After the rectification, the corresponding pixels in the two images are in the same scan line, i.e. they have the same Y coordinate and the different X coordinate. The difference is called disparity. In this sense, the 2D matching is converted to 1D matching. We use the 1D matching arithmetic based on graph cut, and the scope of search is greatly decreased. To represent the result of the image matching, we use the disparity map at present. The disparity map uses the coordinates and gray values to show us the 3D information. We map the disparities to 0-255. If the disparity is larger, it is closer to 255. If the object is closer to the camera, its disparity is larger; if the object is farther from the camera, its disparity is smaller.The restriction condition of the two images matching includes photo consistency constraint, smoothness constraint and visibility constraint. We get our energy function from these restriction conditions. The energy function satisfies the regularity constraint. Then we use graph cut to minimize this energy function, i.e. we construct the graph for the function to get the min- cut. We reach the aim of image matching by the above two steps. More precisely, we construct a graph with two terminals based on the energy function, and set reasonable weights for the edges. Then the min cut on the graph corresponds to the minimum of our energy function. In the process of matching we use theαexpansion method: we select a disparityα(the choice ofαcan be in a fixed order or random), and find the unique configuration after theαexpansion. If this configuration can decrease the energy, we define it as the current configuration. If there is noαexpansion can decrease the energy, the arithmetic is over. We do the above operations for the disparities in the permitted range and finally get the result. The crucial part of the arithmetic is computing theαexpansion with the smallest energy.As for the multi-scene, the restriction condition is more complex, so we need to add some terms to the original energy function. We can prove that the new energy function can also serve for the regularity restriction, namely that it can be represented by graph. The new function is very complex, and its result is not very good in the practical application. So we dispose the two smooth terms of new function separately and get two simplified arithmetics.
Keywords/Search Tags:Research
PDF Full Text Request
Related items