
Research On The Three-Dimensional Model Reconstruction Of Objects Based On The Time-of-Flight Depth Sensor

Posted on: 2012-03-22    Degree: Doctor    Type: Dissertation
Country: China    Candidate: G Y Mu    Full Text: PDF
GTID: 1118330335453030    Subject: Computer software and theory
Abstract/Summary:
Three-dimensional model reconstruction of a rigid object is useful in many applications, such as engineering, medicine, biochemistry, and animation. It remains an active yet challenging research field, which can be divided into two major classes by acquisition approach: passive and active sensing.

Passive approaches do not interact with the object, whereas active methods make contact with it or project some form of energy onto it. The computer vision community mainly focuses on passive approaches that extract range information from one or more images, including shape-from-shading for single images, stereoscopic vision for image pairs, and factorization methods for video streams. Multi-view stereo is the common method for reconstructing a complete 3D object model from a collection of images taken from different views, and a number of high-quality algorithms have been developed over the last few years; the survey paper presented a comprehensive comparison and evaluation of existing multi-view stereo reconstruction algorithms. Besides achieving high quality, passive methods are also cheaper and easier than active methods in terms of data capture, as they require very little special-purpose hardware. However, they typically do not yield the highly accurate digitizations required by a number of applications.

More accurate and robust results can be obtained by active methods with fast and precise range scanners, owing to advances in physics and electrical engineering, including the development of lasers, CCDs, and high-speed sampling and timing circuitry. The most popular range-finding devices are based on the principle of optical triangulation. Such technologies allow detailed shape measurements with accuracy up to 0.05 mm. Despite their high accuracy, these range scanners are mainly used for manufacturing, industrial, and research purposes.
Their expense, portability, and operability prevent their use in wider fields such as home-based applications. Moreover, common laser scanners capture one line at a time, so acquiring a surface may take up to hours, limiting them to static subjects or scenes.

The main research contents are as follows:

1. We propose using a time-of-flight (TOF) depth sensor to reconstruct complete 3D shapes of objects. Our model reconstruction system contains a Point Grey Dragonfly camera and a TOF depth sensor. The TOF sensor estimates depth by emitting a light pulse and measuring the time it takes the pulse to travel back to the sensor; the distance is proportional to this travel time, hence "time of flight". As an active range sensor, the TOF depth sensor overcomes the drawbacks of the laser scanner: rather than capturing a surface line by line, it captures the whole surface of an object at video frame rate, enabling it to handle dynamic subjects. At the same time, it does not suffer from the defects of passive methods, producing more robust and accurate range data. Smaller than a home camcorder, it can be rotated around a subject just as one would shoot the object with a camcorder, except that the output is a complete 3D model rather than an ordinary video sequence. Combined with a real-time surface alignment algorithm, real-time 3D modeling can be achieved, given the sensor's real-time range-capture performance.

2. Initial alignment, using either feature tracking or the 4-points congruent sets algorithm, registers surfaces captured at different frames. After mesh generation, we align the different surfaces. For featureless objects, we use the 4-points congruent sets (4PCS) algorithm to roughly align neighboring surfaces; 4PCS estimates the transformation by finding congruent coplanar 4-point sets in both surfaces.
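The time-of-flight principle described above reduces to a one-line computation: the pulse travels out and back, so the one-way distance is half the round trip times the speed of light. A minimal sketch (the function name `tof_depth` is illustrative, not from the dissertation):

```python
# Minimal sketch of the time-of-flight depth principle:
# distance = (speed of light * round-trip time) / 2.
C = 299_792_458.0  # speed of light in vacuum, m/s


def tof_depth(round_trip_time_s: float) -> float:
    """One-way distance to the surface from the measured round-trip time."""
    return C * round_trip_time_s / 2.0


# A pulse returning after ~13.3 ns corresponds to roughly 2 m of depth.
d = tof_depth(13.342e-9)
```

In practice TOF sensors such as the SR3000 measure phase shifts of modulated light rather than timing a single pulse directly, but the recovered quantity is the same round-trip time.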
If the object has enough features, we instead roughly align consecutive surface meshes by feature tracking in the initial alignment. We use the scale-invariant feature transform (SIFT) to track features between consecutive surfaces; these features are local, based on the appearance of the object at particular interest points, and invariant to image scale and rotation. The 2D feature correspondences are mapped to 3D point correspondences through the calibration between the regular camera and the depth sensor. We then compute the rotation and translation from the 3D correspondences by absolute orientation combined with RANSAC, in order to handle outliers in the SIFT matching.

3. The iterative closest point (ICP) method further aligns the piecewise surfaces starting from the result of the initial alignment. After initial alignment the surfaces are roughly registered, but the alignment is still not accurate; given the rough alignment from the previous step as an initial guess, we use ICP in a global manner to further stitch the surfaces together. In general, ICP aligns two surfaces by estimating a transformation from correspondences obtained by searching for closest points. Applying ICP pairwise over a large number of range images may lead to accumulated error, which can result in incomplete models. We therefore apply ICP globally, so that the alignment error is evenly distributed across the available pairs of surfaces. We first align the scans pairwise with each other, which provides a set of constraints represented by point pairs between neighboring surfaces generated from the pairwise registrations. Instead of using concrete pairs, we adopt the virtual pairs approach proposed by Pulli, which favors moving each scan relative to its neighbors as little as possible.
The global alignment is performed incrementally, adding surfaces into a set of consistently aligned surfaces one at a time while keeping the views in that set consistently aligned.

4. We merge the surfaces and fill holes. Range surfaces are merged into a single 3D model by a volumetric method: each range surface is scan-converted into a signed distance function defined on a 3D grid, the data are combined in the grid, and the isosurface is extracted from the volume; under certain assumptions this isosurface is optimal in the least-squares sense. Relative to the viewing direction, voxels in front of the surface are assigned negative values and those behind it positive values; larger absolute values are assigned to voxels farther from the surface, and the surface lies on the zero isosurface. In three dimensions the algorithm proceeds as follows. First, all voxel weights are set to zero, so that new data overwrite the initial grid. Second, each range image is tessellated by constructing triangles from nearest neighbors in the sampling grid; triangles with edges longer than a threshold are discarded to avoid spanning depth discontinuities, and a weight is computed for each vertex. In the refinement step, we fill holes using robust repair of polygonal models, producing a complete 3D model that approximates the original.

5. We carry out extensive comparisons. First, we scan the object with a FARO Laser ScanArm V2 and take its output as ground truth; we compare our results with the ground truth by aligning them with the ICP algorithm and computing the relative average distance. The experimental results show that the reconstruction errors of our system are less than 1%. Second, besides the SR3000 we also integrated the Zcam into our system and reconstructed the 3D model of the frog.
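The per-voxel update in the volumetric merging step above is a cumulative weighted average of signed distances, in the style of Curless and Levoy: D ← (W·D + w·d)/(W + w), W ← W + w. A minimal sketch of just that update rule (the function name `fuse` and the flat-array voxel layout are illustrative assumptions, not the dissertation's data structures):

```python
import numpy as np


def fuse(D, W, d_new, w_new):
    """Fold one scan's signed distances into the cumulative voxel grid.

    D, W:       current per-voxel signed distance and weight arrays.
    d_new, w_new: the new scan's signed distance and weight per voxel
                  (voxels the scan does not observe carry zero weight).
    """
    W_out = W + w_new
    # Weighted average where any observation exists; untouched voxels keep D.
    D_out = np.where(W_out > 0,
                     (W * D + w_new * d_new) / np.maximum(W_out, 1e-12),
                     D)
    return D_out, W_out
```

Starting from an all-zero weight grid, the first scan simply overwrites the initial values, matching the initialization described above; the final surface is then extracted at the zero crossing of D.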
The numerical errors show that the SR3000 performs better than the Zcam, owing to better depth resolution and noise control.

This paper describes a complete 3D model reconstruction system built around a TOF depth sensor that is well suited to home-based applications thanks to its real-time performance, portability, and relatively high quality. In contrast to previous expensive, dedicated hardware, our system uses only the combination of a time-of-flight depth sensor and a regular camera, making it affordable to a wider range of users. In addition, its ability to capture dynamic objects in real time is a major advantage over laser scanners. We did not acquire ground truth for the person's head, because people cannot keep their head absolutely still during a scan; this is the drawback of laser scanners mentioned before. Our system solves this problem simply by asking the person to make a 360-degree rotation in front of our capture device.
Keywords/Search Tags:3D Model, 3D Reconstruction, Time of Flight Depth Sensor, Range Image, Alignment, Range Image Integration, ICP algorithm, Merge Range Data, Fill Hole