Researches And Applications Of Image Understanding Based On Structure And Appearance Models

Posted on: 2009-01-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M Liu
GTID: 1118360272976434
Subject: Computer application technology
Abstract Contents: The rapid development of computer technology has steadily extended human perception and information-processing capabilities. Computer vision, whose ultimate aim is to endow computers with the abilities of human vision, has both advantages and disadvantages compared with human vision. It is closely related to many fields, such as image processing and analysis, machine vision, and pattern recognition. Computer vision has two processing phases, low-level and high-level. Image understanding, the high-level phase, is in most cases treated as synonymous with computer vision in the narrow sense.

The methods of image understanding can be categorized as 'data-driven' and 'model-based'. In 'data-driven' methods, the image data is examined at a low level, looking for local structures such as edges or regions, which are assembled into groups in an attempt to identify objects of interest. Without a global model of what to expect, this approach is difficult and prone to failure. In contrast, 'model-based' methods can in principle use prior knowledge of the problem, such as the expected shapes and structures, their spatial relationships, and their grey-level appearance or other image features, to interpret high-level information, including the scene and the objects in it, plausibly. Using computational and mathematical representations, they overcome the difficulties of the data-driven approach.

'Model-based' methods have performed well in image understanding over the last decades. Depending on the objects of interest and the methodology, models may be rigid or deformable, two-dimensional or three-dimensional, point distribution models or function fitting models, and so on. In the real world, objects are either rigid or non-rigid.
Non-rigid objects are also called deformable objects, since the spatial relationships among their sub-parts may change. Modeling non-rigid objects is therefore more difficult than modeling rigid ones. Whatever objects are modeled, there are two main characteristics we would like such models to possess. First, they should be general: they should be capable of generating any plausible example of the class they represent. Second, and crucially, they should be specific: they should only be capable of generating 'legal' examples.

There is a substantial literature describing the use of deformable models. Active contour models, also called 'snakes', deform spline curves elastically to fit shape contours. 'Hand-crafted' models build up a whole model of an object from combinations of parameterized circles, arcs, lines, and so on. Articulated models are built from rigid components connected by sliding or rotating joints. Fourier models represent shapes by an expansion of trigonometric functions; by varying the parameters and the number of terms used, different shapes can be generated. Statistical shape models analyze and model the placement of landmarks and other information about them. Finite element models take a single shape, treat it as if it were made of an elastic material, and give a set of linear deformations of the shape equivalent to the resonant modes of vibration of the original shape. Active shape models and active appearance models describe objects with landmarks located on boundaries, together with the texture along the profile normal to the boundary or across the target object.

In this dissertation, we present a global structure constraint (GSC) model based on the associative ability of human beings and on the assumption that visual information is continuous. The key idea of GSC is to represent the global structure of target objects with a set of landmarks and their textures, which makes it a point distribution model.
During the modeling phase, we define the placement of the landmarks: select several essential areas with small texture variation and choose suitable landmarks to denote each area individually. The landmarks are then labeled on all training images, manually or semi-automatically, to obtain the training set of landmarks and textures. During the fitting phase, we define the transformation matrix, by specifying the meaning, number, and ranges of its parameters, and the metric function between the model and its instances on the target image. The model is then fitted to target images by an optimization strategy that searches the parameter space. The GSC model has several properties: it preserves the global structure information of objects with a small number of landmarks, it models deformation by adjusting parameters, and it measures the relationship between model and instance by combining color information and mutual information.

Active shape models and active appearance models are very popular, and both are based on point distribution models. Both place landmarks on boundaries with strong texture changes. Given a starting position, they search locally around it and obtain a detailed description. GSC, by contrast, discards edges and details in order to search globally and efficiently. We propose a new model that combines these two ideas to obtain rapid global localization together with a local detail search. This avoids a disadvantage of the active appearance model, namely its failure to converge to the correct solution from a poor starting position.

Biometric identification technology is becoming more and more important in our daily lives. Face recognition is one of its most popular subjects because it is direct, friendly, convenient, easy to use, and requires no conscious effort from the subject.
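As an illustration of the fitting phase described above, the following Python sketch places a set of landmarks under a similarity transform and scores the match with mutual information. It is a minimal sketch under stated assumptions, not the dissertation's implementation: the four parameters (tx, ty, scale, theta), the nearest-pixel sampling, the 8-bin histogram, and all function names are hypothetical, and the color term of the metric is omitted.

```python
import math
from collections import Counter


def transform_landmarks(landmarks, tx, ty, scale, theta):
    """Rotate by theta, scale, then translate each (x, y) landmark.

    Assumption: a 2D similarity transform; the source does not state
    which transformation or parameters the GSC model actually uses.
    """
    c, s = math.cos(theta), math.sin(theta)
    return [(scale * (c * x - s * y) + tx, scale * (s * x + c * y) + ty)
            for x, y in landmarks]


def sample_texture(image, points):
    """Grey levels of `image` (a list of rows) at the nearest pixel to
    each point, clamped to the image border (hypothetical sampling rule)."""
    height, width = len(image), len(image[0])
    samples = []
    for x, y in points:
        col = min(max(int(round(x)), 0), width - 1)
        row = min(max(int(round(y)), 0), height - 1)
        samples.append(image[row][col])
    return samples


def mutual_information(a, b, bins=8, max_value=256):
    """Mutual information in bits between two equal-length sequences of
    grey levels in [0, max_value), estimated from a joint histogram."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    bin_width = max_value / bins
    qa = [min(int(v / bin_width), bins - 1) for v in a]
    qb = [min(int(v / bin_width), bins - 1) for v in b]
    n = len(a)
    count_a, count_b = Counter(qa), Counter(qb)
    count_ab = Counter(zip(qa, qb))
    mi = 0.0
    for (i, j), count in count_ab.items():
        p_ij = count / n
        mi += p_ij * math.log2(p_ij / ((count_a[i] / n) * (count_b[j] / n)))
    return mi
```

A candidate parameter vector could then be scored as `mutual_information(model_texture, sample_texture(image, transform_landmarks(landmarks, tx, ty, scale, theta)))`, with a higher value indicating a better match between model and instance.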
We tested the models proposed in this dissertation on face localization using the FERET database, compared a genetic algorithm with a particle swarm optimization algorithm for the search, and demonstrated the effectiveness of the models.

CAPTCHA, which stands for 'Completely Automated Public Turing test to tell Computers and Humans Apart', can be used to protect websites against bots by distorting text so that humans can recognize it but computer programs cannot. We applied the new models in this field and achieved some meaningful results.

In the future, we intend to improve these models in terms of optimization algorithms and transformations, and to apply them to medical image processing. Furthermore, two-dimensional images are projections of the three-dimensional world we live in, so we will also focus on three-dimensional data processing and object modeling.
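The abstract names particle swarm optimization as one of the strategies compared for searching the parameter space. The sketch below shows one common form of that algorithm in plain Python. It is an illustrative assumption, not the variant used in the dissertation: the function name `pso_minimize`, the swarm size, the iteration count, and the inertia and attraction coefficients are all hypothetical choices.

```python
import random


def pso_minimize(cost, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize `cost` over a box given as a list of (low, high) pairs.

    Returns (best_position, best_cost). All default values are
    illustrative assumptions, not settings taken from the source.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Keep the particle inside the allowed parameter range.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            c = cost(pos[i])
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], c
                if c < g_cost:
                    g_pos, g_cost = pos[i][:], c
    return g_pos, g_cost
```

For model fitting, `cost` would typically be the negative of the match score between the model and its instance on the target image, and `bounds` would hold the range of each transformation parameter.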
Keywords/Search Tags: Image Understanding, Shape, Structure, Appearance, Active Appearance Model, CAPTCHA, Face Recognition