A Fast Face Retrieval And Recognition Method Based On DCT Compressed Domain

Posted on: 2008-10-03
Degree: Master
Type: Thesis
Country: China
Candidate: H Liu
Full Text: PDF
GTID: 2178360212496614
Subject: Computational Mathematics
Abstract/Summary:
Introduction
At present, almost all digital images are stored in compressed formats, among which the format defined by the Joint Photographic Experts Group (JPEG) is widely used on the Internet and in image databases. To detect or extract features from such compressed images, conventional approaches must first decode the images to the pixel domain before applying existing image processing and analysis techniques. This is both time consuming and computationally expensive, while improving the efficiency of indexing and retrieving compressed images is becoming more and more important. Therefore, a new wave of research effort is directed at feature extraction in the compressed domain. As the inverse DCT (IDCT) is an embedded part of the JPEG decoder, and the DCT itself is one of the best filters for feature extraction, working directly in the DCT domain remains the most promising approach to compressed image processing and retrieval. In addition, the DCT has useful properties such as energy compaction and image data decorrelation. Thus, direct feature extraction in the DCT domain can characterize the image content better, in addition to removing the need to decompress the image and detect its features in the pixel domain.

Formulas and Definitions
As the mean and the variance provide strong indications of how uniform or regular a region is, these two parameters play important roles in texture analysis across a wide range of image processing applications.
In fact, when regions are merged, the mean m1 and the variance μ2 are used to determine whether the growing regions are still homogeneous.

Given a block of N×N pixels inside an image, the intensity values can be represented as X = {x(i, j)}, i, j = 0, ..., N-1. If X is regarded as a random variable, the k-th moment of X on this block is defined as

\[ m_k = \frac{1}{N^2} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} x(i,j)^k \tag{1} \]

Further, the k-th central moment of X is defined in the pixel domain as

\[ \mu_k = \frac{1}{N^2} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left( x(i,j) - m_1 \right)^k \tag{2} \]

In statistics, the first-order moment m1 and the second-order central moment μ2 are referred to as the mean and the variance of the pixels inside this block. From their definitions in Eqs. (1) and (2), m1 represents the average intensity of the block, and μ2 the spread of intensity among the pixels inside this block. In addition, μ2 and m1 satisfy the following relationship:

\[ \mu_2 = m_2 - m_1^2 \tag{3} \]

To calculate the two parameters directly in the DCT domain, we start from the definition of the two-dimensional DCT on this block:

\[ F(u,v) = \frac{2}{N} C(u) C(v) \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} x(i,j) \cos\frac{(2i+1)u\pi}{2N} \cos\frac{(2j+1)v\pi}{2N} \tag{4} \]

where C(0) = 1/\sqrt{2} and C(u) = 1 for u > 0. Since F(0,0) = \frac{1}{N} \sum_{i} \sum_{j} x(i,j) = N m_1, we have

\[ m_1 = \frac{1}{N} F(0,0) \tag{5} \]

By comparing these statistics across DCT blocks, we can classify the blocks and thereby separate the original picture into different regions.

Definition of NMI
Suppose the M×N points of a digital gray-scale picture f(M, N) are M×N particles on the plane XOY, and the gray level f(i, j) of each point is the mass of the corresponding particle.
For each picture, we have the following definitions.

Definition 1
The sum of the gray levels of all points of a digital picture is defined as the mass of the picture, denoted m:

\[ m = \sum_{i} \sum_{j} f(i,j) \]

Definition 2
The center of gravity of the picture is the point (\bar{i}, \bar{j}) given by

\[ \bar{i} = \frac{1}{m} \sum_{i} \sum_{j} i \, f(i,j), \qquad \bar{j} = \frac{1}{m} \sum_{i} \sum_{j} j \, f(i,j) \]

Definition 3
Suppose the picture rotates about a point (i0, j0). We write J(i0, j0) for its moment of inertia about that point:

\[ J_{(i_0, j_0)} = \sum_{i} \sum_{j} \left[ (i - i_0)^2 + (j - j_0)^2 \right] f(i,j) \]

Using a thresholding transform, we can assign every point of the picture a new value, zero or one, and obtain a new two-valued (binary) digital picture.

Definition 4
Using the definitions of the mass, the center of gravity and the moment of inertia, the normalized moment of inertia (NMI) feature of a two-valued picture is

\[ \mathrm{NMI} = \frac{\sqrt{J_{(\bar{i}, \bar{j})}}}{m} \]

Get the DCT coefficients
We suppose the face database is stored in JPEG format, so that every picture in the database provides DCT coefficients. They can be taken from the JPEG decoding process before the IDCT step. In particular, we take the main information to lie in the DCT coefficients of the Y (luminance) component. Otherwise, if the picture is stored in DIB format, we first apply the DCT to obtain the coefficients.

Classify the DCT coefficient blocks
Owing to the characteristics of the JPEG format, the coefficients of a DCT block tend toward zero along the zigzag scanning route and are often discarded during compression. We therefore use the mean and the variance to classify the DCT blocks. Suppose the 8×8 block is F(u, v) and its mean is m; then

m = 1/8 F(0,0) (10)

The variance is denoted α, so m and α span a two-dimensional space. We divide the range of m into k intervals and the range of α into l intervals, so each block is assigned to one of the k×l classes.

NMI characteristic sequence
Through this classification, we transform the original M×N picture into a picture whose values range over the k×l classes; we call this picture the index picture. Furthermore, for each class, we set the value of a block to one if the block belongs to that class, and to zero otherwise.
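The block classification described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the thesis implementation: it assumes the orthonormal DCT normalization (the one consistent with m = F(0,0)/8), computes the variance from the AC energy via Parseval's relation (the summary does not reproduce the variance formula), assumes pixel values in 0..255 without the JPEG level shift, and uses uniform bins. The names `dct2`, `block_mean_var` and `classify` are hypothetical.

```python
import math

def dct2(block):
    """Orthonormal 2-D DCT-II of an n x n block given as a list of rows."""
    n = len(block)
    scale = lambda u: math.sqrt(0.5) if u == 0 else 1.0
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for i in range(n):
                for j in range(n):
                    s += (block[i][j]
                          * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * j + 1) * v * math.pi / (2 * n)))
            out[u][v] = (2.0 / n) * scale(u) * scale(v) * s
    return out

def block_mean_var(F):
    """Block mean from the DC term; variance from the AC energy.

    Assumption: with an orthonormal DCT, Parseval's relation gives
    variance = (sum of squared AC coefficients) / n^2.
    """
    n = len(F)
    mean = F[0][0] / n
    ac_energy = sum(F[u][v] ** 2
                    for u in range(n) for v in range(n)
                    if (u, v) != (0, 0))
    return mean, ac_energy / (n * n)

def classify(mean, var, k, l, mean_max=255.0, var_max=127.5 ** 2):
    """Map (mean, var) to a class (a, b) among k x l classes.

    Assumption: uniform bins over [0, mean_max] and [0, var_max];
    the source does not say how the intervals are chosen.
    """
    a = min(max(int(mean / mean_max * k), 0), k - 1)
    b = min(max(int(var / var_max * l), 0), l - 1)
    return a, b
```

Applying `classify(*block_mean_var(F), k, l)` to every 8×8 block of a picture yields the index picture; when the coefficients come straight from a JPEG stream, `dct2` is not needed at all.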
In this way, we transform the index picture into k×l new pictures, each with only two values, as shown in the figure. Calculating the NMI of each of these pictures gives a sequence of k×l values, which we use to represent the characteristics of the original picture.

Compare the sequences
We compare two sequences using a distance defined on them.

Application
We apply the method to face recognition and retrieval, using the ORL face database. We transform every picture in the database and obtain its NMI coefficients. Evaluating the final results of the experiment, we reach our target: the recognition rates are 72.6% and 76.59%.

The characteristics of the method
First, the method uses the DCT coefficients as the features of the picture, so no IDCT computation is needed. At the same time, faces are recognized using the distance between two sequences, which saves computation time. Second, the method forms the sequence for each picture on its own, independently of all other pictures, so new sequences can easily be added to the feature database. Third, the recognition rate is acceptable. Furthermore, even if the result is not correct, we can still find the result closest to the correct one.
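The NMI sequence and the distance-based matching summarized above can be sketched as follows. This is a sketch under assumptions: the NMI is taken in its common form, the square root of the moment of inertia about the center of gravity divided by the mass; an empty binary map is given NMI 0 by convention; and the Euclidean distance is an assumed choice, since the summary does not state which distance is used. All function names are hypothetical.

```python
import math

def nmi(img):
    """Normalized moment of inertia of a 2-D picture given as a list of rows."""
    mass = sum(sum(row) for row in img)
    if mass == 0:
        return 0.0  # assumed convention for a class with no blocks
    ci = sum(i * p for i, row in enumerate(img) for p in row) / mass
    cj = sum(j * p for row in img for j, p in enumerate(row)) / mass
    inertia = sum(((i - ci) ** 2 + (j - cj) ** 2) * p
                  for i, row in enumerate(img)
                  for j, p in enumerate(row))
    return math.sqrt(inertia) / mass

def nmi_sequence(index_picture, k, l):
    """One NMI value per class: binarize the index picture against each class."""
    seq = []
    for a in range(k):
        for b in range(l):
            binary = [[1 if cls == (a, b) else 0 for cls in row]
                      for row in index_picture]
            seq.append(nmi(binary))
    return seq

def distance(s, t):
    """Euclidean distance between two NMI sequences (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(s, t)))

def retrieve(query_seq, database):
    """Return the label whose stored sequence is closest to the query."""
    return min(database, key=lambda label: distance(query_seq, database[label]))
```

Here `index_picture` is a grid of class pairs `(a, b)`, one per 8×8 block, and `database` maps a face label to its stored sequence. Because each sequence depends only on its own picture, adding a new face means adding one more dictionary entry.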
Keywords/Search Tags: Recognition