Multimedia Retrieval And Classification Research Based On Invariant Features

Posted on: 2013-01-23
Degree: Master
Type: Thesis
Country: China
Candidate: J Wang
Full Text: PDF
GTID: 2248330362961834
Subject: Information and Communication Engineering
Abstract/Summary:
Simple image features cannot remain invariant under affine transformation, zoom, and pan, so they no longer meet practical needs. Image invariant features have therefore received wide attention for their invariance to affine transformation, zoom, and pan. Using them can significantly improve and accelerate existing image recognition methods. They are gradually becoming the mainstream approach to content analysis and understanding, and are widely used in video retrieval, image classification, target identification, and other areas.

We focus on the application of image invariant features in video retrieval and image classification, and propose an algorithm based on image invariant features for each. The research points and corresponding results are as follows:

(1) Face retrieval in video. We propose a face retrieval method, FRIVAP, that achieves good results under noise and pose variations. First, we obtain a separate set of face images for each person and extract pseudo-Zernike moments from them. Second, we generate an affine hull from the features of each image set. Finally, we find the most similar result by calculating the distance between the query face and the affine hulls. Experiments on the Honda video database and the FRID database show that the algorithm not only uses both temporal and spatial information but also performs well under noise and pose variations.

(2) Image classification. We propose a method based on object regions and bag of words (BOW). The main idea is as follows. First, we detect and segment the object regions of the image. Second, we extract features by dense sampling, which yields 128-dimensional feature vectors for the image. We then strengthen the features of the object region to obtain object-enhanced features. Next, we build a codebook with the K-means method and generate the BOW representation of the image. We then decompose the spatial domain of the image by pyramid segmentation to obtain its spatial histograms. Finally, we apply an SVM to classify the image.
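The distance computation in step (1) can be illustrated with a minimal sketch. It assumes each person's image set is an array of feature vectors (for example pseudo-Zernike moments, whose extraction is not shown) and that the query face is a single feature vector. The names `affine_hull_distance` and `retrieve` are hypothetical, and the least-squares projection used here is one common way to measure the distance from a point to an affine hull, not necessarily the exact formulation used in FRIVAP.

```python
import numpy as np

def affine_hull_distance(query, samples):
    """Euclidean distance from `query` (d,) to the affine hull of the rows of `samples` (n, d)."""
    samples = np.asarray(samples, dtype=float)
    query = np.asarray(query, dtype=float)
    mean = samples.mean(axis=0)
    # Columns span the directions of the affine hull around its mean.
    directions = (samples - mean).T
    offset = query - mean
    # Least squares gives the closest point of the hull to the query.
    coeffs, *_ = np.linalg.lstsq(directions, offset, rcond=None)
    residual = offset - directions @ coeffs
    return float(np.linalg.norm(residual))

def retrieve(query, galleries):
    """Return the key of the image set whose affine hull is nearest to `query`.

    `galleries` maps a person label to an (n, d) array of feature vectors
    (an assumed data layout, chosen for illustration).
    """
    return min(galleries, key=lambda name: affine_hull_distance(query, galleries[name]))
```

For example, `retrieve(q, {"A": feats_a, "B": feats_b})` returns the label whose hull lies closest to `q`. Because the hull is unbounded, a query can match a set even when it lies far from every individual sample, which is part of what makes set-based matching tolerant to pose changes.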
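The BOW and pyramid steps in (2) can be sketched in the same spirit. The sketch assumes the dense descriptors, their normalized (x, y) positions in [0, 1], and a K-means codebook are already available. The function names are hypothetical, and the `weights` argument is an assumed stand-in for the object enhancement step, whose details the abstract does not give.

```python
import numpy as np

def assign_words(descriptors, codebook):
    """Map each descriptor (n, d) to the index of its nearest codeword (k, d)."""
    descriptors = np.asarray(descriptors, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    # Squared Euclidean distance between every descriptor and every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def spatial_pyramid_histogram(words, positions, vocab_size, levels=2, weights=None):
    """Concatenate per-cell word histograms over a 1x1, 2x2, 4x4, ... grid.

    `positions` holds (x, y) coordinates normalized to [0, 1].
    `weights` (assumed) lets descriptors inside the object region count more.
    """
    words = np.asarray(words, dtype=int)
    positions = np.asarray(positions, dtype=float)
    if weights is None:
        weights = np.ones(len(words))
    parts = []
    for level in range(levels + 1):
        cells = 2 ** level
        # Grid cell of each descriptor; clamp so that 1.0 falls in the last cell.
        col = np.minimum((positions[:, 0] * cells).astype(int), cells - 1)
        row = np.minimum((positions[:, 1] * cells).astype(int), cells - 1)
        flat = (row * cells + col) * vocab_size + words
        parts.append(np.bincount(flat, weights=weights,
                                 minlength=cells * cells * vocab_size))
    feature = np.concatenate(parts)
    total = feature.sum()
    return feature / total if total > 0 else feature
```

The resulting vector has `vocab_size * (1 + 4 + ... + 4**levels)` entries and would serve as the input to the SVM classifier in the final step.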
Keywords/Search Tags: Image retrieval, Image classification, Bag of Visual Words, Pseudo-Zernike moments