Study of an Image Retrieval System Based on Clustering Index

Posted on: 2005-06-19
Degree: Master
Type: Thesis
Country: China
Candidate: P Z Zhang
GTID: 2168360125950832
Subject: Communication and Information System
Abstract/Summary:
A content-based image retrieval (CBIR) system differs from a traditional text-based retrieval system. An image is indexed not by text but by visual features such as color, texture, shape, and local features. Users typically search for similar images by supplying an example image. Retrieval results are returned ranked by degree of similarity, which is computed with a distance function in the feature vector space. Image features are generally represented as points or vectors in a high-dimensional space, and the distances between vectors reflect the similarity of the corresponding images, so content-based image retrieval can be reduced to a fast search for the nearest vectors. Speed is an important issue in an image retrieval system. Images carry a large amount of information and image databases are large, so searching the images by sequential scanning (SSA) is computationally expensive and time-consuming. Indexing methods have been studied widely as a means of improving retrieval speed. The pioneering system is IBM's QBIC, which uses the Karhunen-Loeve transformation (KLT) to reduce dimensionality and then builds an index with the R*-tree algorithm. The VisualSEEk system, developed at Columbia University, builds its index with a binary tree algorithm and retrieves images based on spatial information. Other indexing algorithms include the quad-tree, k-d tree, grid tree, multi-binary tree, and cell indexing algorithms. These early indexing algorithms are simple but not suited to building the index of a multimedia system. At present, the popular indexing algorithms are the R-tree, the linear quad-tree, and the grid file. Among them, the R-tree family, including the R*-tree, R+-tree, SR-tree, and SS-tree, is the most effective. Even with these methods, however, speed is poor and can degrade to nearly that of SSA.
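To make the nearest-vector formulation concrete, the following is a minimal Python sketch of the exhaustive sequential search that the abstract treats as the slow baseline. It is an illustration only, not code from the thesis: the function names, the image identifiers, and the 3-dimensional feature vectors are all hypothetical, and Euclidean distance is just one possible choice of distance function.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sequential_scan(query, database, k=3, distance=euclidean):
    """Return the k images nearest to the query, ranked by distance.

    Every stored vector is compared with the query, so the cost grows
    linearly with the number of images in the database.
    """
    scored = [(distance(query, vec), image_id)
              for image_id, vec in database.items()]
    scored.sort()  # smallest distance (most similar) first
    return [(image_id, dist) for dist, image_id in scored[:k]]


# Hypothetical feature vectors, for illustration only.
database = {
    "img_a": [0.0, 0.0, 0.0],
    "img_b": [1.0, 0.0, 0.0],
    "img_c": [0.0, 3.0, 4.0],
    "img_d": [0.5, 0.5, 0.0],
}
results = sequential_scan([0.0, 0.0, 0.0], database, k=2)
```

Because `distance` is a parameter, a non-Euclidean measure can be substituted without changing the scan itself. An index exists to avoid computing that distance for every stored image.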
For these methods to serve as an effective index in an image retrieval system, the feature dimension must be reduced to below 20, but much useful information is lost if the feature dimension is reduced blindly. In addition, these indexing methods support only Euclidean distance for similarity retrieval: they are suitable for building the index of an image database, but they do not address non-Euclidean distances for measuring image similarity. An indexing technique based on clustering is therefore proposed. This technique has a dynamic structure, can handle high-dimensional data, and supports retrieval with non-Euclidean distances. At present, research on content-based indexing techniques concentrates mostly on the algorithms. The classical algorithms are k-means, ISODATA, maxima tree, dynamic clustering, and adaptive algorithms. K-means is widely used as a clustering algorithm. Of the many methods available for clustering feature vectors, the K-means algorithm is the quickest and simplest, and it can process large volumes of data efficiently. Despite its wide use, the K-means algorithm has drawbacks: it depends on the order of the feature vectors and requires the number of clusters K to be given in advance, which is not always practical in real-world applications. The fuzzy C-means (FCM) algorithm overcomes these drawbacks, but its initial centroids are chosen randomly, which makes the clustering result unstable, especially when the database contains a great many images. In this paper, we propose a modified fuzzy C-means (MFCM) scheme, which shows how to obtain reasonable initial centroids from the feature vectors of real images; furthermore, it can assign, eliminate, split, unite, merge, and insert feature vectors dynamically.
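For reference, the standard fuzzy C-means iteration can be sketched in Python as below. This is the textbook FCM with randomly chosen initial centroids, that is, the baseline whose instability the abstract criticizes. It is not the MFCM scheme, whose initialization and dynamic operations are not specified here. The function names, the fuzzifier `m = 2.0`, the fixed `seed`, and the sample points are assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def memberships(points, centroids, m):
    """Membership of each point in each cluster; every row sums to 1."""
    exponent = 1.0 / (m - 1.0)
    u = []
    for p in points:
        d2 = [sq_dist(p, ctr) for ctr in centroids]
        if any(d == 0.0 for d in d2):
            # The point sits exactly on a centroid: full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in d2]
            total = sum(hits)
            u.append([h / total for h in hits])
        else:
            inv = [(1.0 / d) ** exponent for d in d2]
            total = sum(inv)
            u.append([v / total for v in inv])
    return u


def update_centroids(points, u, c, m):
    """Each centroid is the mean of all points weighted by u ** m."""
    dim = len(points[0])
    centroids = []
    for j in range(c):
        w = [row[j] ** m for row in u]
        wsum = sum(w)
        centroids.append([
            sum(wi * p[k] for wi, p in zip(w, points)) / wsum
            for k in range(dim)
        ])
    return centroids


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-9, seed=0):
    """Standard FCM; initial centroids are c randomly sampled points."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, c)]
    for _ in range(iters):
        u = memberships(points, centroids, m)
        new_centroids = update_centroids(points, u, c, m)
        shift = max(sq_dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < tol:  # centroids have stopped moving
            break
    return centroids, memberships(points, centroids, m)


# Two well-separated groups of hypothetical 2-D feature vectors.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centroids, u = fuzzy_c_means(points, c=2)
labels = [row.index(max(row)) for row in u]  # hardened cluster labels
```

Changing `seed` changes the initial centroids. On larger, less cleanly separated data this can change the final clustering, which is the instability that motivates choosing initial centroids from the data in a more principled way.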
Experiments show that with the MFCM scheme, image retrieval time does not increase linearly as the image database grows, so MFCM can be used effectively in an image retrieval system. High feature dimensionality is the main factor affecting the speed of clustering and retrieval, so dimension reduction is another issue in t...
Keywords/Search Tags: content-based image retrieval (CBIR), clustering index, dimension reduction, fuzzy C-means (FCM), modified fuzzy C-means (MFCM), FastMap projection algorithm (FMPA), Karhunen-Loeve transformation (KLT), similarity measurement, weight