Font Size: a A A

Database support for queries by image content

Posted on:2003-02-17Degree:Ph.DType:Thesis
University:The University of Wisconsin - MadisonCandidate:Shaft, UriFull Text:PDF
GTID:2468390011480406Subject:Computer Science
Abstract/Summary:
This thesis contributes answers to three interesting questions. These questions illustrate the fundamental difficulties that images add to traditional database management systems.; What makes a general purpose image database management system (IDBMS) so difficult to develop? A general purpose IDBMS should support storage of large collections of images and their associated features, as well as efficient searches over the image features. The storage and retrieval of large volumes of data is a traditional database research problem. However, the types of features we can associate with images is a traditional computer vision problem. Our contribution is the PiQ IDBMS, which bridges the two disciplines. We developed an image data model and integrated it with an automatic, extensible feature extraction framework that organizes extracted features into database tables. PiQ's extensibility allows it to utilize state-of-the-art feature extraction algorithms, and it leverages the DBMS for robustness and scalability.; When is nearest neighbors meaningful? Many image similarity measures proposed in the literature exhibit poor performance; images retrieved as being similar to a query image often seem to be randomly drawn from the database. Image similarity is usually mapped to the nearest neighbors problem. We introduce the notion of nearest neighbors “instability” as an indicator of a poor similarity measure. Our contribution is a theoretical result giving simple conditions under which nearest neighbors workloads become unstable. This result can aid in developing meaningful similarity measures.; When is nearest neighbors indexable? There are many publications about multidimensional indexing structures, designed for solving the nearest neighbors problem. These indexing structures invariably fail as the dimensionality of the workloads increases. This phenomenon is sometimes called “the curse of dimensionality”. We present a theoretical result that explains why these indexing structures do not scale with dimensionality. We prove that the nearest neighbors workloads commonly used in the literature can not be indexed by a wide class of multidimensional indexing structures, including all R-Tree variants, the SS-Tree, SR-Tree and others.
Keywords/Search Tags:Image, Database, Indexing structures, Nearest neighbors
Related items