Indexing and mining multimedia databases

Posted on:1999-10-16

Degree:Ph.D

Type:Thesis

University:University of Maryland, College Park

Candidate:Korn, Philip Russell (Flip)

Full Text:PDF

GTID:2468390014470988

Subject:Computer Science

Abstract/Summary:

In this thesis we examine methods for two related problems: (a) indexing multimedia databases for fast searching, and (b) mining data from a multimedia database to extract patterns, correlations, rules and outliers.; We first investigate the problem of retrieving similar shapes from a large database; in particular, we focus on medical tumor shapes ('Find tumors that are similar to a given pattern.'). We use a natural similarity function for shape matching based on concepts from mathematical morphology and show how it can be lower-bounded by a set of shape features for safely pruning candidates, thus giving fast and correct output. These features can be organized in a spatial access method, leading to fast indexing for range queries and nearest neighbor queries. In addition to the lower-bounding, our second contribution is the design of a fast algorithm for nearest neighbor search, achieving significant speedup while provably guaranteeing correctness. Our experiments demonstrate that roughly 90% of the candidates can be pruned using these techniques, resulting in up to 27 times better performance compared to sequential scan.; The second problem is data mining. We propose methods for efficiently finding 'interesting things' from large data sets in the form of summaries, outliers, and rules, based on powerful tools from statistics. Our first goal was to compress a very large data set in a manner that efficiently supports ad hoc queries, as is often desired for data mining. Compressed data is notoriously difficult to access randomly. We show how this can be done in three passes, provided that a small amount of lossiness can be tolerated. The proposed method, called SVDD, achieves an astounding 0.5% reconstruction error with a space requirement under 2% (i.e., a 50:1 compression ratio).; A good data compression method naturally leads to data mining. All compression methods operate by detecting patterns, or rules, which are exploited to reduce redundancy. In the last part of this thesis, we consider the problem of discovering, interpreting, and using such rules, given data organized in a matrix of, e.g., customers x products. We propose a new paradigm, namely, Ratio Rules, which are quantifiable in that one can measure the ability of the rules to recover values that are pretended to be unknown; this is in contrast to the well known association rules. Another contribution is a novel method to guess missing values from the rules. (Abstract shortened by UMI.)...

Keywords/Search Tags:

Data, Mining, Indexing, Multimedia, Rules, Method, Fast

Related items

1	Research And Application Of Algorithm Of Fast Mining Association Rules In DM
2	The Research And Application Of Data Mining In Mining Rules Of Medical Diagnosis
3	The Research On The Fast Class Association Rules Algorithm
4	Conditioning And Fast Three Traversal-based Association Rules Mining Algorithm, Based On Space-time
5	Web-based Multimedia Data Mining Research
6	Research Of Method On Mining Valid Association Rules
7	Research On Association Rules Mining In Data Streams And Its Application
8	Research On Data Mining Based Decision Rules And Association Rules
9	Research On Mining Technology Of Association Rules And Meta-Rules
10	Research Of Data Mining Based On Relational Rules