A guided, low-latency, and relevance propagation framework for interactive multimedia search

Posted on: 2011-02-23
Degree: Ph.D
Type: Thesis
University: Columbia University
Candidate: Zavesky, Eric
Full Text: PDF
GTID: 2448390002451659
Subject: Engineering
Abstract/Summary:
This thesis investigates a number of problems associated with efficient and engaging ways of executing a multi-level interactive multimedia search. These problems are of interest as the availability of multimedia sources, both professional and personal, continues to grow in tandem with the need for users to search these libraries for consumable entertainment, captured personal memories, and automatically recorded events with little or no forethought given to manual indexing.

The goal of this thesis is to develop a framework that guides the user through his or her search process by providing dynamic suggestions and information from automatic algorithms, while simultaneously leveraging cues observed during the search process to provide a customized set of results that most precisely matches the user's search target. In achieving this goal, the system aids the user through both explicit interaction and subsequent result personalization from implicit search choices. A prototype of the proposed system, called CuZero, has been implemented and evaluated across multiple challenging databases to discover new search techniques previously unavailable.

To address problems in traditional query formulation, a system that interactively guides the user is proposed. While previous works allow a user to specify different modalities for a multimedia search, such as textual keywords and image examples, this work also introduces a large library of 374 semantic concepts. Semantic concepts use pre-trained visual models to bridge the gap in perception between what a machine computes for a multimedia document and what a user can do with that computation. For example, a user need only utilize the concept "crowd" to return content containing large numbers of people attending a basketball tournament, a political protest, or an exclusive fashion show.
Building on the familiar technique of text entry (typing in text keywords), the system returns a small subset of dynamically suggested concepts from a lexical mapping and statistical expansion of the user's entered text. These suggestions both engage and inform the user about what the system has indexed with respect to the current query text. Additionally, the introduction of a unique query visualization panel allows the user to interactively include arbitrary modalities (text, images, concepts, etc.) in his or her query.

After a query is formulated through this guided and informative process, the formulation panel is subsequently used for query navigation, allowing the user to instantly review numerous query permutations with no perceived latency. With the intuitive mantra "closer to something is more like it", the user is prepared to instantly change the weights of the various parameters in his or her query. To accommodate this flexibility, previous interactive search systems burdened the user with a secondary query specification stage to tweak individual modality weights. The proposed approach to result browsing instead allows the user to navigate the query and result space in parallel, spanning a wide breadth of query permutations or a deep result depth for any one query permutation. Another classic barrier in multimedia search is the sensible inclusion of new search modalities; if no longer constrained to color or text cues, how can one include motion, audio, and local object similarity that have no textual correspondence? Fortunately, the proposed query navigation panel was created in such a way that any modalities developed in the future can be included with no additional algorithmic changes.
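The "closer to something is more like it" idea can be illustrated with a minimal sketch. It assumes each query element is pinned at a 2D anchor on the panel and that weights follow normalized inverse distance from the cursor. The abstract does not specify the weighting function, so the anchor layout, the function names `modality_weights` and `fuse_scores`, and the example scores are all hypothetical and are not taken from CuZero.

```python
import math

def modality_weights(cursor, anchors, eps=1e-6):
    """Return weights that sum to 1; an anchor closer to the cursor weighs more.

    Inverse-distance weighting is an assumption made for this sketch.
    """
    inverse = {name: 1.0 / (math.dist(cursor, pos) + eps)
               for name, pos in anchors.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}

def fuse_scores(weights, scores_by_modality):
    """Rank documents by the weighted sum of their per-modality scores."""
    fused = {}
    for name, weight in weights.items():
        for doc, score in scores_by_modality[name].items():
            fused[doc] = fused.get(doc, 0.0) + weight * score
    return sorted(fused, key=fused.get, reverse=True)

# Hypothetical panel layout: one anchor per query element, of any modality.
anchors = {
    "text:basketball": (0.0, 0.0),
    "concept:crowd": (1.0, 0.0),
    "image:example": (0.5, 1.0),
}

# Hypothetical precomputed relevance scores per modality and document.
scores = {
    "text:basketball": {"shot_a": 0.9, "shot_b": 0.2, "shot_c": 0.1},
    "concept:crowd": {"shot_a": 0.1, "shot_b": 0.8, "shot_c": 0.3},
    "image:example": {"shot_a": 0.2, "shot_b": 0.3, "shot_c": 0.9},
}

near_crowd = modality_weights((0.95, 0.05), anchors)
near_text = modality_weights((0.05, 0.05), anchors)
ranking_near_crowd = fuse_scores(near_crowd, scores)
ranking_near_text = fuse_scores(near_text, scores)
```

Because the weighting only needs an anchor position and a score table per query element, a new modality enters the fusion without any change to either function, which is one way to read the extensibility claim above.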
This flexibility is best exemplified during the result browsing process, where a user can add another image for example-based search, or a personalized snapshot of seen results, to the query to quickly home in on desirable results.

A final proposal in this work is a scalable and real-time result personalization technique. State-of-the-art graph-based label propagation is aided by data approximation techniques in a proposed algorithm that achieves higher accuracy in only a small fraction of the computation time when evaluated on a standard benchmark dataset. Using the real-time implementation of this technique, user search results can be personalized without the need to solicit result preferences en masse. (Abstract shortened by UMI.)...
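For readers unfamiliar with graph-based label propagation, the following is a minimal sketch of the basic iterative form with clamped seeds. It is not the thesis's algorithm: the data approximation step that provides the reported speedup is not described in the abstract and is not reproduced here. The Gaussian affinity, the function names, and the toy feature points are assumptions chosen for illustration.

```python
import math

def gaussian_affinity(points, sigma=1.0):
    """Dense similarity matrix with a Gaussian kernel and zero self-similarity."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                w[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return w

def propagate_labels(w, seeds, iterations=50):
    """Spread seed labels (+1 relevant, -1 irrelevant) across the graph.

    Seed nodes are clamped to their label on every pass; every other node
    takes the affinity-weighted average of its neighbours' current scores.
    """
    n = len(w)
    scores = [seeds.get(i, 0.0) for i in range(n)]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            if i in seeds:
                updated.append(seeds[i])
            else:
                total = sum(w[i])
                weighted = sum(w[i][j] * scores[j] for j in range(n))
                updated.append(weighted / total if total else 0.0)
        scores = updated
    return scores

# Toy feature vectors forming two well-separated groups of results.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (3.0, 3.0), (3.1, 2.9), (2.9, 3.2)]

# One implicit cue per group: result 0 was kept, result 3 was skipped.
seeds = {0: 1.0, 3: -1.0}

scores = propagate_labels(gaussian_affinity(points), seeds)
```

In this toy case the unlabeled results near result 0 end with positive scores and those near result 3 with negative scores, so two implicit cues are enough to reorder all six results.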
Keywords/Search Tags:Search, Multimedia, User, Interactive, Query, Result, Work