Many problems in multimedia document analysis and search can be modeled as interactive prediction of semantic information. As a result, research in this area can be broadly decomposed into three critical system components: Interaction Strategies, Search/Learning Strategies, and Representation. This dissertation seeks to contribute to each of these components in image search, image annotation, and multimodal document analysis application scenarios. Several new techniques for search, interaction, and representation are presented and evaluated empirically. Finally, an overview of the contributions and thoughts on future work in this field are given.