| With the development of multimedia and Internet technology, the volume of digital images is growing rapidly. Consequently, Content-Based Image Retrieval (CBIR) has drawn substantial attention in recent decades. CBIR plays a crucial role in many domains, such as military affairs, biomedicine, information security, remote sensing, and art imaging. However, the gap between low-level visual features and high-level semantic concepts usually leads to poor retrieval performance. Interactive semantic inference, a powerful technique for bridging this gap, focuses on mining semantic cues from human-computer interaction. This dissertation provides a comprehensive survey of semantic inference techniques in the literature and discusses the important aspects of this domain. Specifically, three open problems are investigated: enhancing the generalization ability of the inference system when only a few training examples are available, developing an efficient solution for the asymmetric distribution between the positive and negative classes, and combining short-term and long-term learning for image retrieval. The main contributions of this dissertation are summarized as follows. First, SVM active learning within a biased semi-supervised boosting framework, SA2S2 for short, is proposed. This algorithm aims to enhance the generalization ability of the learning system by integrating the merits of semi-supervised learning, ensemble learning, and active learning, a combination referred to as the hybrid learning paradigm. Moreover, a bias-weighting mechanism is developed to guide the ensemble model to pay more attention to relevant images than to irrelevant images. Experimental results show that the proposed hybrid learning and biased ensemble strategies are effective in improving CBIR performance. Second, SVM active learning based on a semi-supervised ensemble with bias, (SE)2A for short, is proposed.
Like SA2S2, (SE)2A employs hybrid learning and biased ensemble strategies. However, based on an analysis of the asymmetric distribution between relevant and irrelevant images in the database, (SE)2A uses a very simple learning strategy to select confident unlabeled images and a parallel structure to build the ensemble. As a result, the time-consuming learning process is avoided. Experimental results show that (SE)2A learns quickly, generalizes well, and outperforms some prevailing interactive semantic inference algorithms. Finally, an analysis of the state of the art in interactive semantic inference algorithms (including both short-term and long-term learning methods) reveals the "dislocation" problem of retrieval results. To address this problem, a collaborative learning algorithm between visual content and hidden semantics, named CoSim, is proposed. Concretely, the semantic similarity is first learned from log data and serves as prior knowledge. Then, the visual similarity is learned from a mixture of labeled and unlabeled images. In particular, unlabeled images are exploited in different ways for the relevant and irrelevant classes. Finally, the collaborative learning similarity is produced by integrating the visual similarity and the semantic similarity in a nonlinear way. Theoretical analysis and empirical study show that CoSim handles the dislocation problem well and significantly outperforms some existing approaches. |
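
Both SA2S2 and (SE)2A build on SVM active learning, in which the system asks the user to label the images it is least sure about. The abstract does not give the selection rule, so the following is only a minimal sketch of the generic margin-based criterion commonly used in SVM active learning, not the dissertation's own procedure. The function name `select_most_uncertain`, the image identifiers, and the decision values are all hypothetical and chosen for illustration.

```python
from typing import Dict, List


def select_most_uncertain(scores: Dict[str, float], k: int) -> List[str]:
    """Pick the k unlabeled images closest to the SVM decision boundary.

    `scores` maps a (hypothetical) image id to its signed SVM decision
    value: positive means "predicted relevant", negative means
    "predicted irrelevant". A value near zero means the classifier is
    least certain, so labeling that image is assumed to be most useful.
    """
    # Sort by absolute decision value; ties are broken by id so the
    # result is deterministic.
    ranked = sorted(scores, key=lambda image_id: (abs(scores[image_id]), image_id))
    return ranked[:k]


# Illustrative decision values (assumed, not taken from the dissertation).
unlabeled_scores = {
    "img_a": 2.1,
    "img_b": -0.1,
    "img_c": 0.4,
    "img_d": -1.7,
    "img_e": 0.05,
}

to_label = select_most_uncertain(unlabeled_scores, k=2)
print(to_label)
```

In a feedback round, the images returned here would be shown to the user for relevance labeling, and the classifier would then be retrained on the enlarged labeled set. How the bias weighting and semi-supervised steps interact with this selection is specific to the dissertation and is not reproduced in this sketch.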