
The Research On Memory Indexing And Integration Of Clustering Technology In Web Image Retrieval

Posted on: 2009-10-01
Degree: Master
Type: Thesis
Country: China
Candidate: F Luo
Full Text: PDF
GTID: 2178360278964238
Subject: Computer software and theory
Abstract/Summary:
With advances in computer technology and growing network bandwidth, the number of images on the Web keeps increasing. Most Web images are embedded in pages, so together they constitute a huge "Web image database." Web image retrieval helps users quickly find the information they need in the complex Web environment. The bottlenecks of current Web image retrieval are how to increase efficiency and how to annotate images with semantics.

Text-Based Image Retrieval (TBIR) is the main technology in current commercial image search engines; it relies on text alone to retrieve Web images indirectly. In contrast, Content-Based Image Retrieval (CBIR) has recently received a great deal of interest in the research community. Its major challenge is the semantic gap problem, i.e. the gap between low-level visual features and high-level semantic concepts.

We propose a memory indexing algorithm for Web images, based on an approximation algorithm for the Earth Mover's Distance (EMD). The high-dimensionality problem is mitigated by using weighted average centers, and a balanced binary search tree is built as an in-memory index. Because the index is stored in memory, frequent disk I/O is avoided and retrieval speed improves significantly. By improving the system's retrieval model, a global retrieval model is proposed. First, the query is scoped using the K-Nearest Neighbor (KNN) algorithm: many cluster centers that do not affect the query result are filtered out, which reduces the number of matching operations. Second, the EMD algorithm is used to find the K cluster centers that are similar to the sample image. This yields better results in less time than the hierarchical retrieval model.

Because Web images are clearly multi-modal, a multi-modal integrated clustering method based on both the content features and the textual features of images is proposed. The key idea is to use content features and textual features together during clustering. The method can thus simultaneously leverage all types of data related to a Web image, exploit their mutual reinforcement, and construct associations between textual features and content features to bridge the semantic gap. This significantly improves the accuracy of image annotation and places similar Web images in the same cluster as far as possible, which in turn improves retrieval accuracy.

Tests on the VisuAl & SemanTic image search (VAST) system show that the memory indexing method for Web images takes only about one third of the time of the original retrieval while maintaining high precision. The integrated clustering achieves relatively good retrieval results relative to the sequential retrieval scheme, with a precision of 98.1 percent.
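The filter-then-refine idea behind the memory index can be illustrated with a small sketch. It is not the thesis's algorithm, whose details the abstract does not give. It assumes one-dimensional signatures with total weight 1, so that the exact EMD has a simple closed form, and it relies on a known property of the EMD: for equal-mass signatures, the distance between weighted average centers never exceeds the EMD. A sorted in-memory list searched with `bisect` stands in for the balanced binary search tree. The names `centroid`, `emd_1d`, and `CentroidIndex` are hypothetical.

```python
import bisect
import heapq


def centroid(sig):
    """Weighted average center of a signature [(position, weight), ...]."""
    total = sum(w for _, w in sig)
    return sum(x * w for x, w in sig) / total


def emd_1d(p, q):
    """Exact EMD between two 1-D signatures, each with total weight 1,
    using ground distance |x - y| (the area between the two CDFs)."""
    events = sorted([(x, w) for x, w in p] + [(x, -w) for x, w in q])
    cost, carried, prev_x = 0.0, 0.0, events[0][0]
    for x, w in events:
        cost += abs(carried) * (x - prev_x)  # mass still in transit over this gap
        carried += w
        prev_x = x
    return cost


class CentroidIndex:
    """In-memory index keyed by centroid (assumption: a sorted list
    replaces the balanced binary search tree named in the abstract)."""

    def __init__(self, signatures):
        pairs = sorted((centroid(s), i) for i, s in enumerate(signatures))
        self.keys = [c for c, _ in pairs]
        self.ids = [i for _, i in pairs]
        self.signatures = signatures

    def knn(self, query, k):
        """Return the k nearest signatures by EMD and the number of EMD calls."""
        qc = centroid(query)
        right = bisect.bisect_left(self.keys, qc)
        left = right - 1
        best = []  # max-heap through negated distances: (-emd, id)
        evaluated = 0
        while left >= 0 or right < len(self.keys):
            gap_l = qc - self.keys[left] if left >= 0 else float("inf")
            gap_r = self.keys[right] - qc if right < len(self.keys) else float("inf")
            # Visit candidates in order of increasing centroid distance.
            if gap_l <= gap_r:
                bound, pos = gap_l, left
                left -= 1
            else:
                bound, pos = gap_r, right
                right += 1
            # The centroid gap is a lower bound on the EMD, so once it reaches
            # the current k-th best distance no remaining candidate can win.
            if len(best) == k and bound >= -best[0][0]:
                break
            d = emd_1d(query, self.signatures[self.ids[pos]])
            evaluated += 1
            heapq.heappush(best, (-d, self.ids[pos]))
            if len(best) > k:
                heapq.heappop(best)
        return sorted((-nd, i) for nd, i in best), evaluated


# Illustrative data only: five toy "cluster centers" and one query signature.
database = [
    [(0.0, 0.5), (1.0, 0.5)],
    [(2.0, 1.0)],
    [(4.0, 0.5), (6.0, 0.5)],
    [(9.0, 1.0)],
    [(10.0, 0.25), (12.0, 0.75)],
]
query = [(2.0, 0.5), (3.0, 0.5)]
index = CentroidIndex(database)
neighbours, evaluated = index.knn(query, k=2)
print(neighbours, "EMD evaluations:", evaluated, "of", len(database))
```

The point of the design is that the cheap centroid comparison prunes candidates before the expensive EMD is computed, which is the role the abstract assigns to filtering cluster centers that cannot affect the query result. In higher dimensions the same bound holds for norm-based ground distances, but a single sorted key no longer suffices and a different index structure would be needed.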
Keywords/Search Tags: Web image retrieval, memory indexing, multi-modal, integration of clustering