The Research On Key Technology Of Text Retrieval In Web Image Retrieval Based On GPU

Posted on: 2012-05-11
Degree: Master
Type: Thesis
Country: China
Candidate: W Jiang
Full Text: PDF
GTID: 2218330362956557
Subject: Computer application technology
Abstract/Summary:
The aim of information retrieval is to find the information on the network that users are interested in, but data processing becomes more and more difficult with the explosive growth of network information. Web image search is one application of information retrieval; its purpose is to extract images from web pages and index them for users to search. The main problems of web image search are retrieval precision and a slow data update cycle. Several methods address precision, such as improving the precision of the textual semantic information of images, content-based image search, classification search, and interactive search. To shorten the data update cycle, researchers use cluster computing or a newer kind of hardware, the GPU.

This thesis designs a Content-Based Image Retrieval (CBIR) system based on a CPU-GPU cluster. The CBIR system includes a web crawler module, an information extraction module, an image processing module, a text indexing module, a feature clustering module, a memory index module, and a user feedback module, and it uses a central database server to handle communication between modules.

The thesis focuses on a GPU-based method for extracting the text semantics of images from web pages. The method uses the attributes of web page tags and local visual features to extract the semi-structured data of web images. Particular attention is given to the mode of data processing on the GPU. Because the GPU does not support dynamic data allocation, the method pre-allocates space in device memory and constructs the hierarchical data there; considering the SIMT feature of the GPU, it implements an adaptive thread configuration that matches the amount of data and the space the data occupies in device memory. To improve performance, a string processing library is implemented using GPU parallel data structures such as char4.

The thesis also implements the sort and search algorithms for text indexing. The sorting process is divided into several stages, which are parallelized as a pipeline. The data set is divided into many sequences on the CPU by quicksort, and these sequences are then sorted in the shared memory of the GPU.

Tests show that the text description of the web images is more complete and accurate. Function and performance are greatly improved relative to the original system. Because the key text indexing algorithms were replaced by new ones, the efficiency of text indexing also improved.
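
The two-stage sort described in the abstract (sequences produced on the CPU by quicksort, each sequence then sorted on the GPU in shared memory) can be illustrated with a minimal CPU-only sketch. It is not the thesis implementation: the function names `split_by_quicksort`, `sort_sequence`, and `two_stage_sort` and the `max_len` parameter (standing in for how many keys fit in one block's shared memory) are assumptions made for illustration, and Python's built-in `sorted` stands in for the per-sequence GPU sort.

```python
def split_by_quicksort(keys, max_len):
    """Quicksort-style partitioning that stops early.

    Returns a list of sequences ordered so that every key in one sequence
    is <= every key in the next. The sequences themselves are left unsorted.
    Each sequence has at most max_len keys, except runs of identical keys,
    which need no further sorting.
    """
    if len(keys) <= max_len:
        return [keys] if keys else []
    pivot = keys[len(keys) // 2]
    less = [k for k in keys if k < pivot]
    equal = [k for k in keys if k == pivot]
    greater = [k for k in keys if k > pivot]
    return (split_by_quicksort(less, max_len)
            + [equal]
            + split_by_quicksort(greater, max_len))


def sort_sequence(seq):
    """Stand-in for the per-sequence sort (assumed to run in GPU shared memory)."""
    return sorted(seq)


def two_stage_sort(keys, max_len=4):
    """Stage 1: partition on the CPU. Stage 2: sort each sequence independently."""
    result = []
    for seq in split_by_quicksort(list(keys), max_len):
        # The sequences are independent, so this loop is the part that
        # could be handed to parallel workers.
        result.extend(sort_sequence(seq))
    return result
```

Because the first stage already orders the sequences relative to each other, concatenating the independently sorted sequences yields a fully sorted result without a final merge step.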
Keywords/Search Tags:Web image retrieval, GPGPU, information extract, sort, search