A Cloud-Based Large-Scale Offline Process Of Image Retrieval System

Posted on: 2014-01-21
Degree: Master
Type: Thesis
Country: China
Candidate: H Chen
Full Text: PDF
GTID: 2248330398971550
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, multimedia data on the network is growing exponentially, with image and video data increasing particularly fast. Multimedia-based applications place higher demands on data processing and storage capacity, and traditional data processing methods cannot meet the requirements of processing such vast amounts of data. As an important part of multimedia applications, Content-Based Image Retrieval faces the same problem. At present, companies such as Google and Baidu use recently emerged cloud computing technology to handle massive image data. Applying traditional content-based image processing methods to massive image data sets raises several issues: processing time is very long, the correctness of the results is hard to guarantee, and the system is difficult to manage and maintain. To address these issues, this thesis uses cloud computing technology to build a cloud-based image retrieval system. The main research work is as follows:

Firstly, a large-scale cloud-based offline image retrieval processing system is implemented. The system consists of four processing modules: image data storage, image feature extraction, image feature clustering, and image visual word quantization. All modules are implemented on the Hadoop platform, and all data in the system are processed in parallel, which improves processing efficiency. As a cloud-based processing system, it also scales with image data size and hardware configuration. As evidence of its effectiveness, the system has been run on a data set of 6.5 million shopping images.

Secondly, two real-time retrieval algorithms based on image color features are proposed. The two algorithms exploit the characteristics of image retrieval and the properties of color data, respectively, to reduce the scale of the data to be processed, which increases matching speed. Both algorithms achieve real-time retrieval on 6.5 million real feature records and 10 million simulated feature records.
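
The visual word quantization step named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it is a plain-Python approximation of a map/reduce-style job, assuming a codebook of cluster centers has already been produced by the clustering module. All names here (`nearest_word`, `map_quantize`, `reduce_histograms`) are hypothetical, and no Hadoop API is used.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Iterator, List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(feature: Vector, codebook: List[Vector]) -> int:
    """Return the index of the cluster center (visual word) closest to `feature`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(feature, codebook[i]))


def map_quantize(image_id: str, features: Iterable[Vector],
                 codebook: List[Vector]) -> Iterator[Tuple[str, int]]:
    """Map step (assumed design): emit one (image_id, visual_word) pair per feature."""
    for feature in features:
        yield image_id, nearest_word(feature, codebook)


def reduce_histograms(pairs: Iterable[Tuple[str, int]]) -> Dict[str, Dict[int, int]]:
    """Reduce step (assumed design): count visual words per image."""
    histograms: Dict[str, Counter] = defaultdict(Counter)
    for image_id, word in pairs:
        histograms[image_id][word] += 1
    return {image_id: dict(counts) for image_id, counts in histograms.items()}


if __name__ == "__main__":
    # Toy data: a 2-word codebook and two images with 2-D features.
    codebook = [(0.0, 0.0), (10.0, 10.0)]
    images = {
        "img_a": [(1.0, 0.5), (9.0, 11.0), (0.2, 0.1)],
        "img_b": [(10.5, 9.5)],
    }
    pairs = [pair
             for image_id, feats in images.items()
             for pair in map_quantize(image_id, feats, codebook)]
    print(reduce_histograms(pairs))
```

Because each feature is quantized independently of the others, the map step can be split across machines without coordination, which is the property that makes this stage a natural fit for parallel processing.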
Keywords/Search Tags: Hadoop-based image processing, Feature extraction, Feature clustering, Feature quantization, Retrieval matching