Research On Image Retrieval Algorithm Based On Hadoop

Posted on: 2017-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y D Shi
Full Text: PDF
GTID: 2348330488487670
Subject: Communication and Information System
Abstract/Summary:
With the continuous development of the Internet and the spread of Internet services that make people's lives more convenient, user data is growing rapidly. Images, being concrete and easy to understand, have become the most commonly used carrier of multimedia data in daily life, and as the amount of data increases, the number of images grows rapidly with it. Finding images with similar characteristics or content in massive image collections through image matching has become increasingly practical, and image retrieval technology has emerged in response. As the technology has developed, retrieving images with both high efficiency and high accuracy has become a hot and difficult problem in the field, and image retrieval over massive data is attracting more and more attention. The main research content of this thesis is the use of Hadoop, an open source distributed computing platform, to perform image retrieval over massive image data.

Based on image retrieval technology and the Hadoop platform, this thesis mainly studies the storage of massive image data in the Hadoop distributed file system and the implementation of content-based image retrieval in the MapReduce programming model. The Hadoop distributed file system is the data storage and management system of the Hadoop platform. It can store massive image data and manage that data effectively. MapReduce is the programming model of Hadoop. It can be used to implement the image retrieval algorithm, complete the image retrieval task, and provide distributed computation.

For massive data storage, the Hadoop distributed file system and related Hadoop technologies have been used to store the massive image data.
However, the image data used in the experiments consists of small files, and Hadoop does not perform efficiently when handling small files. Based on the idea of the sequential file, this thesis puts forward a storage method for massive small image data.

For the content-based image retrieval implementation, the MapReduce programming framework has been used to implement SIFT feature extraction on the Hadoop platform, as well as feature clustering and quantization. The SIFT feature points extracted from the images are grouped by a clustering algorithm into a fixed number of classes. After processing by the feature quantization algorithm, each image can be described by a feature vector of fixed dimension. Finally, Euclidean distance is used to measure the similarity between the feature vector of the query image and the feature vectors of the stored images, and the target image is obtained.

Image retrieval technology has broad prospects and can be applied in many fields, and it underlies the popular technology of gesture recognition. Studying image retrieval over massive data is therefore significant, and applying the Hadoop platform to image retrieval provides one way to solve the problem.
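The quantization and matching steps described in the abstract can be illustrated with a small single-machine sketch. It is not the thesis's MapReduce implementation: it assumes the cluster centers have already been computed by some clustering step (the abstract does not name the algorithm), it uses toy 2-dimensional descriptors in place of real SIFT descriptors, and the histogram normalization is an added assumption. All function and variable names are hypothetical.

```python
import math

def nearest_center(descriptor, centers):
    """Index of the cluster center closest to one descriptor (squared Euclidean)."""
    return min(
        range(len(centers)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, centers[k])),
    )

def quantize(descriptors, centers):
    """Map any number of descriptors to one fixed-length histogram.

    The histogram has one bin per cluster center; dividing by the total
    (an assumption, not stated in the abstract) makes images with different
    numbers of feature points comparable.
    """
    hist = [0.0] * len(centers)
    for d in descriptors:
        hist[nearest_center(d, centers)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def rank(query_vec, database):
    """Image ids ordered from most to least similar to the query."""
    return sorted(database, key=lambda name: euclidean(query_vec, database[name]))

# Toy data: two precomputed cluster centers and two stored images.
CENTERS = [(0.0, 0.0), (10.0, 10.0)]
DATABASE = {
    "a": quantize([(0, 1), (1, 0), (9, 9)], CENTERS),
    "b": quantize([(10, 9), (9, 10), (11, 10)], CENTERS),
}
QUERY = quantize([(0, 0), (1, 1), (10, 10)], CENTERS)
RESULT = rank(QUERY, DATABASE)
```

In a MapReduce setting, the per-image `quantize` call is the kind of work that could run independently in map tasks, since each image's histogram depends only on its own descriptors and the shared centers. How the thesis actually divides the work between map and reduce is not stated in the abstract.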
Keywords/Search Tags: Image Retrieval, Hadoop, Distributed Computing