Cross-media Data Retrieval Based On Hadoop

Posted on: 2017-07-22 | Degree: Master | Type: Thesis
Country: China | Candidate: W C Cheng | Full Text: PDF
GTID: 2348330518495536 | Subject: Mathematics
Abstract/Summary:
With the development of the Internet, online data has become both diverse and massive: the volume is huge and the types of data vary widely. How to exploit the relationships between different kinds of data, and how to retrieve relevant items quickly and accurately from a huge body of multimedia information, has become an active research area in information retrieval. This thesis studies both text-based information retrieval and content-based image retrieval (CBIR). The main work is as follows:
1. Find the relationship between an image and its associated text in order to label the image. The clustering algorithm and the text feature vector computation are studied and parallelized with the MapReduce framework. Tests on a public dataset show that the parallel algorithm reduces running time and improves precision.
2. Study CBIR on the Hadoop platform. The image feature extraction process and the image retrieval process are parallelized. An index over the image features is built with E2LSH and stored in HBase. Tests on the Corel image database show that retrieval time decreases while precision remains unchanged.
3. Build a Hadoop platform from idle machines in the laboratory, then implement and test the algorithms on it. The system runs stably.
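To illustrate the kind of index that item 2 refers to, the following is a minimal sketch of E2LSH-style hashing (p-stable locality-sensitive hashing for Euclidean distance). It is not the thesis's implementation: the class name `E2LSHIndex`, the parameters `num_tables`, `hashes_per_table` and `bucket_width`, and their default values are all assumptions chosen for illustration, and an in-memory dictionary stands in for the HBase table that the thesis uses for storage.

```python
import math
import random
from collections import defaultdict


class E2LSHIndex:
    """Toy in-memory p-stable LSH index for Euclidean distance (illustrative only)."""

    def __init__(self, dim, num_tables=8, hashes_per_table=4, bucket_width=4.0, seed=0):
        rng = random.Random(seed)
        self.w = bucket_width
        self.vectors = {}
        self.tables = []
        # Each table uses k hash functions h(v) = floor((a . v + b) / w),
        # with every coordinate of a drawn from N(0, 1) and b uniform in [0, w).
        for _ in range(num_tables):
            funcs = [
                ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, bucket_width))
                for _ in range(hashes_per_table)
            ]
            # A dict of buckets stands in for a persistent store such as HBase.
            self.tables.append((funcs, defaultdict(list)))

    def _key(self, funcs, v):
        # Concatenate the k hash values into one bucket key.
        return tuple(
            math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / self.w)
            for a, b in funcs
        )

    def add(self, item_id, v):
        self.vectors[item_id] = v
        for funcs, buckets in self.tables:
            buckets[self._key(funcs, v)].append(item_id)

    def query(self, q, top_n=5):
        # Collect every item that shares a bucket with q in at least one table,
        # then rank only those candidates by exact Euclidean distance.
        candidates = set()
        for funcs, buckets in self.tables:
            candidates.update(buckets.get(self._key(funcs, q), []))
        return sorted(candidates, key=lambda i: math.dist(q, self.vectors[i]))[:top_n]


if __name__ == "__main__":
    index = E2LSHIndex(dim=3)
    index.add("img1", [1.0, 2.0, 3.0])
    index.add("img2", [50.0, -40.0, 10.0])
    print(index.query([1.0, 2.0, 3.0], top_n=1))
```

Because a query vector identical to a stored vector hashes to the same bucket in every table, that stored item is always among the candidates. For nearby but non-identical vectors the match is probabilistic, which is why several tables are used; the exact-distance ranking is applied only to the candidate set, not to the whole collection.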
Keywords/Search Tags: distributed, Hadoop, CBIR, E2LSH, topic detection