
The Research Of Text-based Image Retrieval Technology In Uyghur Kazak Kirgiz Search Engine

Posted on: 2012-06-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y M Y M M T Re
Full Text: PDF
GTID: 2218330335986248
Subject: Computer application technology
Abstract/Summary:
This thesis discusses the research and implementation of text-based image retrieval in the Uyghur, Kazak, Kirgiz search engine. By analyzing search engine technologies such as web crawling, relevance ranking, information extraction, and indexing, it proposes an initial system design and implements the text-based image search module of the Uyghur, Kazak, Kirgiz search engine.

First, taking Uyghur web pages as an example, the thesis studies how to extract the text information related to an image from an HTML document, so that text-based image retrieval in the search engine is both efficient and accurate. Several key techniques used in the system design are analyzed on the basis of experiments with real data.

Second, the thesis carefully analyzes the HTML components that can supply descriptive text for an image, including the <img> tag, the <a> tag, the title of the web page, the anchor text, the URL of the image, the <meta> tag, the <table> tag, and the text surrounding the <img> tag. Two specific extraction methods are introduced: a DOM-based approach and a string-based approach. The thesis also analyzes how to filter out useless images and examines some of the latent rules of HTML usage.

Finally, the thesis describes the implementation of the text-based web image retrieval process in the Uyghur, Kazak, Kirgiz search engine. It gives the overall structure of the system and describes each stage in detail: fetching web pages, extracting information, fetching images and checking for dead links, generating thumbnails, indexing, and providing search access.
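As an illustration of the kind of extraction the abstract describes, the following is a minimal sketch that collects candidate descriptive text for each image from the <img> tag attributes, the image URL, the enclosing <a> anchor text, and the page title. It is not the thesis's implementation: it uses the event-based parser in Python's standard library rather than a full DOM, and the names ImageTextExtractor and extract_image_text, as well as the record fields, are hypothetical.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ImageTextExtractor(HTMLParser):
    """Collects text associated with each <img> (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__(convert_charrefs=True)
        self.base_url = base_url
        self.page_title = ""
        self.images = []            # one dict per image found
        self._in_title = False
        self._in_anchor = False
        self._anchor_text = []      # text pieces inside the current <a>
        self._anchor_images = []    # images inside the current <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            self._in_anchor = True
            self._anchor_text = []
            self._anchor_images = []
        elif tag == "img" and attrs.get("src"):
            url = urljoin(self.base_url, attrs["src"])
            # The file name in the URL often carries a hint about the image.
            name = os.path.splitext(os.path.basename(urlparse(url).path))[0]
            record = {
                "url": url,
                "alt": attrs.get("alt") or "",
                "title_attr": attrs.get("title") or "",
                "file_name": name,
                "anchor_text": "",
            }
            self.images.append(record)
            if self._in_anchor:
                self._anchor_images.append(record)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._in_anchor:
            # Attach the normalized anchor text to images inside this link.
            text = " ".join("".join(self._anchor_text).split())
            for record in self._anchor_images:
                record["anchor_text"] = text
            self._in_anchor = False
            self._anchor_images = []

    def handle_data(self, data):
        if self._in_title:
            self.page_title += data
        elif self._in_anchor:
            self._anchor_text.append(data)


def extract_image_text(html, base_url):
    """Return a list of records, one per image, with its related text."""
    parser = ImageTextExtractor(base_url)
    parser.feed(html)
    parser.close()
    title = " ".join(parser.page_title.split())
    for record in parser.images:
        record["page_title"] = title
    return parser.images
```

Each returned record could then be passed to an indexer. Filtering of useless images (for example spacers or icons) and extraction of the text surrounding the <img> tag are not covered by this sketch.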
Keywords/Search Tags:image retrieval, information extraction, HTML tags, related texts