
Research on a Text-Based Image Search Engine

Posted on: 2009-05-04
Degree: Master
Type: Thesis
Country: China
Candidate: D S Xie
Full Text: PDF
GTID: 2178360242979355
Subject: Software engineering
Abstract/Summary:
As the number of images on the Web grows day by day, the demand for image retrieval grows stronger. At present, text-based Web image retrieval is the primary means of finding images on the Web. Although several Web image search engines already apply this technology, it is still far from perfect. First, for lack of sufficient understanding of the complexity of the Web, the text relevant to an image is taken from HTML pages according to a fixed pattern. Because only a small part of the images on the Web conform to this fixed pattern, the relevant text acquired from HTML pages is of limited quality. Second, the term-weighting scheme is rather coarse, because some factors that influence the weight of a term are not taken into account. Third, indexing based on terms and querying through word matching result in rather severe problems of synonymy and polysemy.

This thesis discusses a series of techniques related to Web image search engines, such as crawling, relevance ranking (VSM and LSI), information extraction, and indexing. These techniques are used in our system design. The thesis concentrates on how to extract information relevant to images from HTML documents more effectively and precisely. Based on experiments and analysis on real data, several key techniques are proposed, together with a text-based Web image search engine. The overall structure of the system and the relations among its components are also described, and the function and implementation of some components are detailed. Finally, a simple evaluation of search quality and performance is given.
Keywords/Search Tags:Web image search engine, text-based, content-based, information extraction
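
The abstract names three steps that a short example can make concrete: taking the text relevant to an image from an HTML page, weighting the terms, and ranking images by matching query words against those terms. The following is only a minimal sketch under stated assumptions, not the system described in the thesis. The choice of text sources (`alt`, `title`, file name, page title), the values in `FIELD_WEIGHTS`, and all function and class names are hypothetical illustrations; the abstract does not say which sources or weighting factors the thesis uses. Ranking here is plain cosine similarity over weighted term vectors in the spirit of VSM; LSI is not implemented.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical weights per text source; the thesis does not state its values.
FIELD_WEIGHTS = {"alt": 3.0, "title": 2.0, "filename": 2.0, "page_title": 1.0}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ImageTextExtractor(HTMLParser):
    """Collects, for each <img>, the text found in a few assumed sources."""

    def __init__(self):
        super().__init__()
        self.page_title = ""
        self._in_title = False
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "img":
            a = dict(attrs)
            src = a.get("src") or ""
            # File name without directory and extension, e.g. "red_panda".
            filename = src.rsplit("/", 1)[-1].rsplit(".", 1)[0]
            self.images.append({
                "src": src,
                "alt": a.get("alt") or "",
                "title": a.get("title") or "",
                "filename": filename,
            })

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.page_title += data


def weighted_terms(html):
    """Return {image src: Counter(term -> weight)} for one HTML page."""
    parser = ImageTextExtractor()
    parser.feed(html)
    parser.close()
    vectors = {}
    for img in parser.images:
        fields = dict(img, page_title=parser.page_title)
        weights = Counter()
        for field, w in FIELD_WEIGHTS.items():
            for term in tokenize(fields[field]):
                weights[term] += w  # a term gains weight from every source it occurs in
        vectors[img["src"]] = weights
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def search(query, vectors):
    """Rank image sources by similarity to the query; drop non-matches."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, v), src) for src, v in vectors.items()]
    return [src for score, src in sorted(scored, reverse=True) if score > 0]
```

Because the match is on exact terms, a query such as "bear" would not find an image described only as "panda". This is the synonymy problem the abstract points out for term-based indexing, and it is the kind of gap that LSI-style ranking is meant to narrow.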