Web Image Retrieval Based on Page Segmentation and Hyperlink Analysis

Posted on: 2010-07-17
Degree: Master
Type: Thesis
Country: China
Candidate: K Y Guo
Full Text: PDF
GTID: 2178360275974980
Subject: Computer software and theory
Abstract/Summary:
The 21st century is the century of the network, which has become fully integrated into people's study, work, and life. With the rapid development of Internet technology, the Web has become an important way for people to access information. Image resources on the Web are growing ever richer, and demand for Web image retrieval is growing with them.
Most mature commercial image search engines today are based on text information retrieval, mainly because of constraints on system performance and requirements such as large user bases, intensive query loads, and short response times. Much current research focuses on content-based image retrieval, and many algorithms and models have been proposed; however, they largely remain at the laboratory stage, far from commercial application. Research on improving the performance of text-based image retrieval, especially on using link analysis and text analysis to obtain the semantics of images, is therefore very valuable.
Text-based search technology is already very mature, and the knowledge and tools accumulated over many years of practice and improvement should be drawn on and reused in image retrieval. In text-based image retrieval, the difficulty lies in determining the relationship between an image and its text. A Web image is embedded in a page and surrounded by a great deal of useful, relevant text, which plays an important role in expressing the image's semantic attributes. Image retrieval based on link analysis, however, is still immature and at an early stage of development. How to combine link relationships with the text around images to realize Web image retrieval is therefore of great importance.
In addition, a web page typically contains a number of semantic blocks that differ in importance, so performing link analysis at the block level is more reasonable and yields higher semantic relevance.
This thesis studies Web image retrieval from three angles: basic retrieval theory, semantic segmentation of web pages, and link analysis. The main work is as follows:
Firstly, we study the traditional theories and methods of image retrieval. Starting from the basics of information retrieval, we focus on the concept and architecture of image retrieval and on the classes of image retrieval methods and their respective characteristics. We analyse the characteristics of the Web image environment, study the characteristics of web pages and the images in them, and compare the major current Web image retrieval methods.
Secondly, we analyse the organizational structure of HTML and propose a semantic page segmentation algorithm based on Web standards, which is an important foundation for the Web image retrieval work.
Thirdly, we construct a block-based graph model to study the relationships among pages, blocks, and images, and perform block-level link analysis on web pages to obtain semantic relevance.
Finally, we study a framework model for Web image retrieval based on page segmentation and link analysis and complete a prototype system, aiming to improve the accuracy of Web image retrieval.
Keywords/Search Tags: Web Standard, Web Page Segmentation, Link Analysis, Web Image Retrieval