
A Study On Models And Methods Of Visual Search Engine

Posted on: 2011-09-30, Degree: Doctor, Type: Dissertation
Country: China, Candidate: M G He, Full Text: PDF
GTID: 1228330332482970, Subject: Information Science
Abstract/Summary:
As the volume of web resources grows rapidly, multimedia resources such as photos, audio, and video have become popular with users. To help users browse and exploit search results, classical search engines need to improve their accuracy and provide new models for presenting large result sets. Search engines also need new models for processing multimedia resources. These pressures have pushed search engines to diversify, and this stage of development is characterized by intelligence, personalization, clustering, and similar features. Typical examples include meta search engines, clustering search engines, and vertical search engines. Visualization has natural advantages: it helps users manage, analyze, control, and understand large amounts of information. Implementing visualization in search engines is therefore an important aspect of their diversified development.

Research on visual search engines is still at an early stage. Related studies mainly address visual search, such as the visualization of queries and of search results, and touch on visual search engines only in a scattered way. There has been no systematic study of a visual search engine model or of an integral framework for visual search engines. This paper studies the integral framework of the visual search engine, analyzes its critical problems, and then verifies the resulting theories and technologies.

There are six main parts in this paper.

The first section covers the basic theories and technologies of the visual search engine, mainly the basic theories of search engines and of visualization. It is the foundation for the following sections, and some of these technologies are used frequently later on. The main search engine theories include retrieval models such as the Boolean model, the vector space model, the probabilistic model, and the inference network model; ranking algorithms such as PageRank and HITS; and clustering and incremental clustering algorithms.
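As an illustration of the link-based ranking algorithms named above, the following is a minimal sketch of a PageRank-style power iteration. It is not taken from the dissertation: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the toy link graph are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Return a rank per page; `links` maps each page to the pages it links to."""
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if targets:
                # Each page passes an equal share of its rank along every out-link.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web used only for illustration.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
```

In this toy graph, page `c` collects links from three pages and ends up with the highest rank, while `d`, which nothing links to, keeps only the baseline share.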
The main theories of information visualization include its concepts, classification, and models.

The second section presents the model and framework of the visual search engine, which serves as the integral framework for the following sections. First, the paper analyzes the shortcomings of classical search engines: they fail to parse the visual characteristics of Web resources, browsing a huge result set increases the burden on the user, and they lack an interactive search environment. The principles for constructing the framework include user-centered design, universality, modularization, and scalability. From a functional point of view, a visual search engine consists of a spider module, an index module, a retrieval module, and a user interface module. The paper then analyzes the process of the framework, the layers of visual application, and the objects of visualization. The key points of a visual search engine include extracting Web resources by visual methods, constructing the model, designing the user interface, and integrating multimedia resources. The technology framework is also part of the visual search engine framework and mainly concerns visualization construction and implementation technologies.

The third section is about indexing resources in the visual search engine. A Web page is described by HTML code, and classical search engines generally index a page by parsing that code. What the user actually sees, however, is the visual page rendered by the browser, which reflects the author's intent accurately. A search engine should therefore index Web pages using visual information in order to obtain good results. After analyzing the visual elements of a Web page, the paper mainly studies page layout, including the layout structure and segmentation methods, with particular attention to the visual segmentation method. The rank of each block is an important factor in extracting the content of a Web page.
After automatic word segmentation, each term is associated with the visual characteristics of its text, such as color, font, and size, and with the rank factor of its block, which yields a converted file that carries visual characteristics. For multimedia resources, the key point is content-based indexing; face recognition and voice recognition are the most important technologies.

The fourth section is about visual retrieval methods. Visual information retrieval has been studied for years, but this paper considers it in combination with the search engine, focusing on the visualization of the search interface and of search results. Interface visualization includes the visualization of queries, query by example, and so on; query by example is very important for multimedia retrieval. Interaction is the most important problem in interface visualization. For directory-based search engines, the visualization of hierarchical information is another important problem. Result visualization covers both macro information and micro information, which helps users find valuable information. The visualization methods include clustering methods, link analysis methods, and semantic content methods. The paper studies the three most important visualization methods for result attributes: visualization of clusters, visualization of relationships, and visualization of time series information.

The fifth section is about the historical data of search engines. There are two types of historical data: Web page snapshots and search logs. Mining these data has many advantages: users can get more information, and the search engine can improve its search model, especially its ranking algorithm. Users can track the changes to a page by comparing snapshots. This paper gives a method for storing snapshots and for visualizing them. Studies of search log mining cover term frequency, geographic logs, and log sessions, as well as visualization models for the mining results.

The sixth section presents case studies.
This paper chooses Google and Wolfram|Alpha as the objects of analysis. The former is a famous search engine; the latter is a new one that went online in May 2009. The conclusion of the Google case is that Google uses many visualization technologies and that the study of visualization at Google is on an upward trend. Wolfram|Alpha is based on a huge library of visualization models and gives a single final answer, presented through a visualization model, instead of many results.

Visualization technologies are used in search engines more and more, and related studies are increasing as well. A visual search engine is an integrated product of many kinds of technologies, and its effectiveness depends on the relevant technologies and methods. Following this study of the model and framework of the visual search engine, the next step is to study the relevant technologies and methods in depth.
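The indexing idea summarized for the third section, in which terms are weighted by the visual characteristics of their text and by the rank of their layout block, can be sketched roughly as follows. This is a hypothetical illustration, not the dissertation's method: the `TextRun` fields, the 16-pixel base size, the 1.5 bold multiplier, and the multiplicative formula are assumptions. Color is left out for brevity, and whitespace splitting stands in for automatic word segmentation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TextRun:
    text: str
    font_size: float   # rendered size in pixels
    bold: bool
    block_rank: float  # assumed importance of the enclosing layout block, in (0, 1]


BASE_FONT_SIZE = 16.0  # assumed size of ordinary body text


def visual_weight(run):
    """Combine the visual cues of a text run into one multiplicative weight."""
    size_factor = run.font_size / BASE_FONT_SIZE
    bold_factor = 1.5 if run.bold else 1.0
    return size_factor * bold_factor * run.block_rank


def weighted_term_scores(runs):
    """Accumulate a visually weighted score for each term on the page."""
    scores = defaultdict(float)
    for run in runs:
        weight = visual_weight(run)
        # Whitespace splitting is a placeholder for a real word segmenter.
        for term in run.text.lower().split():
            scores[term] += weight
    return dict(scores)


# Hypothetical rendered page: a heading, a body block, and a low-ranked sidebar.
page = [
    TextRun("Visual search engine", 32.0, True, 1.0),
    TextRun("search results shown as clusters", 16.0, False, 0.8),
    TextRun("search advertising links", 12.0, False, 0.2),
]
scores = weighted_term_scores(page)
```

Under these assumed weights, terms from the large bold heading dominate the index, while terms that appear only in the low-ranked sidebar block contribute very little.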
Keywords/Search Tags: visual search engine, visual indexing, visual retrieval, visualization of historical search logs