Distributed Improvements To The IRST-Based Search Engine

Posted on:2009-07-05

Degree:Master

Type:Thesis

Country:China

Candidate:H He

Full Text:PDF

GTID:2208360272459188

Subject:Computer software and theory

Abstract/Summary:

With the development of internet applications, network applications and services have become increasingly common in software systems. Diverse network environments, together with many types of applications and services, make up distributed systems of various forms. How to let internet applications and services communicate with each other, and how to let client systems discover and invoke them in a unified, standard way, has become a practical and important topic. Web Services, proposed by international standards organizations and backed by a series of related network standards, address this problem. The search engine, as one of the most important network application services, should offer a distributed invocation method that other client applications can use conveniently. The search engine based on the IRST (Inter-relevant Successive Tree) was implemented as a stand-alone application that runs only on a single machine and cannot be invoked in a distributed way. This thesis describes how the original search engine was improved into a distributed system using Web Services technology.
As the CPU manufacturing industry has developed, fabrication technology has run into physical limits and the traditional Moore's Law no longer holds. CPU frequency can hardly be increased further, so CPU manufacturers now focus on multi-core designs, and it is no longer practical to expect better performance from higher clock frequencies. Distributed computing offers an alternative: its most important feature is that an application runs in parallel on a computer cluster composed of many individual nodes (single-core or multi-core). This computing method is especially suitable for large-scale data processing such as search engine indexing.
In this thesis, we use the MapReduce distributed computing framework to improve the indexing of the IRST-based search engine, so that the indexing process is carried out in parallel on a computer cluster. As a result, the time required for indexing can be reduced substantially.
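The map, shuffle, and reduce phases that such a framework applies to indexing can be sketched in a few lines. The abstract does not give the IRST construction itself, so this sketch assumes a plain inverted index (term to document ids) as a stand-in. The names `map_document`, `shuffle`, `reduce_term`, `build_index`, and the sample `DOCS` are hypothetical, and a local thread pool merely stands in for cluster nodes to show the structure, not the speedup.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_document(doc):
    """Map phase: emit one (term, doc_id) pair per distinct term in a document."""
    doc_id, text = doc
    return [(term, doc_id) for term in sorted(set(text.lower().split()))]


def shuffle(mapped_outputs):
    """Shuffle phase: group the emitted document ids by term."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for term, doc_id in pairs:
            groups[term].append(doc_id)
    return groups


def reduce_term(term, doc_ids):
    """Reduce phase: build the sorted posting list for a single term."""
    return term, sorted(set(doc_ids))


def build_index(docs, workers=4):
    """Run the three phases; the thread pool is an assumed stand-in for a cluster."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_document, docs))
    groups = shuffle(mapped)
    return dict(reduce_term(term, ids) for term, ids in groups.items())


# Hypothetical sample corpus: (doc_id, text) pairs.
DOCS = [
    (1, "distributed search engine"),
    (2, "search engine index"),
    (3, "distributed index"),
]

index = build_index(DOCS)
```

Because each document is mapped independently and each term is reduced independently, both phases can be spread across nodes, which is the property that lets a real MapReduce framework shorten indexing on a cluster.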

Keywords/Search Tags:

Search Engine, Distributed System, Web Services, Inter-relevant Successive Tree, Distributed Computing, MapReduce

Related items

1	The Key Technologies Of Search Engines And Implementation
2	The Research And Implementation Of Distributed Search Engine Based On MapReduce
3	A Mathematical Expression Retrieval Model Based On Inter-relevant Successive Tree
4	Research And Implementation Of Distributed Web Crawl Based On Hadoop Architecture
5	The Design And Implementation Of Distributed Search Engine Based On MPI
6	A Similar Image Search Engine Based On Millions Of Images And Distributed Computing
7	The Research And Application Of Search Engine Based On Hadoop
8	The Design And Implementation Of Distributed Rules System Based On MapReduce
9	Design And Implementation Of Focused Search Engine In Hadoop Platform
10	Key Technology Study On The Cloud Computing Platform In The Field Of Search Engine