
Architecture And Optimization Of Enterprise Search Engine Based On Web Service

Posted on: 2009-08-30
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Wu
GTID: 2178360242480314
Subject: Computer application technology
Abstract/Summary:
Most existing search engines are hard-coded and lack unified standards, so their application code is complex and the systems are costly to maintain and update. High development cost and low reusability cause enterprises considerable trouble. After years of development, Web Service technology has matured. Its good encapsulation, loose coupling, use of standard protocols, and strong integration capability make it well suited to large-scale system architectures.

In response to these issues, this thesis presents an enterprise search engine architecture based on Web Service, in which the crawling and search functions are packaged as Web services. Enterprises can then call these services from their various systems to build their own search engines instead of developing a new one from scratch. This model greatly improves the reusability of search engine software.

A Web service is a technology that allows applications to communicate with each other in a platform- and programming-language-independent manner. It is a software interface that describes a collection of operations that can be accessed over the network through standardized XML messaging. It uses XML-based protocols to describe an operation to execute or data to exchange with another Web service. A group of Web services interacting in this manner defines a particular Web service application in a Service-Oriented Architecture (SOA).

Integrating software applications across multiple operating systems, programming languages, and hardware platforms cannot be solved by any single proprietary environment. Traditionally, the problem has been one of tight coupling: an application that makes a remote call over the network is tied strongly to the callee by the function it calls and the parameters it requests.
In most systems before Web services, this is a fixed interface with little flexibility or adaptability to changing environments or needs.

Web services use XML, which can describe any data in a platform-independent manner for exchange across systems, and so move applications toward loose coupling. Web services can also work at a more abstract level, where data types can be re-evaluated, modified, or handled dynamically on demand. On a technical level, then, Web services handle data more easily and let software communicate more freely.

With universally defined interfaces and well-designed tasks, it also becomes easier to reuse these tasks and therefore the applications they represent. Reusable application software gives a better return on investment because it produces more from the same resources. It lets business people consider using an existing application in a new way or offering it to a partner in a new way, potentially increasing business transactions between partners.

Google is a well-known search engine whose search mechanism is aimed at the general public. It exposes Web Service functions over standard Web protocols, so the service can be used from any network environment, which makes it more convenient to build customized, personalized search engines.

The search engine system uses the B/S (browser/server) model. Ajax technology makes the user interface as responsive as a desktop application, with DWR as the Ajax framework. The business logic layer uses the IoC container of the Spring framework to manage JavaBeans.

A search engine is generally made up of a crawler, an index store, a searcher, and a user interface.
The crawler downloads pages from the Web; the parser analyzes page content to be indexed; the indexer stores documents in the database in a format that is easy to search; the searcher handles user queries and computes which documents match; and the user interface provides a web page where the user can enter a query and customize the results, which are then returned to the browser.

When implementing the crawling and indexing functions, the thesis uses Java multithreading to run tasks in parallel, which improves the efficiency of the search engine. A thread pool manages the threads, optimizes the allocation of system resources, and reduces the time spent creating and destroying threads. A work queue provides communication between the crawling threads and the indexing threads.

Creating and destroying objects is time-consuming in object-oriented programming, because creating an object requires memory and other resources. This is especially true in Java, where the JVM tracks every object so that it can be garbage-collected once it is no longer in use. Reducing the number of object creations and destructions, particularly resource-intensive ones, is therefore a good way to improve service efficiency. The thesis implements a thread pool in Java that consists of four parts: a thread pool manager, work threads, a task interface, and a task queue, which work together to provide the pool's functionality. The crawler obtains threads from the pool, and wait() and notify() are used when tasks are allocated. This not only reduces the time spent creating and destroying threads but also keeps system resource usage under control.

In short, the thesis designs an enterprise search engine architecture based on Web Service and applies multithreading to improve the search engine's performance.
As a result, the enterprise search engine becomes more efficient and has better extensibility, better reusability, and a better user interface.
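
The four-part thread pool named in the abstract (pool manager, work thread, task interface, task queue, coordinated with wait() and notify()) might look roughly like the sketch below. This is not the thesis's code: the names Task, SimpleThreadPool, Worker, submit, and shutdownAndWait are hypothetical, and the shutdown behavior (drain the queue, then stop) is an assumption made so that the example terminates cleanly.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Task interface (hypothetical name): one unit of work, e.g. fetching or indexing one page. */
interface Task {
    void execute();
}

/** Pool manager (hypothetical name): owns the task queue and a fixed set of work threads. */
final class SimpleThreadPool {
    private final LinkedList<Task> queue = new LinkedList<>(); // task queue, guarded by its own monitor
    private final List<Worker> workers = new ArrayList<>();
    private boolean shutdown = false;                          // guarded by the queue's monitor

    SimpleThreadPool(int size) {
        for (int i = 0; i < size; i++) {
            Worker w = new Worker("pool-worker-" + i);
            workers.add(w);
            w.start();
        }
    }

    /** Adds a task to the queue and wakes one waiting work thread. */
    void submit(Task task) {
        synchronized (queue) {
            if (shutdown) {
                throw new IllegalStateException("pool is shut down");
            }
            queue.addLast(task);
            queue.notify();
        }
    }

    /** Assumed policy: stop accepting tasks, let workers drain the queue, then join them. */
    void shutdownAndWait() throws InterruptedException {
        synchronized (queue) {
            shutdown = true;
            queue.notifyAll();
        }
        for (Worker w : workers) {
            w.join();
        }
    }

    /** Work thread: reused for many tasks instead of being created once per task. */
    private final class Worker extends Thread {
        Worker(String name) {
            super(name);
        }

        @Override
        public void run() {
            while (true) {
                Task task;
                synchronized (queue) {
                    // Loop guards against spurious wakeups.
                    while (queue.isEmpty() && !shutdown) {
                        try {
                            queue.wait();
                        } catch (InterruptedException e) {
                            return;
                        }
                    }
                    if (queue.isEmpty()) {
                        return; // shut down and nothing left to run
                    }
                    task = queue.removeFirst();
                }
                try {
                    task.execute(); // run outside the lock so other workers can proceed
                } catch (RuntimeException e) {
                    // A failing task must not kill the reusable work thread.
                }
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger done = new AtomicInteger();
        SimpleThreadPool pool = new SimpleThreadPool(4);
        for (int i = 0; i < 20; i++) {
            pool.submit(done::incrementAndGet); // stand-in for a crawl or index task
        }
        pool.shutdownAndWait();
        System.out.println("tasks completed: " + done.get());
    }
}
```

In this sketch a crawler would call submit once per URL and the same few threads would serve every request, which is where the saving on thread creation and destruction comes from. The standard library's java.util.concurrent.ExecutorService offers the same pattern ready-made; the hand-written version is shown only because the abstract describes wait() and notify() explicitly.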
Keywords/Search Tags: Architecture