Research On Resource Management And Task Scheduling For Grid Computing In The Application Of Biological Sequence Alignment

Posted on: 2007-01-24
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhu
GTID: 2178360182495988
Subject: Computer system architecture
Abstract/Summary:
The computational grid is an information technology that has emerged rapidly in recent years. Its main objective is to combine the resources of the Internet into a single very large-scale computer system so that computing, storage, information, and knowledge resources can all be shared. Computational grids are increasingly used in many fields of the national economy and national defense. A computational grid system consists of various resources that change dynamically, are geographically dispersed, and run on heterogeneous systems. These characteristics hinder the development of grid applications to a certain extent and present challenges for applied basic research in grid computing. Resource management and scheduling is therefore an important research issue in the field of high-performance computational grids. Its objective is to solve the key scientific and technical problems of specifying, organizing, managing, and scheduling grid resources.

Combining theory with practice, this dissertation studies the key technologies of grid resource management and scheduling. The major research work and contributions of this dissertation are as follows:

After surveying current grid resource discovery and management mechanisms, this dissertation describes a new hierarchical model for resource discovery and management. It weighs the pros and cons of the existing mechanisms and links several register nodes in a tree structure to implement resource discovery and optimize search efficiency. The upper and lower subnets of the model stand in a tiered relationship, which makes the dynamic character of resources easier to control and manage and lets subnets join dynamically, giving the model better fault tolerance.
It uses a synchronous replication model between sibling nodes to manage resources, so that users who access the system through different entry points see a logically consistent view of the whole. Because the data stored in each of these nodes is the same, query efficiency improves greatly. At the lowest layer, where the grid systems are small, the model instead applies a centralized scheme, which offers high query efficiency.

In addition, two "pathways" are set up to collect dynamic resource information conveniently and to react quickly to changes in resource status. The static information of a computing node is saved only in the register nodes; the dynamic information is saved in each resource node. When the performance of a computing node is tested, its dynamic resource information can be obtained quickly and directly through the "dynamic information collection pathway". The "particular pathway" lets an ordinary computing node report resource information about itself, or about nodes at the same level, to the register node of its local virtual structure or even to the root register node, which speeds up the propagation of update messages. An upper-level register node can also use this pathway to query the resource information of an ordinary computing node directly and obtain first-hand, up-to-date information. This model provides a fast and effective resource discovery mechanism for grid task scheduling.

This dissertation proposes a modified particle swarm optimization algorithm (MPSO), which incorporates a mutation mechanism and a local search method, and applies it for the first time to the grid task scheduling problem. Grid task scheduling is an NP-hard problem, and many investigators have studied algorithms for it in depth. Several classical optimization algorithms have gradually been introduced into grid task scheduling; the most commonly used at present are the genetic algorithm and simulated annealing.
The particle swarm optimization algorithm (PSO) is a relatively new bio-inspired algorithm that has attracted wide attention because it offers global convergence similar to that of the genetic algorithm but converges much faster. It has been applied successfully to many practical engineering problems with good optimization results. MPSO overcomes the shortcoming of premature convergence without reducing convergence speed or search precision. We use MPSO to split a task automatically into several subtasks and consider communication and computation together in order to minimize the task completion time. That is, the goal is min{max ∑ Ci}, where ∑ Ci is the time to complete all tasks on the i-th computing resource. For the cost of each task we introduce the formula Ttask_i = Tinput + Texe + Toutput and propose a filtering technique to forecast the task runtime on each computing node more accurately: similar records are filtered from the runtime history and their weighted mean is taken as the forecast task runtime Texe.

Experiments show that MPSO is effective for the divisible grid task scheduling problem. In a comparison of MPSO with GA and PSO, both the plotted and the statistical results show that MPSO searches faster, converges more readily to the global optimum, and reaches the optimal value reliably. MPSO overcomes the shortcoming of premature convergence without reducing convergence speed or search precision, which greatly improves on the global search ability of PSO.

To apply the resource management model and the task scheduling algorithm, we construct a grid application for biological sequence alignment. In bioinformatics research, sequence alignment is the most important and classical research method: it looks for potential molecular evolutionary relationships by comparing the similar regions and conserved sites of two sequences.
With the development of information technology, gene sequence databases are growing exponentially, with a new entry added every ten seconds on average. Picking out the required information quickly while retrieving as much related data as possible from such a vast data resource is a complicated task. It is therefore necessary to connect the resources that are spread widely across the Internet into a logical whole that behaves like a supercomputer. Built as a high-throughput platform in a grid environment, such a system completes tasks cooperatively and provides integrated information and application services for biology researchers.

This dissertation designs and implements Biological Alignment Grid (BAGrid), a distributed computing platform for biological sequence alignment in a grid computing environment, based on Globus Toolkit 3. The BAGrid architecture is composed of a grid portal layer, a grid middle layer, an extended service layer, and grid nodes. The grid portal layer acts as a resource agent: it performs global task scheduling, discovers resources dynamically, and provides security authentication, task submission and monitoring, and user file management, so that users can access grid resources and services conveniently and directly through the Web. The grid middle layer provides the basic grid services, such as security management, the index service, task management, and resource management. The extended service layer uses grid services to wrap existing applications conveniently, such as the sequence alignment tool BLAST. The grid nodes are the computing resources and gene sequence database resources spread across different virtual domains.

This work is of considerable significance for the application of biological sequence alignment. ...
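
The cost model described in the abstract (Ttask_i = Tinput + Texe + Toutput, a runtime forecast from filtered history records, and the objective min{max ∑ Ci}) can be illustrated with a minimal Python sketch. The abstract does not say how "similar" records are selected or how they are weighted, so the similarity test on input size, the `tolerance` parameter, the weighting rule, and all names below (`RunRecord`, `forecast_texe`, `task_cost`, `makespan`) are assumptions made for illustration, not the thesis implementation.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    """One past run on a computing node (hypothetical fields)."""
    input_size: float  # e.g. query length or database size; must be > 0
    runtime: float     # observed execution time in seconds


def forecast_texe(history, input_size, tolerance=0.2):
    """Forecast Texe as the weighted mean of similar past runs.

    Assumption: a record counts as "similar" when its input size is
    within `tolerance` (relative) of the new task's input size, and
    records closer in size receive a larger weight.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for rec in history:
        rel_diff = abs(rec.input_size - input_size) / input_size
        if rel_diff <= tolerance:
            # Weight falls linearly from 1.0 (same size) to 0.5 (edge).
            weight = 1.0 - rel_diff / (2.0 * tolerance)
            weighted_sum += weight * rec.runtime
            weight_total += weight
    if weight_total == 0.0:
        return None  # no similar record; the caller needs a fallback
    return weighted_sum / weight_total


def task_cost(t_input, t_exe, t_output):
    """Ttask_i = Tinput + Texe + Toutput."""
    return t_input + t_exe + t_output


def makespan(assignment, costs):
    """Largest per-resource total cost, i.e. max over i of the sum Ci.

    assignment[k] is the resource index chosen for task k;
    costs[k][r] is the cost of task k on resource r.
    """
    load = {}
    for task, resource in enumerate(assignment):
        load[resource] = load.get(resource, 0.0) + costs[task][resource]
    return max(load.values())
```

A scheduler such as MPSO would search over candidate `assignment` vectors for the one with the smallest `makespan`; that search is not shown here.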
Keywords/Search Tags: Application