Genetic Algorithms: Optimizing The Predicted DNA Supercoil

Posted on: 2012-07-18
Degree: Master
Type: Thesis
Country: China
Candidate: Bolou Dickson Bolou B L
GTID: 2178330335950199
Subject: Computer applications
Abstract/Summary:
Living organisms strive to adapt to their environment. In the struggle to survive, weaker organisms die out while those with better fitness tend to survive, so each generation may show better qualities than the one before it.

The development of society through scientific invention is closely tied to problems of optimization. A large part of mathematical and scientific work over the past centuries dealt with optimization problems in which one had to obtain the derivative of a function to find its extrema. Solving such problems was neither easy nor pleasant, because it inevitably took a great deal of time. As an example, suppose some children are placed in a hall, one in each corner, and asked to find the coloured pebbles among many non-coloured ones on the floor, while one coloured pebble sits on a table raised above the floor. After sorting out the coloured pebbles on the floor, a child may claim to have reached the optimum solution of the problem, yet the pebble on the table was never seen. This method is efficient, but it offers no proof that the optimum solution has been found: each child's search was blocked at a local optimum. Such a method is therefore effective only in small search spaces.

A genetic algorithm (GA) is employed in this optimization project, which applies genetic algorithms to optimize a stretch of a predicted DNA supercoil. This has become an important field of study in recent years. Inputs are entered on the MATLAB platform with a given population size and a chosen number of generations, and each command produces a corresponding output. The aim is to optimize the longest run of successive zeros on the DNA stretch, which is made up of 1s and 0s arranged in 18 rows and 2421 columns.
After all the results are obtained, the best one, 23419, is chosen as the most optimized output.

In a GA, a chromosome is a set of parameters that defines a candidate solution to the problem the algorithm is solving; it is usually represented as a binary string of 0s and 1s. For example, suppose a GA must solve a problem whose solution lies between 0 and 271, and the solution is the chromosome 173. Written as an 8-digit chromosome, it is 10101101.

Every living thing is composed of cells; the cell is the fundamental unit of life, viruses aside. Human beings are multicellular, composed of billions of cells, while some other organisms are single-celled (unicellular). Cells carry information about an organism's makeup, stored in the genetic material called DNA, which is a long chain of molecules known as nucleotides.

In the 1950s, several groups of researchers set out to determine the structure of DNA. One group was led by Maurice Wilkins at King's College London, another by Francis Crick and James D. Watson in Cambridge, and a third by Linus Pauling at Caltech.

John Holland of the University of Michigan began researching genetic algorithms at the beginning of the 1960s. Holland had two purposes: (1) to improve the understanding of the natural adaptation process, and (2) to design artificial systems with properties similar to those of natural systems.
Holland's method is effective because he considered not only the role of mutation, which improves the algorithm, but also genetic recombination (crossover): recombining partial solutions improves the algorithm's ability to approach and eventually find the optimum solution.

A chromosome contains a single, very long DNA helix on which numerous genes are encoded. Bacteria store their genes on a single large circular chromosome or on small extra circles of DNA known as plasmids, which normally encode just a few genes and are easily transferred between individuals. The study of DNA is a rapidly expanding field of modern research that involves a wide range of challenges, such as DNA sequencing and analysis and the prediction of RNA structure. New scientific problems arise frequently, especially in biomedical applications, and most of them are relatively complex. Such problems are difficult to solve by ordinary simple methods. For this reason, the genetic algorithm becomes an indispensable and powerful optimization technique for scientific problems such as optimizing the DNA supercoil. The discovery of deoxyribonucleic acid (DNA) is a great scientific achievement in the field of genetics.

Fitness function: a specific objective function that measures the optimality of a solution (that is, a chromosome) in a GA, so that the chromosome can be ranked against all other chromosomes.

The basic form of a GA involves three types of operators: selection, crossover/recombination (single point), and mutation.

1 Selection: selects chromosomes from a population to reproduce offspring.
An important criterion for selection is that the fitter the chromosome, the more often it is likely to be selected for mating.

2 Crossover: varies the programming of chromosomes from one generation to the next. A crossover point is chosen at random; everything before this point is copied from the first parent and everything after it is copied from the second parent.

3 Mutation: preserves and introduces diversity. Mutation takes place after crossover and models a copying error.

DNA holds the genetic information that enables living things to function, reproduce, and develop. There is, however, no substantial evidence of ancient genetic systems, because recovering DNA from fossils is not always easy and is sometimes impossible. One common obstacle is that DNA survives in the environment for less than about a million years before it gradually degrades into short fragments in solution.

Bioinformatics has advanced rapidly as a science and engineering field since the discovery of DNA and RNA (ribonucleic acid). Computer software and ever-advancing computing tools have made tremendous contributions to human life. One such computer application is the genetic algorithm (GA), based on DNA sequencing and optimization. It is used to solve problems in electronics, biomedicine, computer science, economics, and complex engineering, as well as in criminal cases and family matters such as establishing the biological parentage of children. Bioinformatics is becoming more and more advanced, with sub-fields such as molecular biology, genetic engineering, and biochemistry.
This field of DNA research keeps developing with the continuing advancement of computer software.

The aim of this study is to optimize the Escherichia coli DNA supercoil. It applies genetic algorithms to a predicted Escherichia coli DNA supercoil stretch represented as an m × n array of bits: each of the 18 rows (lines) is a whole string of 2421 bits. The objective of optimizing this predicted DNA supercoil is to obtain a nearly best ordering with the longest consistent run of zeros. Knowledge of DNA sequencing, a computational modeling of a scientific problem used in optimization, leads the way to genetic algorithm application and DNA optimization. DNA is contained in chromosomes. A chromosome is an organized structure of DNA and protein found in the cells of organisms; it is made up of a single piece of coiled DNA (deoxyribonucleic acid) carrying many genes, nucleotide sequences, and some regulatory elements. The number of chromosomes varies between organisms. The DNA molecule of a chromosome can be linear.

A string of DNA supercoil is what is to be optimized using a MATLAB application. In a relaxed double-helix DNA segment, the two strands twist around the helical axis once every 10.4-10.5 base pairs. When a DNA molecule is closed, that is, when its two ends are joined to form a circle and then allowed to move freely, it contorts into a different shape, a figure-eight, called a supercoil. For each additional helical twist it accommodates, the lobes show one more rotation about their axis. In DNA topology, this global contortion into a figure-eight is known as writhe.
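As a reference point for the twist and writhe terminology used here, the standard topological relation for closed circular DNA can be written as follows. This is a textbook identity added for illustration, not a formula taken from the thesis; the symbols Lk (linking number), Tw (twist), Wr (writhe), N (length in base pairs), and h (base pairs per helical turn, about 10.5 as quoted above) are notation introduced here.

```latex
Lk = Tw + Wr, \qquad Lk_0 \approx \frac{N}{h}, \qquad \Delta Lk = Lk - Lk_0
```

Under this convention, a positive \Delta Lk corresponds to positive supercoiling and a negative \Delta Lk to negative supercoiling.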
Added twists lead to positive supercoiling, and subtracted twists lead to negative supercoiling.

DNA supercoiling is very useful for packaging DNA within cells, since the length of DNA can be thousands of times the size of a cell, which makes packaging this genetic material into a cell or nucleus a difficult task. Supercoiling helps in mitosis and meiosis, when the DNA must be compacted and segregated to daughter cells. Supercoiling is also necessary for RNA and DNA synthesis, since DNA must be unwound for DNA or RNA polymerase to act: the polymerase enzyme polymerizes new RNA or DNA on an existing RNA or DNA template during replication and transcription. Moreover, supercoiling plays a key role in osmotic induction during transcription, and the degree of coiling affects how strongly the DNA interacts with other molecules. Supercoiling changes the shape of DNA, and a supercoiled molecule moves faster than a relaxed one.

In addition, supercoiling eases manipulation and gives quick access to the coded information in a DNA strand. Because information about an organism's makeup is stored in the DNA, such accessibility is of great importance in DNA manipulation. When a cell copies a DNA molecule, the strand uncoils, is copied, and recoils.

To achieve the goal of this optimization project, inputs are entered on the MATLAB platform with a given population size and a chosen number of generations, and each command produces a corresponding output. The aim is to optimize the longest run of successive zeros on the DNA stretch, which is made up of 1s and 0s arranged in 18 rows and 2421 columns. After all the results are compared, the best one, 23419, is chosen as the most optimized output.

The results obtained in chapter three show different arrangements as the population size and the number of generations were increased step by step.
The traces of the iteration output are included in the appendix and give a clearer picture of the fitness values. Each output has a graph indicating where the best fitness value lies. The vertical axis represents the columns of the DNA stretch that have been optimized, and the horizontal axis represents the generations, up to the generation at which the optimization ends for the given input command. The best optimized result, 23419, comes from the 8th input command. Its graph shows a jump in the fitness values from the beginning to about the 30th generation, followed by a slower rise to about the 175th generation. From the 175th to the 400th generation, the fitness value stays at its highest, 23419.

Current research on the DNA supercoil has contributed greatly to improving lives in many respects. Most prominent is the application of GAs in optimization, which has yielded enormous benefits in electronics, economics, NP-hard problems such as the Travelling Salesman Problem (TSP), and medicine, for instance improved antibiotics and chemotherapy in cancer treatment.

Findings on the importance of DNA supercoiling show that it has a large influence on replication, transcription, and recombination: it promotes structural features of DNA that favour interaction with proteins, and it raises the local concentration of DNA sites that react with proteins. The prospects for optimizing DNA supercoils remain great, since the goal is always a better result; more research should therefore be done on computer-software-based projects.
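The operators named in this abstract (selection, single-point crossover, mutation) and a "longest run of successive zeros" fitness can be illustrated with a small sketch. The thesis itself used MATLAB and its code is not shown here, so the Python below is only a toy under stated assumptions: the function names, the roulette-wheel weighting, the elitism step, the 60-bit chromosome, and all parameter values are hypothetical choices for illustration, not the author's implementation or data.

```python
import random


def longest_zero_run(bits):
    """Length of the longest run of consecutive 0s in a list of 0/1 ints."""
    best = run = 0
    for b in bits:
        run = run + 1 if b == 0 else 0
        best = max(best, run)
    return best


def select(population, fitnesses, rng):
    """Fitness-proportionate (roulette-wheel) selection of one parent."""
    # Adding 1 keeps every weight positive, even when all fitnesses are 0.
    weights = [f + 1 for f in fitnesses]
    return rng.choices(population, weights=weights, k=1)[0]


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: head from parent_a, tail from parent_b."""
    point = rng.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def run_ga(length=60, pop_size=30, generations=40, rate=0.01, seed=0):
    """Toy GA; returns the best fitness seen in each generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        fitnesses = [longest_zero_run(c) for c in population]
        best = max(population, key=longest_zero_run)
        history.append(longest_zero_run(best))
        # Elitism (an added assumption): keep the best chromosome unchanged.
        children = [best]
        while len(children) < pop_size:
            a = select(population, fitnesses, rng)
            b = select(population, fitnesses, rng)
            children.append(mutate(crossover(a, b, rng), rate, rng))
        population = children
    return history


if __name__ == "__main__":
    print(run_ga())
```

Because the best chromosome is carried over unchanged, the per-generation best fitness in this sketch never decreases, which gives the same general shape as the rise-then-plateau curve described for the 8th input command. The thesis's actual encoding of the 18 × 2421 array and its exact fitness definition are not stated in the abstract, so this toy should not be expected to reproduce the reported value 23419.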
Keywords/Search Tags:DNA Supercoil, Fitness Function, Operon, Selection, Mutation