Mining genes from crop genebanks

Posted on: 2001-06-21
Degree: M.Sc
Type: Thesis
University: Queen's University at Kingston (Canada)
Candidate: Addala, Venkata Krishna Raju
Full Text: PDF
GTID: 2468390014456515
Subject: Computer Science
Abstract/Summary:
In this research we examine how data mining can be used to optimize the management of crop genebanks. In particular, we demonstrate how data mining techniques can be used to identify useful genes in genebanks. The base collection used for this study is the chickpea genebank maintained at the International Center for Agricultural Research in the Dry Areas (ICARDA), Aleppo, Syria. The study examines both the data mining techniques that can be used to discover useful genes and the role that various sampling strategies can play in identifying subsets that serve as core collections. The techniques used were decision trees, rule inducers and neural networks. Nine sampling strategies were used to derive core collections from the ICARDA collection at five sample sizes. The relative efficiency of the sampling methods was determined using five criteria on four stress-resistant genes. We also demonstrate the usefulness of boosting in data mining.

Our results clearly indicate that data mining can help the genebank curator manage genebanks better by identifying desirable genes. The choice of an appropriate sampling method depends on the frequency distribution and properties of the target gene. The intrinsic value and the reinforced value of the core collections increased linearly with sample size. However, the predictive power did not increase with sample size.
Keywords/Search Tags:Mining, Genebanks, Genes, Used