The purpose of this dissertation is to develop new statistical tools for mining big data in plant sciences. In particular, the dissertation consists of four inter-related projects to address various methodological and computational challenges in phylogenetic methods. Project 1 aims to systematically test different optimization tools and provide useful strategies to improve optimization in practice. Project 2 develops a new R package rPlant, which provides a friendly and convenient toolbox for users of iPlant . Project 3 presents a fast and effective group-screening method to identify important genetic factors in GWAS, with theoretical justifications and nice asymptotic properties. Project 4 develops a new statistical tool to identify gene-gene interactions, with the ability of handling the interactions between groups of covariates. |