
Inference Of Pseudo-time Trajectories And Regulatory Networks For Single-cell Data

Posted on: 2020-06-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Y Wei
Full Text: PDF
GTID: 1360330596481232
Subject: Applied Statistics
Abstract/Summary:
Recent advances in single-cell sequencing technology make it possible to measure the expression of thousands of genes in tens of thousands of cells in a single experiment. The result is a large number of static single-cell snapshot datasets that can reveal small differences between cells. The volume and complexity of single-cell data make it a paradigm of big data. A major challenge is how to extract useful biological information from large, heterogeneous single-cell datasets and to reveal hidden associations, interactions and dynamics between genes. Single-cell data analysis has become a hot topic in computational biology and has attracted the attention of many researchers. This thesis aims to infer pseudo-time trajectories and gene regulatory networks from single-cell data. Pseudo-time trajectory inference provides a new perspective for understanding cell fate determination and gives a theoretical explanation for the dynamic mechanisms of cell development. Modeling gene regulatory networks helps us understand gene function and the interaction mechanisms between genes inside cells, and promotes the study of disease pathology. The main research work of this thesis is as follows.

Firstly, we present SCOUT, a new pseudo-time trajectory construction algorithm based on landmark cells. The method first projects the data into a low-dimensional space using the locally linear embedding algorithm. We then propose a method for finding landmark points based on cell density, which finds more landmarks than the conventional cluster-center method. As a result, the minimum spanning tree is more stable and the impact of noise in single-cell data is reduced. In addition, we calculate single-cell pseudo-times using either the Apollonian circle projection or weighted distances, which improves the accuracy of SCOUT.

Secondly, we develop DTFLOW, a novel pseudo-time trajectory inference method based on diffusion propagation. The method first constructs a nearest-neighbor graph matrix from the Euclidean distances between each point and its adjacent points, then transforms the distance matrix into a Markov transition matrix, so that each data point can be transformed into a discrete distribution by the random walk with restart algorithm. We then construct a Bhattacharyya kernel matrix and devise new methods for dimension reduction and pseudo-time trajectory inference based on the Bhattacharyya distance matrix. On the one hand, the dimension reduction process adopts new techniques. On the other hand, the pseudo-time calculation for single cells does not depend on the dimension reduction step and can thus greatly reduce information loss. We also design a new method, reverse searching on the neighborhood graph, to identify multi-branching progression. The experimental results show that DTFLOW performs better than current state-of-the-art pseudo-time trajectory algorithms.

Thirdly, we design a new model for constructing gene regulatory networks of single cells based on the pseudo-time trajectory. We assume that the pseudo-times of the single cells have already been obtained with a pseudo-time trajectory inference algorithm. The model combines a top-down approach and a bottom-up approach to study the regulatory network: we first construct the gene regulatory network from the variation of each gene's expression level over pseudo-time, and then study the system dynamics with a differential equation model. The model takes into account the time dependence of gene regulation, and the experimental results show that it is very effective for gene regulatory network inference.

The thesis provides a new framework for single-cell data analysis, in which new algorithms and models are presented and meaningful explorations are made. It offers new perspectives for future research.
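The DTFLOW steps described in the abstract (nearest-neighbor graph, Markov transition matrix, random walk with restart, Bhattacharyya distance) can be illustrated with a minimal pure-Python sketch. This is not the thesis's implementation. The Gaussian edge weights, the bandwidth choice, the restart probability, the fixed iteration count, and the use of the distance to an assumed root cell as pseudo-time are all assumptions made for illustration, and every function name is hypothetical.

```python
import math


def knn_transition_matrix(points, k):
    """Row-stochastic Markov matrix from a symmetrised k-nearest-neighbour graph."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Assumed bandwidth: mean pairwise Euclidean distance (illustrative choice).
    sigma = sum(map(sum, dist)) / (n * (n - 1))
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for j in sorted(others, key=lambda j: dist[i][j])[:k]:
            w = math.exp(-((dist[i][j] / sigma) ** 2))  # assumed Gaussian weight
            weights[i][j] = weights[j][i] = w  # keep the graph symmetric
    transition = []
    for row in weights:
        total = sum(row)
        transition.append([w / total for w in row])  # normalise each row to 1
    return transition


def random_walk_with_restart(P, restart=0.3, n_iter=200):
    """Row i approximates the restart-walk distribution of a walker started at cell i."""
    n = len(P)
    S = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(n_iter):
        # One update: s_i <- (1 - restart) * s_i P + restart * e_i
        S = [[(1.0 - restart) * sum(S[i][m] * P[m][j] for m in range(n))
              + (restart if i == j else 0.0)
              for j in range(n)]
             for i in range(n)]
    return S


def bhattacharyya_distance(p, q):
    """-ln of the Bhattacharyya coefficient between two discrete distributions."""
    bc = min(1.0, sum(math.sqrt(a * b) for a, b in zip(p, q)))
    return math.inf if bc == 0.0 else -math.log(bc)


def pseudotime_from_root(S, root=0):
    """Assumed pseudo-time: Bhattacharyya distance of every cell to a chosen root cell."""
    return [bhattacharyya_distance(S[root], row) for row in S]


# Toy data: six "cells" placed along a line (made-up values, not real expression data).
cells = [(float(i), 0.0) for i in range(6)]
P = knn_transition_matrix(cells, k=2)
S = random_walk_with_restart(P, restart=0.3)
print(pseudotime_from_root(S, root=0))
```

Because the distances are computed directly between the per-cell distributions, this sketch never uses a low-dimensional embedding, which mirrors the abstract's point that the pseudo-time calculation does not depend on the dimension reduction step.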
Keywords/Search Tags: single-cell data analysis, pseudo-time trajectory inference, dimension reduction, manifold learning, kernel methods, regulatory network inference