Research on Cell Trajectory Inference Algorithms Based on scRNA-seq Data

Posted on: 2022-12-27
Degree: Master
Type: Thesis
Country: China
Candidate: C Guo
Full Text: PDF
GTID: 2480306779471594
Subject: Computer Software and Application of Computer
Abstract/Summary:
The rapid development of scRNA-seq technology allows researchers to characterize cell types, states, and transitions during dynamic biological processes at single-cell resolution. In recent years, researchers have used scRNA-seq data to study changes in cell state during development and to better understand cell differentiation, development, disease, and other complex cellular processes. Inferring the pseudo-time trajectory of cells from scRNA-seq data has become one of the key tasks in studying cell state transitions. Many cell trajectory inference algorithms already exist. They generally reconstruct the pseudo-time trajectory from the similarity between cells and then infer the pseudo-time ordering of the cells along that trajectory. Because scRNA-seq data are high-dimensional and noisy, designing an efficient, accurate, and robust algorithm remains a major challenge. To address these problems, this thesis proposes two methods.

First, this thesis proposes scTite, a new cell trajectory inference method based on transition entropy, to identify transitional states and reconstruct cell trajectories from scRNA-seq data. Taking the continuity of cellular processes into account, this thesis introduces a new metric, transition entropy, which measures the uncertainty of a cell's membership in different cell clusters and is used to identify cell states and transition cells. scTite applies different strategies to the identified cell states and to the transition cells, then combines the results into a detailed cell trajectory. For the identified cell clusters, scTite uses the Wasserstein distance between probability distributions to compute the distance between clusters and constructs a minimum spanning tree (MST). For the transition cells, scTite uses signaling entropy and the partial correlation coefficient to determine transition paths, each of which contains a group of transition cells with the largest similarity. The transition paths and the MST are then combined to infer a refined cell trajectory. This thesis validates scTite on four real scRNA-seq datasets and a more complex integrated dataset and compares it with several advanced algorithms. Extensive experiments show that scTite reconstructs cell trajectories and pseudo-time orderings more accurately than the other algorithms.

Second, this thesis proposes TIRV, a new cell trajectory inference method based on RNA velocity. Using the RNA velocity and the Euclidean distance of cells, TIRV first constructs weighted directed KNN graphs under different values of K. For each KNN graph, the Louvain algorithm is used for community detection to determine the cell cluster labels, the transition matrix between cell clusters, and the minimum spanning tree between clusters. TIRV then calculates the branching probability of cells and introduces a branching entropy to determine both the branching point in the differentiation process and the most suitable KNN graph. Based on that branching point and the selected KNN graph, TIRV constructs an MST and infers a group of navigation-point cells on it. Finally, TIRV takes the shortest path between the navigation-point cells as the final pseudo-time trajectory. This thesis evaluates TIRV on four simulated scRNA-seq datasets and compares it with several advanced trajectory inference algorithms. It also analyzes the influence of the KNN graph on the accuracy of pseudo-time ordering and verifies the ability of TIRV to select an appropriate value of K.
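
As a rough illustration of the transition-entropy idea described in the abstract, the sketch below scores each cell by the Shannon entropy of its cluster-membership probabilities and then links clusters with a minimum spanning tree. The abstract does not give the exact formula, so the Shannon form, the function names, the 0.5 threshold, and the toy distance matrix (standing in for Wasserstein distances between clusters) are all assumptions made for illustration, not the thesis's implementation.

```python
import math


def transition_entropy(membership):
    """Shannon entropy of one cell's cluster-membership weights.

    ASSUMPTION: the thesis's transition entropy is approximated here by
    plain Shannon entropy over normalized membership weights.
    """
    total = sum(membership)
    probs = [m / total for m in membership if m > 0]
    return -sum(p * math.log(p) for p in probs)


def split_cells(memberships, threshold=0.5):
    """Label each cell 'state' (low entropy) or 'transition' (high entropy).

    ASSUMPTION: the fixed threshold is hypothetical; the abstract does not
    say how the cut-off is chosen.
    """
    return [
        "transition" if transition_entropy(m) > threshold else "state"
        for m in memberships
    ]


def minimum_spanning_tree(dist):
    """Prim's algorithm on a symmetric distance matrix; returns edges (i, j)."""
    n = len(dist)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Pick the cheapest edge leaving the current tree.
        i, j = min(
            ((a, b) for a in in_tree for b in range(n) if b not in in_tree),
            key=lambda e: dist[e[0]][e[1]],
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges


# Toy data: three cells with soft memberships over three clusters.
memberships = [[0.98, 0.01, 0.01], [0.5, 0.5, 0.0], [0.0, 0.02, 0.98]]
labels = split_cells(memberships)

# Toy cluster-to-cluster distances (placeholder for Wasserstein distances).
cluster_dist = [[0, 1, 4], [1, 0, 2], [4, 2, 0]]
tree = minimum_spanning_tree(cluster_dist)
```

In this toy example the second cell is split evenly between two clusters, so it has the highest entropy and is the one flagged as a transition cell, while the tree connects the clusters along their two shortest links.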
Keywords/Search Tags: Cell trajectory inference, Transition entropy, Wasserstein distance, Pseudo-time ordering, RNA velocity