Detection Of Topologically Associating Domains On Chromosomes Based On Feature Extraction

Posted on: 2023-11-08
Degree: Master
Type: Thesis
Country: China
Candidate: K Liu
GTID: 2530307070483504
Subject: Computer application technology
Abstract/Summary:
Topologically associating domains (TADs) are local chromatin regions with strong self-interaction and complex spatial structure. Studies have shown that TAD boundaries are closely related to some functional regulatory elements and are also linked to genetic diseases and cancers. Detecting TADs is therefore important for understanding chromatin spatial structure, gene expression, and gene regulation. Although many computational approaches are widely used to detect TADs, they still have shortcomings on high-dimensional, sparse, and noisy Hi-C data. These shortcomings include the loss of information about long-range interactions, differing assumptions about the number and size of TADs, and the failure to perform noise reduction on Hi-C matrices. Based on the features of TADs and their boundaries, this thesis therefore proposes two computational methods for detecting TADs in Hi-C data. The main contributions are as follows.

First, to address the problem that the enrichment of regulatory elements near the TAD boundaries identified by current methods is not high enough, a deep learning-based model called YOLOTAD is proposed. YOLOTAD formulates TAD detection as an object detection task and detects TADs using a model trained on simulated data. It uses networks such as CSPDarknet53 and a spatial pyramid pooling network to learn TAD features and semantic information at different levels. To improve detection accuracy, YOLOTAD then uses feature pyramid networks and path aggregation networks to fuse multi-scale features and semantic information from the shallow and deep layers of the network. On experimental Hi-C data, the enrichment of regulatory elements around the TAD boundaries identified by YOLOTAD is higher overall than that of other methods, which indicates that YOLOTAD has a degree of practical applicability. When TADs are analyzed as a whole, YOLOTAD exceeds most methods in the reproducibility of TADs, the proportion of TADs with significant DCC, and the accuracy of TADs on simulated data. The results show that YOLOTAD can accurately learn TAD features and apply them to TAD detection.

Second, to address the problem that high-dimensional, sparse, and noisy Hi-C matrices reduce the accuracy of TAD detection, a model called SNMFTAD is proposed. SNMFTAD detects TADs based on symmetric non-negative matrix factorization. It first applies a network enhancement technique to reduce noise in the Hi-C data, and then uses symmetric non-negative matrix factorization to extract low-dimensional graph embeddings from the denoised data. In this way, SNMFTAD learns accurate similarity features between nodes and applies them to detect TADs. On experimental Hi-C data, the enrichment of regulatory elements around the TAD boundaries identified by SNMFTAD is higher overall than that of YOLOTAD and other methods, which demonstrates that SNMFTAD is more applicable than YOLOTAD. On simulated data, SNMFTAD outperforms other methods overall when noise levels are low and remains in the lead when noise levels are high. The results show that SNMFTAD is more applicable and can effectively detect TADs.
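As an illustration of the symmetric non-negative matrix factorization step mentioned in the abstract, the following is a minimal sketch and not the thesis implementation. It assumes a damped multiplicative update rule, a toy block-structured contact map in place of real Hi-C data, and a simple argmax rule for reading domain boundaries from the embedding. The network enhancement denoising step is omitted, and all function names and parameters (`toy_contact_map`, `symnmf`, `rel_error`, `boundaries_from_embedding`, `beta`) are hypothetical.

```python
import numpy as np


def toy_contact_map(sizes, inside=10.0, outside=1.0):
    """Build a symmetric toy contact matrix with dense diagonal blocks.

    Assumption: each block stands in for one domain; real Hi-C data is
    far sparser and noisier than this.
    """
    n = sum(sizes)
    A = np.full((n, n), outside, dtype=float)
    start = 0
    for size in sizes:
        A[start:start + size, start:start + size] = inside
        start += size
    return A


def symnmf(A, k, n_iter=300, beta=0.5, seed=0, eps=1e-12):
    """Approximate A by H @ H.T with H >= 0 using multiplicative updates.

    Assumption: a damped multiplicative rule with beta = 0.5 is used here;
    the abstract does not say which solver the thesis uses.
    """
    rng = np.random.default_rng(seed)
    # Scale the random start so that H @ H.T is of the same order as A.
    H = rng.random((A.shape[0], k)) * np.sqrt(A.mean() / k)
    for _ in range(n_iter):
        numer = A @ H
        denom = H @ (H.T @ H) + eps  # eps avoids division by zero
        H *= (1.0 - beta) + beta * numer / denom
    return H


def rel_error(A, H):
    """Relative Frobenius reconstruction error of H @ H.T against A."""
    return np.linalg.norm(A - H @ H.T) / np.linalg.norm(A)


def boundaries_from_embedding(H):
    """Hypothetical boundary rule: a boundary is placed wherever the
    dominant embedding component changes between adjacent bins."""
    labels = H.argmax(axis=1)
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


A = toy_contact_map([5, 7, 4])
H = symnmf(A, k=3)
print(rel_error(A, H), boundaries_from_embedding(H))
```

Each row of `H` is a low-dimensional non-negative embedding of one genomic bin. Bins with similar contact profiles should receive similar rows, which is the property a domain caller can exploit. The number of components `k` is fixed by hand in this sketch and would need to be chosen or estimated for real data.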
Keywords/Search Tags:Topologically associating domains, Chromatin structure, 3D genome, Symmetric nonnegative matrix factorization, Deep learning