Genomic studies have revealed a close association between the highly folded three-dimensional (3D) chromatin structure and gene transcription and expression. Research on how chromatin 3D structure is organized in cancer cells, and on how specific 3D structural units relate to the regulation of gene expression, has therefore become a major focus of genomics. High-throughput sequencing has revealed that chromatin 3D structure differs among cells and that chromatin contains loop structures (chromatin loops) formed by interactions between gene promoters and regulatory elements. Using chromatin 3D structure information to identify cell types can help to explore how gene function and expression relate to chromatin spatial structure in cells, and identifying chromatin loops within cells can help uncover the relationship between chromatin 3D structural units and gene transcription. However, traditional experimental methods for identifying cell types and chromatin loops are time-consuming and labor-intensive. An efficient and robust computational method is therefore needed to predict cell types and chromatin loops from 3D genomic data, providing tools and technical support for biological researchers. Accordingly, this study investigates cell classification and chromatin loop prediction algorithms based on 3D genomic data. The main research content is as follows:

(1) Research on a deep learning-based cell type prediction algorithm

Existing cell classification algorithms based on single-cell Hi-C data have low accuracy and require the cell types of the resulting cell clusters to be labeled manually. To address this, this study proposes a deep learning-based cell type prediction algorithm (SCANN). The algorithm improves the construction of existing feature vectors based on the connection between intra-chromosomal interactions and genomic distance, and builds a deep neural network model for predicting cell types. Compared with classical machine learning algorithms and the existing scHiCluster algorithm, the improved feature vector raises the ACC value of the final prediction results by 1.8%. Moreover, on a human single-cell Hi-C dataset, the ACC value of the deep learning-based neural network classification model is 10.5% higher than that of the existing scHiCluster algorithm.

(2) Research on a deep learning-based chromatin loop prediction algorithm

Deep learning methods have potential advantages for predicting chromatin loops in 3D genomic data, and the accuracy of existing algorithms for predicting chromatin loops in whole-genome maps leaves room for improvement. This study therefore proposes a deep learning-based chromatin loop prediction algorithm (Be-1DCNN). The algorithm builds a deep neural network model that extracts feature information for chromatin loops and predicts them, and combines it with Bagging ensemble learning to improve the reliability and generalization ability of the model. Evaluation against classical machine learning algorithms and the published Peakachu algorithm shows that Be-1DCNN improves the MCC value of chromatin loop recognition by 4.3% compared with Peakachu, and generalizes well across cell lines and across sequencing platforms.
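
The two evaluation measures named above, ACC (accuracy) and MCC (Matthews correlation coefficient), are standard classification metrics. The following is a minimal sketch of how they are commonly computed, assuming binary labels for MCC (1 for a chromatin loop, 0 for a non-loop). The function names and the toy labels are illustrative assumptions and are not taken from the study.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def matthews_corrcoef(y_true, y_pred):
    """Binary MCC computed from the confusion-matrix counts.

    Assumes labels are 1 (positive, e.g. loop) and 0 (negative).
    Returns 0.0 when the denominator is zero, a common convention.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the study.
    y_true = [1, 1, 0, 0]
    y_pred = [1, 0, 0, 0]
    print("ACC:", accuracy(y_true, y_pred))
    print("MCC:", matthews_corrcoef(y_true, y_pred))
```

MCC uses all four confusion-matrix counts, so it remains informative when positives (loops) are much rarer than negatives, which is one common reason to report it alongside accuracy.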