
Recognition Of Protein Subcellular Location From Single-cell Images Based On Multi-instance Learning And Pseudo Labels

Posted on: 2024-05-15
Degree: Master
Type: Thesis
Country: China
Candidate: X L Zhu
Full Text: PDF
GTID: 2530306926986809
Subject: Biomedical engineering
Abstract/Summary:
Proteins are fundamental to life and play an essential role in the life activities of organisms. In eukaryotic cells, subcellular structures such as cellular compartments and organelles provide the specific biochemical environments in which proteins perform their functions. It is generally believed that the functions of a protein are associated with its subcellular locations. For example, proteins localized in mitochondria may promote cellular aerobic respiration and energy production. Proteins can only function normally in specific subcellular structures, and abnormal subcellular localization is involved in disorders of cell metabolism that are closely associated with the pathogenesis of many human diseases. The study of protein subcellular localization therefore plays a key role in understanding the functional mechanisms of proteins and in the diagnosis and treatment of diseases, and is of great significance in biology and clinical medicine.

With the improvement of single-cell measurement techniques, there is growing awareness that cells differ individually and that the distribution of protein expression can vary across cells in the same tissue or cell line. Pinpointing protein subcellular locations in single cells is crucial for mapping the functional specificity of proteins and studying related diseases. Research on single-cell protein localization is still in its infancy, and most studies and databases do not annotate proteins at the cell level. For example, in the Human Protein Atlas database, an immunofluorescence image stained for a particular protein shows multiple cells, but the subcellular location annotation applies to the whole image and ignores differences between cells. Although spatial proteomics via immunofluorescence (IF) imaging has rapidly become an invaluable tool for bioinformatics research, methods that can quickly identify single-cell protein distributions in IF images are still lacking.

This thesis investigated methods for automatically recognizing protein subcellular location distributions in single cells from immunofluorescence images. Given the scarcity of single-cell annotated datasets and the lack of available automatic prediction models for single-cell protein distribution, we used large-scale immunofluorescence images with image-level subcellular location annotations to develop a deep-learning-based system that can accurately recognize protein localization in single cells. The system consists of an image-based model built on multi-instance learning and a cell-based model built on pseudo-label algorithms. The performance of each model was verified on two independent single-cell test sets manually labeled by experts. Experimental results showed that the system achieved high accuracy and robustness in comparison with traditional methods and state-of-the-art models, providing a new method for single-cell research.

Finally, to address the highly imbalanced data distribution and inaccurate annotation in protein datasets, we developed SImPLoc, a deep learning model based on multi-instance learning with improved performance. We introduced the pseudo-label method into the training of the multi-instance learning model and designed a robust asymmetric loss function to handle data imbalance and inaccurate labels. Experimental results showed that the model was more robust and outperformed current state-of-the-art models.
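As a rough illustration of how image-level annotations can supervise single cells, the sketch below treats each image as a bag of cells, pools per-cell class probabilities into an image-level prediction, and derives per-cell pseudo labels. The abstract does not state the pooling rule or the pseudo-label criterion, so max pooling, the fixed 0.5 threshold, and the function names `bag_probability` and `pseudo_labels` are all assumptions made for illustration, not the thesis's actual method.

```python
from typing import List


def bag_probability(cell_probs: List[List[float]]) -> List[float]:
    """Pool per-cell class probabilities into one image-level prediction.

    Assumption: max pooling, i.e. an image is positive for a class if at
    least one of its cells is. The thesis may use a different aggregator.
    """
    n_classes = len(cell_probs[0])
    return [max(cell[c] for cell in cell_probs) for c in range(n_classes)]


def pseudo_labels(
    cell_probs: List[List[float]],
    image_labels: List[int],
    threshold: float = 0.5,  # hypothetical confidence cut-off
) -> List[List[int]]:
    """Give each cell only those image-level labels it is confident about.

    A cell never receives a label that is absent from its parent image.
    """
    return [
        [
            int(image_labels[c] == 1 and cell[c] >= threshold)
            for c in range(len(image_labels))
        ]
        for cell in cell_probs
    ]


# Toy image with three cells and two classes (e.g. nucleus, mitochondria).
cells = [[0.9, 0.1], [0.8, 0.7], [0.2, 0.6]]
image_level = bag_probability(cells)
cell_level = pseudo_labels(cells, image_labels=[1, 1])
```

In this toy case the image-level annotation lists both classes, yet only the second cell is pseudo-labeled with both, which is the kind of intercellular difference that whole-image annotation hides.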
Keywords/Search Tags:Bioinformatics, Biological image processing, Multi-instance learning, Protein subcellular localization, Single cell analysis