Weighted Co-expression Network Analysis To Identify Genes Involved In Breast Cancer Development

Posted on: 2021-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhao
GTID: 2504306518950209
Subject: Cell biology
Abstract/Summary:
Breast cancer is one of the most common cancers among women, with approximately 2 million new cases each year and more than 1 million deaths worldwide. Because of differences in regional culture, economic development, and medical standards, the age at onset, the stage at diagnosis, and patient survival vary greatly between regions. China has one fifth of the world's population and accounts for 12.2% of new breast cancer cases and 9.6% of breast cancer deaths worldwide. Globally, the continuous development of biomedical technology has diversified breast cancer treatment, which can greatly extend patients' survival time and reduce their pain. However, in most patients the cancer is already at an advanced stage when it is first discovered, and this late detection makes treatment very difficult. At present, the main treatments are surgery, chemotherapy, radiotherapy, hormone therapy, and targeted therapy. Although this diversity of treatments brings more hope to patients, both incidence and mortality are increasing year by year, and treatment becomes more difficult once the cancer has metastasized, leading to extremely low survival rates. Therefore, to improve survival and reduce patient suffering, there is an urgent need to better understand tumorigenesis and tumor metastasis at the molecular level so that better diagnostic and therapeutic tools can be developed.

At present, about 46 biomarker genes have been approved by the US FDA for the precise diagnosis of breast cancer. For example, BRCA1 is a tumor suppressor involved in cell replication and DNA synthesis. BRCA2 has functions similar to those of BRCA1, and mutations in both genes have been used to diagnose familial breast cancer; patients who carry BRCA1/BRCA2 mutations can be treated with PARP inhibitors. ERBB2 is a receptor tyrosine kinase whose high expression increases the risk of breast cancer recurrence. PIK3CA encodes a catalytic subunit of PI3-kinase, and its mutations can lead to breast cancer. CDH1 encodes E-cadherin, and its mutations may increase the risk of breast cancer. However, these target genes are relevant to only some patients, so it is critical to find new marker or target genes related to breast cancer.

The rapid development of next-generation sequencing has revolutionized traditional sequencing, making faster, more detailed, and more comprehensive analysis of a species' transcriptome and genome possible. In biomedicine, next-generation sequencing is combined with various omics approaches to study cancer, resulting in explosive growth of sequencing and other biological data. Computer-based bioinformatics has become the most common approach for mining and analyzing cancer sequencing data, and these data mining algorithms have become a new way to find key genes related to cancer. At present, breast cancer data mining mainly aims to identify prognostic genes. Such studies typically apply weighted co-expression network analysis (WGCNA) to breast cancer microarray (chip) data from the GEO database: the raw data are filtered by variance, relevant modules are identified to obtain key genes, and the key genes are then verified.

In this study, breast cancer RNA-seq data from the TCGA database were chosen for data mining. Compared with chip data, RNA-seq data cover more genes and are more sensitive for detecting differentially expressed genes with higher fold changes. To make the preliminary screening more biologically meaningful, this study used DESeq2 to perform differential expression analysis. WGCNA is a systems biology method. Compared with other data mining methods (such as cluster analysis, regression analysis, and factor analysis), it divides genes into highly related gene clusters more effectively. We then constructed a scale-free network that is more consistent with biological significance and explored the correlation between gene clusters and pathological information to determine how the key genes in those clusters change.

The key genes obtained from the analysis were verified with databases such as Cancer RNA-seq Nexus, the Human Protein Atlas, and UALCAN. We found that the hub genes are expressed at different levels in cancer and adjacent tissues. In addition, these hub genes are likely to play important roles in the cancer cell growth cycle, invasion, and migration. Their involvement in carcinogenesis and cancer development was further verified in cell-level experiments, in which we used CRISPR-Cas9 gene editing to obtain cells with the selected genes deleted. Cell scratch tests and cell growth cycle assays confirmed that deletion of hub genes, including NCAPG and NCAPH, affected cell migration and growth, since knocking down or interfering with these genes had inhibitory effects. These verified hub genes may be new biomarkers and novel therapeutic targets for detecting or treating breast cancer.
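
The core idea behind the weighted co-expression network mentioned in the abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the thesis's actual pipeline: it uses the standard unsigned WGCNA adjacency, a_ij = |cor(x_i, x_j)|^beta, on synthetic data, and ranks genes by connectivity (the sum of their adjacencies). The soft-threshold power `beta=6`, the toy expression matrix, and the function names are all assumptions made for the example; the full method additionally chooses beta by a scale-free fit and detects modules by clustering, which is omitted here.

```python
import numpy as np


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency from a samples x genes expression matrix.

    beta is the soft-threshold power (assumed value, not taken from the thesis).
    """
    corr = np.corrcoef(expr, rowvar=False)   # genes x genes Pearson correlation
    adj = np.abs(corr) ** beta               # soft thresholding keeps weak links small
    np.fill_diagonal(adj, 0.0)               # ignore self-connections
    return adj


def connectivity(adj):
    """Connectivity of each gene: the sum of its adjacencies to all other genes."""
    return adj.sum(axis=1)


# Synthetic toy data (assumption): genes 0-2 share a common driver, genes 3-5 are noise.
rng = np.random.default_rng(0)
n_samples = 50
driver = rng.normal(size=n_samples)
expr = np.column_stack(
    [driver + 0.1 * rng.normal(size=n_samples) for _ in range(3)]
    + [rng.normal(size=n_samples) for _ in range(3)]
)

adj = soft_threshold_adjacency(expr)
k = connectivity(adj)
hub = int(np.argmax(k))  # index of the most highly connected gene
print("connectivity:", np.round(k, 3))
print("hub candidate index:", hub)
```

In this toy setting the most connected gene falls inside the co-regulated group, which is the intuition behind calling highly connected module members "hub genes".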
Keywords/Search Tags: Breast cancer, Weighted co-expression network analysis (WGCNA), TCGA, RNA-seq, CRISPR-Cas9