| Motivation: Somatic mutations, alterations of the DNA that accumulate during the lifespan of an individual, are the main cause of cancer. With advances in high-throughput sequencing techniques, many large-scale cancer research projects have generated data for thousands of cancer genomes. The major practical problem is to determine which mutations are drivers, that is, mutations that confer a selective advantage on cancer cells. The heterogeneity of cancer is a major obstacle to cancer diagnosis and treatment. Prioritizing combinations of driver genes that are mutated in most patients with a specific cancer, or with a subtype of that cancer, is a promising way to tackle this problem. Methods: Here, we developed an empirical algorithm, named Path MG, to identify common and subtype-specific mutated sub-pathways for a cancer. Results: By analyzing mutation data from 408 lung cancer samples (Lung-data1), we identified three sub-pathways, each covering at least 90% of samples, as the common sub-pathways of lung cancer. These sub-pathways were enriched with mutated cancer genes and drug targets and were validated in two independent datasets (Lung-data2 and Lung-data3). In particular, applying Path MG to two major subtypes of lung cancer, lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LSCC), we identified 13 subtype-specific sub-pathways with a mutation frequency difference of at least 0.25 between LUAD and LSCC samples in Lung-data1, and 12 of the 13 sub-pathways were reproducible in Lung-data2 and Lung-data3. Similar analyses were done for colorectal cancer. Conclusions: Overall, Path MG provides a novel tool to identify potential common and subtype-specific sub-pathways for a cancer, which can provide candidates for cancer diagnosis and sub-pathway-targeted treatments. |
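
The two thresholds quoted in the Results (at least 90% sample coverage for a common sub-pathway, and a mutation frequency difference of at least 0.25 between subtypes) can be illustrated with a minimal sketch. This is not the Path MG algorithm itself, which the abstract does not describe. It assumes that a sample counts as covered by a sub-pathway when at least one gene of that sub-pathway is mutated in the sample. The function names, gene labels, and toy samples are all hypothetical.

```python
from typing import Dict, Set

# Assumption: each sample is represented by the set of genes mutated in it.
Samples = Dict[str, Set[str]]


def mutation_frequency(sub_pathway: Set[str], samples: Samples) -> float:
    """Fraction of samples with at least one mutated gene in the sub-pathway."""
    if not samples:
        return 0.0
    covered = sum(1 for genes in samples.values() if genes & sub_pathway)
    return covered / len(samples)


def is_common(sub_pathway: Set[str], samples: Samples,
              min_coverage: float = 0.90) -> bool:
    """True if the sub-pathway covers at least min_coverage of all samples."""
    return mutation_frequency(sub_pathway, samples) >= min_coverage


def is_subtype_specific(sub_pathway: Set[str], group_a: Samples,
                        group_b: Samples, min_diff: float = 0.25) -> bool:
    """True if the mutation frequencies of the two subtypes differ by >= min_diff."""
    diff = abs(mutation_frequency(sub_pathway, group_a)
               - mutation_frequency(sub_pathway, group_b))
    return diff >= min_diff


# Toy data (made up for illustration; not taken from any real dataset).
luad: Samples = {"s1": {"GENE_A"}, "s2": {"GENE_A", "GENE_C"},
                 "s3": {"GENE_B"}, "s4": {"GENE_C"}}
lscc: Samples = {"s5": {"GENE_C"}, "s6": {"GENE_D"},
                 "s7": {"GENE_B"}, "s8": {"GENE_D"}}
all_samples: Samples = {**luad, **lscc}
sub_pathway = {"GENE_A", "GENE_B"}

print(mutation_frequency(sub_pathway, luad))    # 0.75
print(mutation_frequency(sub_pathway, lscc))    # 0.25
print(is_subtype_specific(sub_pathway, luad, lscc))  # True
print(is_common(sub_pathway, all_samples))      # False
```

In this toy case the sub-pathway is mutated in 3 of 4 LUAD samples and 1 of 4 LSCC samples, so it meets the 0.25 difference criterion but, at 50% overall coverage, would not qualify as a common sub-pathway.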