| Estrogen receptor-positive (ER-positive) and ER-negative breast cancers are two clinically important subtypes that respond to different adjuvant therapies. For example, many ER-positive patients are sensitive to tamoxifen endocrine therapy, whereas almost all ER-negative patients are not. Previous studies have identified genes differentially expressed between the two subtypes and developed several predictors of tamoxifen sensitivity in ER-positive breast cancer patients. However, the predictive power of these predictors often decreased markedly in independent inter-laboratory validation cohorts. This dissertation therefore aims to develop an accurate and robust predictor of tamoxifen sensitivity for ER-positive breast cancer patients. First, in a microarray dataset of 519 breast cancer and 63 normal control samples, two classes of genes dysregulated in both ER-positive and ER-negative cancer versus normal control were identified: (i) genes dysregulated in the same direction but to a different extent, and (ii) genes dysregulated in opposite directions. Both classes were then validated in an RNA-sequencing dataset of 281 breast cancer and 49 normal control samples from independent cohorts. The genes dysregulated to a larger extent in ER-positive cancer were enriched in carbohydrate and lipid metabolic processes, whereas those dysregulated to a larger extent in ER-negative cancer were enriched in cell proliferation-associated processes. The genes oppositely dysregulated in the two subtypes were significantly enriched with known cancer genes and tended to collaborate closely with these cancer genes. Furthermore, these genes could contribute to carcinogenesis of the two subtypes through rewiring different subpathways. 
Despite the extensive differences between the gene expression profiles of the two subtypes, the profiles of some ER-positive patients were similar to those of ER-negative patients, which suggested that it is possible to predict tamoxifen sensitivity of ER-positive patients from gene expression profiles. Next, from a large integrated dataset of 420 normal control and 1,129 ER-positive breast tumor samples, gene pairs with a stable relative ordering of expression (ROE) in normal controls and significantly reversed ROEs in ER-positive tumors were identified. In a cohort of 292 ER-positive patients who received tamoxifen monotherapy for 5 years, the gene pairs associated with relapse risk of these patients were identified. Based on a classification rule, i.e., patients with significantly more ROE-reversed gene pairs are insensitive to tamoxifen, an optimized gene pair subset for predicting tamoxifen sensitivity of ER-positive patients was extracted using a genetic algorithm. A predictor of tamoxifen sensitivity for ER-positive patients was then developed from this gene pair subset and the classification rule. Its performance was validated in 2 large multi-laboratory cohorts (N = 250 and 248, respectively) of ER-positive patients who received tamoxifen alone for 5 years. In the first validation cohort, the patients predicted to be tamoxifen sensitive had a 10-year relapse-free survival (RFS) of 91% (95% confidence interval [CI], 85%-97%), with an absolute risk reduction of 34% (95% CI, 17%-51%). The patients predicted to be tamoxifen insensitive had a significantly higher relapse risk than those predicted to be tamoxifen sensitive (hazard ratio = 4.99; 95% CI, 2.45-10.17; P = 9.13e-07), with a 10-year relapse rate of 43%. Similar performance was achieved in the second validation cohort. 
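
The ROE-based classification rule described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the gene names, the expression values, and the fixed `threshold` are hypothetical, and a simple count cutoff stands in for the "significantly more reversed pairs" criterion, whose exact statistical form is not given here.

```python
from typing import Dict, List, Tuple

# A gene pair (a, b) encodes the stable ordering seen in normal tissue:
# expression of a is higher than expression of b.
GenePair = Tuple[str, str]


def count_reversals(profile: Dict[str, float], pairs: List[GenePair]) -> int:
    """Count the pairs whose within-sample ordering is reversed (a < b)."""
    return sum(1 for a, b in pairs if profile[a] < profile[b])


def predict_sensitivity(
    profile: Dict[str, float], pairs: List[GenePair], threshold: int
) -> str:
    """Label a patient as tamoxifen insensitive when the number of reversed
    pairs reaches the cutoff (hypothetical stand-in for the significance test)."""
    if count_reversals(profile, pairs) >= threshold:
        return "insensitive"
    return "sensitive"


# Toy data: hypothetical gene names and expression values.
pairs = [("G1", "G2"), ("G3", "G4"), ("G5", "G6")]
patient_x = {"G1": 5.0, "G2": 3.0, "G3": 7.1, "G4": 2.2, "G5": 4.0, "G6": 6.5}
patient_y = {"G1": 1.0, "G2": 3.0, "G3": 2.0, "G4": 2.2, "G5": 4.0, "G6": 6.5}

for name, profile in [("patient_x", patient_x), ("patient_y", patient_y)]:
    n_reversed = count_reversals(profile, pairs)
    label = predict_sensitivity(profile, pairs, threshold=2)
    print(name, n_reversed, label)
```

Because the rule compares expression values only within a single sample, it does not depend on the absolute measurement scale, which is consistent with the robustness across laboratories reported for the predictor.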
After adjusting for traditional clinicopathologic parameters, the prediction remained significantly associated with relapse risk in both validation cohorts. Furthermore, the predictor performed well in both node-negative and node-positive subsets, and its predictions were highly concordant in technical duplicate samples. In contrast, 2 previously proposed absolute expression-based predictors performed worse in the 2 validation cohorts and showed lower classification concordance in technical duplicate samples than the predictor proposed in this dissertation. In summary, this dissertation studied the genes dysregulated in both ER-positive and ER-negative breast cancers and built an ROE-based predictor of tamoxifen sensitivity for ER-positive breast cancer patients. The proposed predictor can accurately and robustly predict tamoxifen sensitivity of ER-positive patients in independent validation cohorts and can identify the patients with a high probability of 10-year RFS following tamoxifen therapy. |