Identifying Biomarkers For Cancer Prognosis And Early Detection Based On Ranks Of Gene Expressions

Posted on: 2015-02-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Zhang
Full Text: PDF
GTID: 1224330473956161
Subject: Biomedical engineering
Abstract/Summary:
With the wide application of genome-wide DNA microarrays, numerous biomarkers based on gene expression profiles provide important aids to cancer prognosis prediction and early detection. However, because of batch effects in microarray measurements and the heterogeneity of cancer patients, the predictive performance of biomarkers based on gene expression profiles often decreases greatly in inter-laboratory validation. To tackle this problem, this dissertation proposes a general rank-based algorithm. Because abundant data are available for breast cancer chemotherapy and early hepatocellular carcinoma, we applied this algorithm to prognosis prediction for breast cancer patients and to the early detection of hepatocellular carcinoma.

The main contributions are as follows:

1. Robustness evaluation of cancer biomarkers. For the same cancer type, biomarkers based on microarray profiling from different laboratories are highly inconsistent and not robust in classification performance. Based on a certain reasonable assumption (or molecular model), we evaluated the robustness of the most significant differentially expressed genes, identified separately as diagnostic or prognostic biomarkers in different datasets, in terms of functional relation and classification performance. Our results supported the assumption that the most significant differentially expressed genes identified separately in different datasets for a particular type of cancer tend to be significantly coexpressed and closely connected in an active protein-protein interaction subnetwork associated with the cancer. The active protein-protein interaction subnetwork built on this functional-relationship assumption was able to distinguish cancer samples from control samples in datasets across laboratories.

2. Response prediction for breast cancer treated with neoadjuvant chemotherapy.
For neoadjuvant taxane- and anthracycline-based chemotherapy for breast cancer, patients with pathological complete response (pCR) have a favorable overall survival compared with patients with residual disease (RD). A number of pCR predictors based on gene expression profiles have therefore been proposed to guide neoadjuvant chemotherapy, but most of them have not been independently validated in inter-laboratory datasets. To generate a robust pCR predictor, we developed a CTSP (Combinational Top Scoring Pairs) method based on relative expression orders. First, we extracted gene pairs that had opposing relative expression orders in patients with pCR and in those with RD. Then, based on certain decision rules, we generated the pCR predictor from a combination of these gene pairs. This pCR predictor achieved sensitivities of 74% and 86% and specificities of 71% and 68% in two other independent datasets from multiple laboratories, and these results were much better than the performances of three previously reported predictors.

3. Prognosis prediction for breast cancer treated with neoadjuvant chemotherapy. Considering that the pCR rate is quite low and that patients with minimal RD also tend to have a good prognosis, we then developed a prognosis predictor as a complement to the pCR predictor. Because intrinsic risk factors such as estrogen receptor status and pathological stage also affect a patient's prognosis after chemotherapy, we proposed to predict prognosis by combining these risk factors with the residual level of the tumor after chemotherapy. First, we used the CTSP algorithm to predict the residual level of the tumor after chemotherapy. Then, using Cox regression analysis, we obtained a risk indicator for each patient by combining the predicted residual level of the tumor after chemotherapy with the clinical variables.
The results showed that the risk indicator could effectively divide the patients into good and poor prognosis groups: the difference in the 3-year DRFS (distant disease-free survival) rate between the two groups was 17%, and the two groups showed significantly different survival (log-rank test, p=0.001).

4. Early detection of hepatocellular carcinoma in high-risk populations. Small equivocal liver lesions detected by imaging techniques need biopsy confirmation in the early diagnosis of hepatocellular carcinoma (HCC). However, the pathological changes can be indistinguishable to pathologists, and when the lesion is as small as it is in early HCC, one cannot be certain that the biopsy sample actually came from the tumor lesion. To tackle these difficulties in the diagnosis of early HCC, we proposed a method to identify early HCC and precancerous lesions based on the relative gene expressions of cirrhosis tissue adjacent to HCC. First, we detected gene pairs with opposite relative expression orders in cirrhosis tissues between patients with and without HCC. From these, the pairs that had the same relative expression orders in both carcinoma and cirrhosis tissues of patients with HCC were then selected using large-scale HCC samples. Finally, the predictor for discriminating HCC tissues and cirrhosis tissues adjacent to HCC from the cirrhosis tissues of patients without HCC was built from the selected pairs.
The results showed that our classifier achieved robust classification performance in datasets from different platforms and different laboratories, and that it could serve as an effective tool to aid early HCC diagnosis.

In conclusion, this dissertation developed a CTSP algorithm to address the unstable performance of gene expression biomarkers in inter-laboratory validation from two aspects. On the one hand, relative expression orders were used as robust biomarkers instead of gene expression measurements, because the relative expression orders within each sample are not affected by individual differences in detection conditions or by linear inter-array normalization. On the other hand, by using the stable relative expression orders in a large-sample control group (e.g., normal group, RD group) as a reference, the algorithm can identify relative expression changes that take place in only some samples of the observation group (e.g., tumor group, pCR group). This provides a new perspective and method for applications (e.g., chemotherapy response prediction) with high biological variance under complex disease conditions.
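The rank-based idea summarized above can be illustrated with a minimal sketch. It follows the generic top-scoring-pairs scheme (score each gene pair by how differently its within-sample order occurs in two groups, then let the selected pairs vote). It is not the dissertation's CTSP decision rules, which the abstract does not specify. The function names, the group labels "A" and "B", and the toy expression values are all hypothetical.

```python
from itertools import combinations


def pair_scores(group_a, group_b):
    """Score each gene pair (i, j) by the difference between the two groups
    in how often 'gene i < gene j' holds. Each sample is a list of values."""
    n_genes = len(group_a[0])
    scores = {}
    for i, j in combinations(range(n_genes), 2):
        freq_a = sum(s[i] < s[j] for s in group_a) / len(group_a)
        freq_b = sum(s[i] < s[j] for s in group_b) / len(group_b)
        scores[(i, j)] = freq_a - freq_b  # +1 or -1 means a complete reversal
    return scores


def select_pairs(scores, k):
    """Keep the k pairs with the largest absolute score."""
    ranked = sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:k]


def predict(sample, pairs):
    """Majority vote: a pair votes 'A' when the sample shows the ordering
    that was more frequent in group A. Use an odd number of pairs."""
    votes_a = 0
    for (i, j), score in pairs:
        if (sample[i] < sample[j]) == (score > 0):
            votes_a += 1
    return "A" if 2 * votes_a > len(pairs) else "B"


# Hypothetical toy data: 3 samples per group, 4 genes per sample.
group_a = [[1.0, 5.0, 3.0, 2.0], [2.0, 6.0, 4.0, 1.0], [1.5, 4.0, 3.5, 2.5]]
group_b = [[5.0, 1.0, 3.0, 4.0], [6.0, 2.0, 2.5, 5.0], [4.0, 1.5, 3.0, 3.5]]

pairs = select_pairs(pair_scores(group_a, group_b), k=3)

# A sample measured on a different scale, and the same sample after a
# linear rescaling (standing in for a batch effect or normalization step).
raw = [10, 50, 30, 20]
rescaled = [2.5 * x + 7 for x in raw]
print(predict(raw, pairs), predict(rescaled, pairs))
```

Because every vote depends only on whether one gene is lower than another within the same sample, any increasing transformation of a sample's values, such as the linear rescaling above, leaves the prediction unchanged. This is the property the conclusion relies on for inter-laboratory robustness.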
Keywords/Search Tags: prognosis biomarker, neoadjuvant chemotherapy, early diagnosis, microarray gene expression profile, cancer