
Early Diagnosis Of Hepatocellular Carcinoma Using Machine Learning Method

Posted on: 2022-02-12
Degree: Master
Type: Thesis
Country: China
Candidate: Z M Zhang
Full Text: PDF
GTID: 2504306524982489
Subject: Biophysics
Abstract/Summary:
Hepatocellular carcinoma (HCC) is a common malignant tumor and the third leading cause of cancer-related death worldwide. At present, early detection of HCC relies mainly on serum biomarkers and imaging techniques. Serum biomarkers are not sensitive enough: they fail to identify more than a third of HCC patients. The diagnostic sensitivity of imaging techniques for histologically well-differentiated tumors smaller than 2 cm in diameter is about 50%. Biopsy is therefore often used to examine suspicious early liver cancer tissue that imaging cannot resolve. Nevertheless, biopsy is not entirely reliable, because the sampled amount may be insufficient and the sampling location inaccurate. More than 80% of HCC cases develop from cirrhosis. Although several methods exist for distinguishing HCC from non-HCC cirrhosis tissues (cirrhosis tissues in patients without HCC, CwoHCC), their prediction accuracy is far from satisfactory. More accurate diagnostic models are therefore urgently needed to aid early HCC diagnosis in clinical settings and thereby improve HCC treatment and survival.

This study used gene expression profile datasets of 1091 HCC samples and 242 CwoHCC samples from different laboratories. The within-sample relative expression orderings (REOs) method was used to obtain gene pairs whose REO patterns remained highly stable in more than 95% of samples. If the REO pattern of a gene pair is kept in at least 95% of HCC samples but reversed in at least 95% of CwoHCC samples, the pair is referred to as a reverse gene pair and treated as a candidate signature for early diagnosis of HCC. We then used minimum redundancy maximum relevance (mRMR) and incremental feature selection to remove irrelevant features, and finally obtained a signature composed of 11 gene pairs. With a support vector machine (SVM) classifier and 5-fold cross-validation, the accuracy of this signature on the training set was 100%.

We further investigated the HCC recognition ability of the 11 gene pairs (TRMT112 and SF3B1, MFSD5 and COLEC10, FDXR and APC2, LAMC1 and CHST4, UBE4B and HGF, NCAPH2 and APC2, HSPH1 and MTHFD2, TMEM38B and AGO3, PLGRKT and COLEC10, HNF1A and APC2, ARPC2 and SF3B1) on several independent datasets. For biopsy sample data, all (100%) of 99 HCC samples and all (100%) of 44 CwoHCC samples were correctly classified by this signature. Specifically, all (100%) of 97 normal tissues in patients with HCC and all (100%) of 80 cirrhosis tissues in patients with HCC were classified as HCC. For surgical resection specimens, 89.63% of 926 HCC samples and all (100%) of 18 CwoHCC samples were correctly classified. In addition, 93.7% of 254 cirrhosis tissues in patients with HCC and all (100%) of 644 normal tissues in patients with HCC were classified as HCC. These results show that the 11 gene pairs can serve as a signature for early HCC diagnosis. The signature can distinguish HCC and its adjacent tissues (normal or cirrhosis tissues in patients with HCC) from CwoHCC samples even for minimal biopsy specimens and inaccurately sampled specimens, which makes it practical and effective for aiding early HCC diagnosis at the individual level.
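
The reverse gene pair screening described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names `reversed_reo_pairs` and `reo_features`, the toy gene labels, and the toy expression values are all hypothetical, and the 95% threshold is the only parameter taken from the abstract. The sketch only covers candidate screening and the encoding of each pair as a binary feature; the mRMR, incremental feature selection, and SVM steps are not shown.

```python
from itertools import combinations

import numpy as np


def reversed_reo_pairs(hcc, cwo, genes, threshold=0.95):
    """Return pairs (a, b) with a > b in >= threshold of HCC samples
    and a < b in >= threshold of non-HCC cirrhosis samples.

    hcc, cwo: arrays of shape (n_samples, n_genes), columns ordered as `genes`.
    """
    pairs = []
    for i, j in combinations(range(len(genes)), 2):
        # Fraction of samples in which gene i is expressed above / below gene j.
        hcc_gt = np.mean(hcc[:, i] > hcc[:, j])
        hcc_lt = np.mean(hcc[:, i] < hcc[:, j])
        cwo_gt = np.mean(cwo[:, i] > cwo[:, j])
        cwo_lt = np.mean(cwo[:, i] < cwo[:, j])
        if hcc_gt >= threshold and cwo_lt >= threshold:
            pairs.append((genes[i], genes[j]))
        elif hcc_lt >= threshold and cwo_gt >= threshold:
            pairs.append((genes[j], genes[i]))
    return pairs


def reo_features(expr, genes, pairs):
    """Encode each sample as 0/1 per pair: 1 if the HCC-type ordering a > b holds.

    Assumes `pairs` is non-empty.
    """
    idx = {g: k for k, g in enumerate(genes)}
    cols = [expr[:, idx[a]] > expr[:, idx[b]] for a, b in pairs]
    return np.column_stack(cols).astype(int)


# Toy data (hypothetical): rows are samples, columns are genes G1, G2, G3.
genes = ["G1", "G2", "G3"]
hcc = np.array([[5, 1, 3], [6, 2, 1], [7, 1, 9], [8, 2, 4]])
cwo = np.array([[1, 5, 3], [2, 6, 7], [1, 8, 0], [2, 9, 4]])

pairs = reversed_reo_pairs(hcc, cwo, genes)
print(pairs)  # [('G1', 'G2')]
print(reo_features(hcc, genes, pairs).ravel())
print(reo_features(cwo, genes, pairs).ravel())
```

Because each feature depends only on the ordering of two genes within one sample, it does not change under any monotonic per-sample rescaling of the expression values, which is one reason such features can be compared across datasets from different laboratories.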
Keywords/Search Tags:hepatocellular carcinoma, early diagnosis, relative expression orderings, minimum redundancy maximum relevance, support vector machine