Application Of Machine Learning In Arsenic Speciation Analysis And Toxicity Prediction

Posted on: 2023-08-04
Degree: Master
Type: Thesis
Country: China
Candidate: Z W Li
GTID: 2531306809994919
Subject: Control Engineering
Abstract/Summary:
Arsenic is a ubiquitous metalloid that pollutes the environment and harms human health through both natural processes and human activities. Changes in arsenic speciation affect its transformation pathways and its toxic effects on humans. Analyzing and identifying arsenic species and their toxicity is therefore vital for understanding the distribution, behavior and toxicity mechanisms of different arsenic species in the environment. In this study, high performance liquid chromatography-inductively coupled plasma mass spectrometry (HPLC-ICP-MS) was used for arsenic speciation analysis, and machine learning was used to predict the chromatographic retention times of other arsenic species reported in the literature. This supports the qualitative and quantitative analysis of arsenic compounds in environmental samples and biological metabolites. In addition, in vitro cytotoxicity tests were used to determine the inhibitory activity of different arsenic species on hepatocellular carcinoma cells (HepG2) and bladder cancer cells (T24), and machine learning was used to predict the cytotoxicity toward HepG2 and T24 of other arsenic species reported in the literature, providing a basis for studying the metabolism and distribution of arsenic species in the body. The work is divided into the following three parts:

(1) Predicting the retention time of arsenic species in HPLC with machine learning. The toxicity of arsenic is closely related to its speciation. Arsenic can be analyzed qualitatively and quantitatively by HPLC-ICP-MS. However, many arsenic species occur in the environment and few standards are available, which makes it difficult to analyze species that lack standards. We therefore used a Thermo U3000 HPLC (Dionex IonPac AS7) coupled to ICP-MS (Thermo iCAP QC), with ammonium carbonate as the mobile phase, to separate 22 arsenic species standards within 20 minutes, including inorganic arsenic, phenylarsine and sulfur-containing arsenic compounds. The retention time of phenylarsine was longer than that of inorganic arsenic. Quantitative structure-activity relationship (QSAR) models built with different descriptor sets were evaluated to compare the effect of the molecular descriptors on model performance. Finally, four Mordred descriptors (SpMax_A, Mor02m, AATSC2i, Mor29m) were selected to construct the model, and the adaptive boosting (AdaBoost) algorithm was used to optimize its performance. The best-performing QSAR model (test-set R² = 0.95, MAE = 34) was then used to predict the HPLC retention times of 20 arsenic compounds reported in the literature. This supports the speciation analysis of unknown arsenic compounds in the environment.

(2) Predicting the inhibitory activity of arsenic compounds on liver cancer cells (HepG2) with machine learning. Arsenic compounds in the environment enter the body by many routes and undergo a series of biological transport and transformation processes there. The liver is one of the important target organs of arsenic carcinogenesis. Studying the toxicity of different arsenic species to HepG2 cells can provide a basis for further clarifying the mechanism of arsenic-induced liver injury. Toxicity experiments with 19 arsenic species standards on HepG2 cells showed that phenylarsine oxide (PAO) was the most toxic, and that trivalent arsenic was generally more toxic than pentavalent arsenic. Multiple linear regression (MLR) models built with different descriptor sets were evaluated to compare the effect of the molecular descriptors on model performance. Finally, five alvaDesc descriptors (MATS6e, VE2sign_Dz(Z), GATS1m, Eig10_AEA(bo), totalcharge) were selected to construct the model, and the random forest (RF) algorithm was used to optimize its performance. The best-performing QSAR model (test-set R² = 0.81, MAE = 0.42) was used to predict the toxic effects on HepG2 cells of 18 arsenic compounds reported in the literature. This provides useful information on the toxic effects of different arsenic-containing compounds on HepG2 cells.

(3) Predicting the inhibitory activity of arsenic compounds on bladder cancer cells (T24) with machine learning. The bladder is one of the main organs for the metabolism of arsenic compounds in vivo, and also one of their main accumulation organs and toxic target organs. Arsenic-induced urinary toxicity has attracted wide attention from researchers. Studying the inhibitory activity of different arsenic species on T24 cells is important for understanding the mechanism of arsenic-induced urinary tract injury. In toxicity experiments with 19 arsenic species standards on T24 cells, the order of toxicity was iAsIII > iAs > MMA = DMA, while arsenobetaine (AsB), arsenocholine (AsC) and 4-hydroxyphenylarsonic acid showed no significant toxicity. MLR models built with different descriptor sets were evaluated to compare the effect of the molecular descriptors on model performance. Five descriptors screened with Mordred, including gats1z, atsc3i and jgi3, were selected to build the model, and its performance was optimized with the CatBoost regression (CatBoost) algorithm. The best-performing QSAR model (test-set R² = 0.79, MAE = 0.65) was established to predict the toxic effects of different arsenic compounds on T24 cells. The results showed that arsenic species had inhibitory activity on T24 cells.
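
Each of the three models above is summarized by a test-set R² and a mean absolute error (MAE). As a reading aid, the following minimal sketch shows how these two metrics are conventionally computed from observed and predicted values. It assumes the standard textbook definitions; the abstract does not state which implementation the thesis used. The function names and the five data points are hypothetical illustrations, not data or code from the thesis.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out retention times in arbitrary time units
# (illustrative values only, NOT measurements from the thesis).
observed = [120.0, 300.0, 450.0, 700.0, 980.0]
predicted = [150.0, 280.0, 470.0, 660.0, 1000.0]

mae = mean_absolute_error(observed, predicted)
r2 = r_squared(observed, predicted)
print(f"MAE = {mae:.1f}, R2 = {r2:.3f}")
```

MAE is expressed in the units of the predicted quantity, so the MAE reported for the retention-time model is not directly comparable with the MAEs of the two cytotoxicity models. R² is dimensionless and can be compared across all three.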
Keywords/Search Tags:Arsenic, Machine learning, HPLC-ICP-MS, HepG2, T24