
Research On Small Molecule Screening Method For Unstructured Mass Drug Data

Posted on: 2021-01-25
Degree: Master
Type: Thesis
Country: China
Candidate: Q F Xu
GTID: 2404330623973511
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of biochemistry, biomedical data are growing exponentially, and the number of small molecules from natural and synthetic drugs keeps increasing. PubChem and other commonly used drug databases contain nearly 100 million small molecules. Selecting small-molecule compounds that meet given requirements from such databases is an important research topic in drug design. However, the large volume of unstructured data on small-molecule compounds makes screening for lead compounds difficult in new drug development. Traditional experimental screening is time-consuming, labor-intensive, and costly. Using computer technology to screen small drug molecules has therefore become a hot topic in biomedical research. This approach faces two main challenges: the data on drug molecules are unstructured, and the number of drug molecules is massive. How to effectively select small drug molecules that contain similar molecular fragments from unstructured, massive drug data is thus an active question in bioinformatics research. This dissertation studies these problems and obtains the following results:

1) To facilitate computer screening, the dissertation studies, on the one hand, how to convert unstructured drug formulas into structured form and, on the other hand, how to use the image information of unstructured drug formulas. Based on the screening of hypoglycemic molecules at the State Key Laboratory of Biotherapy, Sichuan University, corresponding computer screening methods are proposed.

2) A drug molecule screening method based on a 2D model (SMS-2D) is proposed. First, the unstructured text data of the molecular fragments and of the small-molecule database are taken as input and converted into structured data using the connections between atoms; the converted data serve as the query small-molecule information and the small-molecule data set. Second, the degree to which each small molecule in the data set includes the query small molecule is calculated. Then, the small molecules that meet the requirements are output. Finally, the output is visualized to verify its correctness. The experimental results show that, by transforming unstructured text data into string data that a computer can process easily, SMS-2D can effectively screen out small-molecule compounds containing specific molecular fragments from massive text data, and that it can screen out small drug molecules that match a specific substructure to different degrees of similarity, as set by a threshold value.

3) A screening method based on image matching (SMS-IM) is proposed. First, the molecular fragment data set and the small-molecule image data set are taken as input. Second, each image in the data sets is read and its image matrix is processed. Then, the number of corresponding pixels in the area where the molecular fragment image covers the small-molecule image is counted. Finally, if the proportion of corresponding pixels is greater than or equal to the threshold value, the small-molecule image is deemed to contain the molecular fragment, and the image is output and saved to the result data set. The experimental results show that SMS-IM can effectively screen out small drug molecules containing specific molecular fragments from massive unstructured image data.
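The image-matching steps in 3) can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: it assumes binary images stored as nested lists (1 for a drawn pixel, 0 for background), slides the fragment over the molecule image at the same scale and orientation, and takes the proportion to be matched fragment pixels divided by all drawn fragment pixels. The names `best_overlap_ratio` and `screen`, the toy images, and the 0.8 threshold are hypothetical.

```python
from typing import Dict, List, Sequence

# Assumed representation: binary image, 1 = drawn (ink) pixel, 0 = background.
Image = Sequence[Sequence[int]]


def best_overlap_ratio(fragment: Image, molecule: Image) -> float:
    """Slide the fragment over the molecule image and return the largest
    share of fragment ink pixels that land on molecule ink pixels."""
    fh, fw = len(fragment), len(fragment[0])
    mh, mw = len(molecule), len(molecule[0])
    # Coordinates of the drawn pixels in the fragment image.
    ink = [(r, c) for r in range(fh) for c in range(fw) if fragment[r][c]]
    if not ink or fh > mh or fw > mw:
        return 0.0
    best = 0.0
    for top in range(mh - fh + 1):
        for left in range(mw - fw + 1):
            # Count fragment ink pixels covered by molecule ink at this offset.
            hits = sum(1 for r, c in ink if molecule[top + r][left + c])
            best = max(best, hits / len(ink))
    return best


def screen(fragment: Image, molecules: Dict[str, Image],
           threshold: float) -> List[str]:
    """Return the names of molecule images whose best ratio meets the threshold."""
    return [name for name, image in molecules.items()
            if best_overlap_ratio(fragment, image) >= threshold]


# Toy data (hypothetical): an L-shaped fragment and two 3x3 molecule images.
fragment = [[1, 1],
            [1, 0]]
mol_a = [[0, 0, 0],
         [0, 1, 1],
         [0, 1, 0]]   # contains the fragment exactly
mol_b = [[1, 0, 0],
         [0, 0, 0],
         [0, 0, 1]]   # shares at most one pixel with the fragment

selected = screen(fragment, {"mol_a": mol_a, "mol_b": mol_b}, threshold=0.8)
```

With these toy images, only `mol_a` passes the threshold. A sliding-window comparison like this one is sensitive to differences in scale and rotation between drawings, so real structure images would likely need normalization before such a ratio is meaningful.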
Keywords/Search Tags: small molecule screening, drug development, mass data, virtual screening, image matching