Research On Molecular Retrieval And Drug Candidates Recognition In Big Data Environment

Posted on:2017-01-11

Degree:Master

Type:Thesis

Country:China

Candidate:X Sun

GTID:2308330503984352

Subject:Engineering, software engineering

Abstract/Summary:

In recent years, with the steady growth of China's national economy and progress in the chemical industry, the continued development of combinatorial chemistry and high-throughput screening has produced a large number of compounds: diverse molecules can now be synthesized in large quantities in a short time. However, knowledge of their molecular and functional properties accumulates relatively slowly, which in some cases hinders research in areas such as computational chemistry, cheminformatics, and drug design. Traditional retrieval methods have had some success and can handle small-scale molecular data. With the explosive growth in the number of known molecules, however, the computing power of traditional chemical software is limited, and the service rate for molecular data has become the bottleneck. At the same time, research on optical materials and stealth materials focuses on the molecular refractive index, so retrieving molecules by refractive index is of practical importance. Finally, how to select high-quality drug candidates is an active topic in drug research.

This thesis studies molecular retrieval and recognition, and the work is divided into two parts. In the first part, traditional retrieval methods are analyzed in the big data environment. An attribute-selection VF2 algorithm is proposed, and a distributed molecular retrieval model is established. The experimental results show that the model effectively retrieves compounds with specific information in the big data environment while lowering retrieval complexity. Building on the characteristics of molecular properties and an analysis of classical efficient retrieval algorithms, the continuous refractive index is discretized with a width-based algorithm, a high-speed hash index is then built, and a distributed massive-data retrieval system based on a consistent hash function is implemented. This effectively reduces the amount of data that must be computed and so improves efficiency. The experimental results show that molecular data can be located quickly and that the average retrieval time is reduced. The model also performs steadily and scales well.

In the second part, 1555 molecules, including drugs and non-drugs, are collected, and the molecules taken from the database are further organized. The molecular descriptors are analyzed first. Preprocessing of the molecular information then ensures that only valuable, non-redundant features are retained. Drug candidates are recognized with a deep belief network model built on the molecular descriptors. The experimental results show that the method extracts deeper feature vectors that are well suited to the drug candidate identification task. Recognition accuracy reaches 85.3%, which is higher than that of traditional methods such as support vector machines and artificial neural networks.
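As a rough illustration of the refractive-index retrieval described in the first part, the sketch below discretizes a continuous value into equal-width bins, uses the bin number as a hash key, and places each bin on a storage node through a consistent hash ring. It is only a minimal single-process sketch under stated assumptions, not the thesis implementation: the abstract does not give the bin width, value range, hash function, or node layout, so `equal_width_bin`, `ConsistentHashRing`, `BinnedIndex`, the range 1.0 to 2.0, the 100 bins, the MD5-based hash, and the node and molecule names are all hypothetical.

```python
import bisect
import hashlib


def equal_width_bin(value, low, high, n_bins):
    """Map a continuous value in [low, high] to a bin index in 0..n_bins-1."""
    if not low <= value <= high:
        raise ValueError("value outside [low, high]")
    width = (high - low) / n_bins
    # The upper boundary (value == high) is folded into the last bin.
    return min(int((value - low) / width), n_bins - 1)


def stable_hash(key):
    """Process-independent 64-bit hash (built-in hash() is salted per run)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class ConsistentHashRing:
    """Hash ring with virtual nodes; a key belongs to the next point clockwise."""

    def __init__(self, nodes, replicas=100):
        self._ring = sorted(
            (stable_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        # Wrap around to the first point when the key hashes past the last one.
        idx = bisect.bisect_right(self._points, stable_hash(key)) % len(self._ring)
        return self._ring[idx][1]


class BinnedIndex:
    """Toy index: bin number -> molecules, sharded across nodes by the ring."""

    def __init__(self, nodes, low=1.0, high=2.0, n_bins=100):  # assumed range
        self.low, self.high, self.n_bins = low, high, n_bins
        self.ring = ConsistentHashRing(nodes)
        self.shards = {node: {} for node in nodes}  # stands in for remote stores

    def _locate(self, refractive_index):
        b = equal_width_bin(refractive_index, self.low, self.high, self.n_bins)
        return b, self.ring.node_for(f"bin-{b}")

    def add(self, molecule_id, refractive_index):
        b, node = self._locate(refractive_index)
        self.shards[node].setdefault(b, []).append(molecule_id)

    def query(self, refractive_index):
        """Return candidate molecules stored in the same bin as the query value."""
        b, node = self._locate(refractive_index)
        return list(self.shards[node].get(b, []))


index = BinnedIndex(["node-a", "node-b", "node-c"])
index.add("mol-1", 1.333)
index.add("mol-2", 1.336)
index.add("mol-3", 1.501)
candidates = index.query(1.334)  # looks only at one bin on one node
```

A query touches a single bin on a single node instead of scanning every molecule, which is the sense in which binning plus hashing reduces the amount of data examined. With consistent hashing, adding or removing a node remaps only the bins that node owns, so the other shards stay in place.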

Keywords/Search Tags:

Molecular retrieval, Properties pre-screening, Big data, Deep Belief Network, Feature extraction

Related items

1	Active learning for sequential screening and classification of molecular and genomic data
2	Research On Feature Extraction And Key Frame Retrieval Of Video GIS Data
3	Research On Image-text Retrieval Method Based On Deep Learning
4	Research On Video Retrieval Algorithm Based On Dual Network Model
5	Research On Image Retrieval And Image Semantic Feature Based On Deep Learning
6	Design Of Algorithms And Programs For Drug Discovery And Drug Targeted Virtual Screening
7	Research On Image Retrieval Based On Deep Convolutional Neural Network
8	Research On Resampling Methods Of Imbalanced Data Based On Data Screening
9	APP Similar Icon Retrieval System Based On Deep Learning
10	The Research Of Semantic Image Retrieval Based On The Deep Convolution Neural Network