Data Mining Method Research For Gene Screening Of Breast Cancer And Drug Repositioning

Posted on: 2017-02-12  Degree: Master  Type: Thesis
Country: China  Candidate: X Tian  Full Text: PDF
GTID: 2284330485970868  Subject: Biochemistry and Molecular Biology
Abstract/Summary:
Gene screening and drug development for breast cancer have long been important fields of medical research. Selecting disease-related genes and predicting new indications for existing drugs are scientifically significant for the treatment of this disease. However, it is difficult to mine and exploit the feature information that links drugs and diseases. As data mining technology develops, feature integration and modelling algorithms offer possible new solutions. This thesis therefore applies data mining feature selection algorithms and classification methods to the study of breast cancer gene screening and drug repositioning. The main contents of the thesis are as follows:

1. We propose a new algorithm, called PPIRF, which uses gene features and protein-protein interaction information to identify important genes highly related to breast cancer metastasis. The method not only incorporates the variable importance of gene expression into the classifiers, but also makes use of biological prior knowledge. The new gene selection algorithm is shown to achieve better performance and to select gene sets that are easier to interpret.

2. We introduce a Ranking-based KNN approach, combined with several kinds of drug profiles, to predict new indications for drugs. The method not only combines four types of information (chemical structural similarity, sequence similarity, side-effect similarity and topology similarity), but also uses the Ranking-SVM algorithm to find the most trusted neighbours, from which the new indications of test drugs are predicted. This method can help find new applications for existing drugs, for example in the treatment of breast cancer.

3. We develop a visualization tool, called DREP, for drug repositioning prediction. The tool includes two drug repositioning methods: the first is based on the Ranking-based KNN algorithm; the second is based on logistic regression classification. The DREP dataset contains 1387 drugs, 1514 diseases and 3375 known drug-disease relationships obtained from the KEGG database. The design objective is to use known drug features, disease features and known drug-disease interactions to predict unknown relationships. The DREP interface is friendly and easy to operate, which makes the tool convenient for biological researchers.
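
The neighbour-based prediction described in item 2 can be illustrated with a small sketch. This is not the thesis implementation: the abstract does not give the details of the Ranking-based KNN, so the sketch replaces the Ranking-SVM neighbour ranking with a plain unweighted average of the four similarity types, and all function names, drug names, disease names and scores below are hypothetical toy values.

```python
from typing import Dict, List, Set, Tuple


def fuse_similarities(sims: List[Dict[str, float]]) -> Dict[str, float]:
    """Average several similarity profiles (known drug -> similarity to the test drug).

    Assumption: an unweighted mean stands in for the Ranking-SVM step,
    which would instead learn how to rank the neighbours.
    """
    drugs = set().union(*sims)
    return {d: sum(s.get(d, 0.0) for s in sims) / len(sims) for d in drugs}


def predict_indications(
    fused: Dict[str, float], known: Dict[str, Set[str]], k: int = 2
) -> List[Tuple[str, float]]:
    """Score diseases by summing the fused similarity of the k nearest drugs."""
    # Nearest neighbours = highest fused similarity; ties broken by name.
    neighbours = sorted(fused, key=lambda d: (-fused[d], d))[:k]
    scores: Dict[str, float] = {}
    for drug in neighbours:
        for disease in known.get(drug, set()):
            scores[disease] = scores.get(disease, 0.0) + fused[drug]
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical similarities between one test drug and three known drugs.
chemical = {"drug_A": 1.0, "drug_B": 0.25, "drug_C": 0.5}
sequence = {"drug_A": 0.5, "drug_B": 0.0, "drug_C": 0.5}
side_effect = {"drug_A": 0.75, "drug_B": 0.25, "drug_C": 0.5}
topology = {"drug_A": 0.75, "drug_B": 0.0, "drug_C": 0.5}

# Hypothetical known drug-disease relationships.
known = {
    "drug_A": {"disease_X"},
    "drug_B": {"disease_Y"},
    "drug_C": {"disease_X", "disease_Z"},
}

fused = fuse_similarities([chemical, sequence, side_effect, topology])
ranked = predict_indications(fused, known, k=2)
print(ranked)  # [('disease_X', 1.25), ('disease_Z', 0.5)]
```

In this toy run the two nearest neighbours are drug_A and drug_C, so disease_X, which both of them treat, is ranked first as the candidate new indication. A learned ranking model would change which neighbours are trusted, but the voting step would look similar.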
Keywords/Search Tags:data mining, feature selection, GeneRank, random forest, drug reposition, DREP tool