Gene Name Recognition Feature Selection Methods In Biomedical Research Text

Posted on: 2015-02-11  Degree: Master  Type: Thesis
Country: China  Candidate: R J Zhang  Full Text: PDF
GTID: 2268330431451265  Subject: Computer software and theory
Abstract/Summary:
Biomedical literature contains scientific knowledge that is continually being updated. Valuable information, such as gene-disease or gene-protein relations, can be extracted from it. Identifying and classifying gene names is therefore a prerequisite for the automatic extraction of such relations.

Machine learning methods, especially feature-based models, are attractive for gene name recognition. These methods assume that the features chosen to represent the task are useful, but they ignore the associations between features. Most of our effort went into addressing this problem, and our work is as follows.

Multiple types of features are combined within a Global Linear framework. The features are incorporated into a perceptron model to identify gene names, and the feature set with the strongest ability to recognize gene names is identified. In this feature set, label and trigram features form the main part, while orthographic features and character types are auxiliary. Label features and orthographic features are then selected from the feature set. A decision tree model is applied to gene name recognition with the same feature set. On the same data set, the results of the two models are compared, and the decision tree model achieves higher precision. The decision tree is then pruned, and its results are compared with those of the unpruned tree. After pruning, the precision of gene name identification improves further to some extent.
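The feature types named in the abstract (labels, trigrams, orthographic features, character types) can be illustrated with a small sketch. This is not the thesis's implementation. The function names, the feature strings, the `<s>`/`</s>` padding symbols, and the reading of "trigram" as a window of three adjacent tokens are all assumptions made for illustration. In a perceptron tagger, each feature would typically also be paired with the candidate label for the current token before being weighted.

```python
def char_type_pattern(token):
    """Collapse a token into a character-type shape, e.g. 'BRCA1' -> 'A0'.

    A = uppercase letter, a = lowercase letter, 0 = digit, - = anything else.
    Runs of the same class are merged into one symbol.
    """
    classes = []
    for ch in token:
        if ch.isupper():
            c = "A"
        elif ch.islower():
            c = "a"
        elif ch.isdigit():
            c = "0"
        else:
            c = "-"
        if not classes or classes[-1] != c:
            classes.append(c)
    return "".join(classes)


def orthographic_features(token):
    """Boolean surface cues that often distinguish gene names (illustrative set)."""
    return {
        "has_digit": any(ch.isdigit() for ch in token),
        "has_hyphen": "-" in token,
        "all_caps": token.isupper(),
        "init_cap": token[:1].isupper(),
    }


def token_features(tokens, i, prev_labels):
    """Build a set of string features for tokens[i].

    prev_labels holds the labels already assigned to tokens[:i];
    missing history is padded with the hypothetical start symbol '<s>'.
    """
    tok = tokens[i]
    prev_tok = tokens[i - 1] if i > 0 else "<s>"
    next_tok = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
    padded = ["<s>", "<s>"] + list(prev_labels)
    t1, t2 = padded[-1], padded[-2]

    feats = {
        "w=" + tok,                                    # current word
        "tri=" + "|".join((prev_tok, tok, next_tok)),  # word trigram window
        "t-1=" + t1,                                   # previous label
        "t-2,t-1=" + t2 + "," + t1,                    # two previous labels
        "shape=" + char_type_pattern(tok),             # character types
    }
    # Keep only the orthographic cues that fire for this token.
    feats.update(name for name, on in orthographic_features(tok).items() if on)
    return feats


if __name__ == "__main__":
    sentence = ["The", "BRCA1", "gene", "is", "mutated"]
    for f in sorted(token_features(sentence, 1, ["O"])):
        print(f)
```

Representing features as strings keeps the same extractor usable for both models compared in the abstract: a perceptron can hash the strings into a weight table, while a decision tree can consume them after conversion to a binary indicator vector.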
Keywords/Search Tags: feature selection, gene name identification, decision tree