
Classifying influenza subtypes and hosts using machine learning techniques

Posted on: 2011-08-15
Degree: M.S.
Type: Thesis
University: University of Nebraska at Omaha
Candidate: Attaluri, Pavan Kumar
GTID: 2448390002966814
Subject: Biology
Abstract/Summary:
Recent advances in machine learning have spread into a wide variety of fields. A great deal of research has gone into improving individual machine learning methods. However, few studies have examined how to use these techniques together in a systematic way. In this research, we explored a methodology for the integrated use of several machine learning techniques for influenza analysis.

Influenza is one of the most important emerging and reemerging infectious diseases, causing high morbidity and mortality in communities worldwide. Classification and prediction analysis helps to better understand the evolution of the influenza virus and to develop tools for detecting new viral strains. The main objective of this research is to classify influenza A virus sequences by host of origin and subtype, for which decision tree analysis, support vector machines (SVM), and artificial neural networks were applied. A Web-based tool was developed using hidden Markov models (HMM) for accurate prediction of host origin and subtype.

With decision tree analysis, classification accuracy ranged from 93% to 97%. Informative positions were extracted from the decision trees and modeled into profiles through hidden Markov modeling. These profiles are used in the Web prediction system, which achieved 88% accuracy for host and subtype prediction. With support vector machine analysis, classification accuracy ranged from 96% to 98%. With neural networks, it ranged from 88% to 94%. Mutation positions were found by studying, at the protein level, the informative positions determined by the decision tree method, and were stored in a database.

This project paves the way for further experiments to examine the informative positions at the protein level, to extend the current functionality to classify more subtypes and host origins, and to investigate other advanced machine learning algorithms. Developing a Web tool for the prediction of all influenza A hosts and subtypes is significant for the development of a computational system for influenza detection and surveillance.
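
The idea of an "informative position" can be illustrated with a minimal sketch. It assumes that a position's usefulness is scored by information gain, a common decision tree split criterion; the abstract does not say which criterion or software the thesis used. The sequences, labels, and function names below are hypothetical toy values for illustration, not data or code from the thesis.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(sequences, labels, position):
    """Reduction in label entropy after splitting on the residue at `position`."""
    groups = {}
    for seq, label in zip(sequences, labels):
        groups.setdefault(seq[position], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_positions(sequences, labels):
    """Return alignment positions ordered from most to least informative."""
    length = len(sequences[0])
    gains = [(information_gain(sequences, labels, p), p) for p in range(length)]
    # Sort by descending gain; ties are broken by position index.
    return [p for _, p in sorted(gains, key=lambda t: (-t[0], t[1]))]


# Hypothetical aligned fragments with made-up host labels.
toy_sequences = ["MKAIL", "MKAVL", "MRAIL", "MRAVL"]
toy_labels = ["human", "human", "avian", "avian"]

ranked = rank_positions(toy_sequences, toy_labels)
print("Most informative position (0-based):", ranked[0])
```

In this toy alignment only the second residue (K versus R) separates the two hosts, so it ranks first. A decision tree applies the same kind of scoring recursively on real alignments, and the positions it selects are what the abstract calls informative positions.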
Keywords/Search Tags: Machine learning, Influenza, Host, Subtypes