Nearest neighbour classification improves side effect machine performance for sequence categorization

Posted on:2011-07-22

Degree:M.Sc

Type:Thesis

University:University of Guelph (Canada)

Candidate:McEachern, Andrew

GTID:2448390002453289

Subject:Mathematics

Abstract/Summary:

The explosion of available sequence data necessitates the development of sophisticated machine learning tools with which to analyze them. This thesis presents improvements on a sequence-learning technology called side effect machines, as well as some mathematical theory about a lower bound on resources. It also applies a model of evolution, one that simulates the evolution of a ring species, to the training of the side effect machines. Side effect machines evolved in the ring structure are compared with side effect machines evolved using a standard evolutionary algorithm. The core of the improvement for the training of side effect machines is a nearest neighbour classifier. A parameter study was performed to investigate the impact of how the training data are divided into examples for nearest neighbour assessment and training cases. The parameter study demonstrates that parameter setting is important in the baseline runs, but the ring optimization runs showed strong robustness to parameter change. The ring optimization technique was also found to exhibit improved and more reliable training performance. Side effect machines are tested on three types of synthetic data: one based on GC-content, one that checks for the ability of side effect machines to recognize an embedded motif, and one created by self-driving Markov automata. Two types of biological data, a data set with different types of immune-system genes and a data set with normal and retro-virally derived human genomic sequence, are classified with excellent accuracies...
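The pairing of a side effect machine with a nearest neighbour classifier can be illustrated with a minimal sketch. It assumes the usual description of a side effect machine as a finite state machine whose "side effect" is a count of how often each state is visited while reading a sequence, with the normalised counts serving as a feature vector. The two-state transition table, the function names, and the toy GC-rich/AT-rich examples are all hypothetical; in the thesis the machines are evolved rather than written by hand, and its exact classifier settings are not given in this abstract.

```python
import math

ALPHABET = "ACGT"  # column order of each transition row

def sem_features(seq, transitions, n_states, start=0):
    """Drive the machine with seq and return normalised state-visit counts."""
    counts = [0] * n_states
    state = start
    for ch in seq:
        state = transitions[state][ALPHABET.index(ch)]
        counts[state] += 1
    total = len(seq) or 1  # avoid division by zero on an empty sequence
    return [c / total for c in counts]

def nearest_neighbour(query, examples):
    """Return the label of the example whose features are closest (Euclidean)."""
    return min(examples, key=lambda ex: math.dist(query, ex[0]))[1]

# Hypothetical hand-written machine: go to state 1 on C or G, state 0 on A or T.
TRANSITIONS = [
    [0, 1, 1, 0],  # from state 0 on A, C, G, T
    [0, 1, 1, 0],  # from state 1 on A, C, G, T
]

# Toy labelled examples held out for nearest neighbour assessment.
examples = [
    (sem_features("GCGCGGCC", TRANSITIONS, 2), "GC-rich"),
    (sem_features("ATATTAAT", TRANSITIONS, 2), "AT-rich"),
]

query = sem_features("GGCATGCC", TRANSITIONS, 2)
label = nearest_neighbour(query, examples)
print(query, label)
```

In this toy case the feature vector is simply the AT and GC fractions of the sequence, which is why it separates GC-content data; an evolved machine with more states could, in principle, capture less obvious structure such as an embedded motif.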

Keywords/Search Tags:

Side effect, Sequence, Data, Nearest

Related items

1	Research On Algorithms For Discovering And Querying Sequential Pattern In Uncertain Sequence Databases
2	The Research Of Field-sensitive Side Effect Analysis For Java Programs
3	Research On Transient Thermal Effect Of LD Side Pumped Solid-State Laser Medium
4	Study On Large Capacity And High Performance Organic Nanocrystal Field Effect Transistor Memory
5	Research On Generation Of Single Side Band Signal Based On Stimulated Brillouin Scattering Effect
6	Side Chain Engineering In Conjugated Polymer Semiconductors For Organic Field-Effect Transistors
7	The Study Of Halftone PSM Side Lobe Effect
8	Study On The Efficient Approximate Nearest Neighbor Search For Massive Data
9	Efficient computation of k-nearest neighbor graphs for large high-dimensional data sets on gpu clusters
10	Research On K-nearest Neighbor Query Technology On High-dimensional Data