
Automatic recognition of harmonic bird sounds

Posted on: 2006-10-27
Degree: Ph.D.
Type: Dissertation
University: State University of New York at Stony Brook
Candidate: Heller, Jason Robert
Full Text: PDF
GTID: 1458390008967932
Subject: Mathematics
Abstract/Summary:
This dissertation analyzes the vocalizations of several common bird species: herring gull, blue jay, American crow, and Canada goose. It first examines spectrograms of these vocalizations qualitatively and then analyzes the vocalizations quantitatively using frequency track analysis. A frequency track is the path traveled by a peak in the DFT spectrum of a segment of a sound file as the segment is shifted forward in time. There are two parts to any pattern recognition system: (1) producing statistical models for each pattern to be recognized; and (2) using the models to find the patterns in real test data. The training procedure for this research requires hand selection of the frequency tracks corresponding to each vocalization in the training data set, using the frequency track analysis capabilities of Gtkvis. Once the frequency track files for each vocalization instance have been formed, a statistical model of the vocalization is created using the MakeModel() function. The recognition algorithm extracts sets of frequency tracks that closely approximate harmonic sounds in the sound file being processed. The set extraction function GetSets() first calls a preliminary set-extracting function, FindFeasibleSets(), and then a function that refines the "feasible sets", FindMaximalSubsets(). Each extracted set in its final form is then compared with the statistical models generated during the training phase. If it closely matches one of the models, the recognizer declares the set to be an occurrence of the corresponding vocalization.

The final result is a hardware and software implementation of a complete sound recognition system based on a methodology easily adapted to a wide class of vocalizations. One set of hardware consists of a handheld digital recorder, microphone, and pre-amp. The other set of hardware (in development) consists of an array of microphones coupled to a steerable parabolic dish microphone. The software consists of a sound visualization and processing application called Gtkvis and some command line tools for training and recognition that could easily be integrated into the Gtkvis GUI.
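
The definition of a frequency track given in the abstract can be illustrated with a minimal sketch. This is not the Gtkvis implementation: the names dft_magnitudes and frequency_track, the window and hop sizes, and the synthetic test tone are all assumptions made for illustration, and the sketch follows only the single strongest peak, whereas a harmonic vocalization would produce several simultaneous tracks.

```python
import cmath
import math


def dft_magnitudes(segment):
    """Magnitudes of DFT bins 0..N/2 of a real-valued segment (naive O(N^2) DFT)."""
    n = len(segment)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(segment))
        mags.append(abs(acc))
    return mags


def frequency_track(samples, sample_rate, window=64, hop=16):
    """Follow the strongest DFT peak as the analysis segment slides forward.

    Returns a list of (time_in_seconds, frequency_in_hz) points, one per
    segment position. Hypothetical helper, not a Gtkvis function.
    """
    track = []
    for start in range(0, len(samples) - window + 1, hop):
        mags = dft_magnitudes(samples[start:start + window])
        # Skip bin 0 (DC) and pick the bin with the largest magnitude.
        peak_bin = max(range(1, len(mags)), key=mags.__getitem__)
        track.append((start / sample_rate, peak_bin * sample_rate / window))
    return track


def tone(freq, n, sample_rate):
    """n samples of a unit-amplitude sine wave at freq Hz."""
    return [math.sin(2 * math.pi * freq * t / sample_rate) for t in range(n)]


# Synthetic signal (an assumption for the demo): 1000 Hz, then 2000 Hz.
sample_rate = 8000
signal = tone(1000, 256, sample_rate) + tone(2000, 256, sample_rate)
track = frequency_track(signal, sample_rate)

for time_s, freq_hz in track[::7]:
    print(f"{time_s:.3f} s  {freq_hz:.0f} Hz")
```

With a 64-sample window at 8000 Hz the bin spacing is 125 Hz, so both test tones fall exactly on a bin. Real recordings would call for a tapered window, peak interpolation, and tracking of several peaks per segment before the tracks could be grouped into harmonic sets.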
Keywords/Search Tags: Recognition, Sound, Training, Frequency track, Vocalizations