Robust segmentation and retrieval of environmental sounds

Posted on:2011-05-07

Degree:Ph.D

Type:Dissertation

University:Arizona State University

Candidate:Wichern, Gordon

Full Text:PDF

GTID:1448390002462576

Subject:Engineering

Abstract/Summary:

The proliferation of mobile computing has provided much of the world with the ability to record any sound of interest, or possibly every sound heard in a lifetime. The technology to continuously record the auditory world has applications in surveillance, biological monitoring of non-human animal sounds, and urban planning. Unfortunately, the ability to record anything has led to an audio data deluge, where there are more recordings than time to listen. Thus, access to these archives depends on efficient techniques for segmentation (determining where sound events begin and end), indexing (storing sufficient information with each event to distinguish it from other events), and retrieval (searching for and finding desired events). While many such techniques have been developed for speech and music sounds, the environmental and natural sounds that compose the majority of our aural world are often overlooked.

The process of analyzing audio signals typically begins with the process of acoustic feature extraction where a frame of raw audio (e.g., 50 milliseconds) is converted into a feature vector summarizing the audio content. In this dissertation, a dynamic Bayesian network (DBN) is used to monitor changes in acoustic features in order to determine the segmentation of continuously recorded audio signals. Experiments demonstrate effective segmentation performance on test sets of environmental sounds recorded in both indoor and outdoor environments.

Once segmented, every sound event is indexed with a probabilistic model, summarizing the evolution of acoustic features over the course of the event. Indexed sound events are then retrieved from the database using different query modalities. Two important query types are sound queries (query-by-example) and semantic queries (query-by-text). By treating each sound event and semantic concept in the database as a node in an undirected graph, a hybrid (content/semantic) network structure is developed. This hybrid network can retrieve audio information from sound or text queries, and can automatically annotate an unlabeled sound event with an appropriate semantic description. Successful retrieval of environmental sounds from the hybrid network using several test databases is demonstrated and quantified in terms of standard information retrieval metrics.
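
The frame-level feature extraction step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which acoustic features the dissertation uses, so the two features here (RMS energy and zero-crossing rate), the function name `frame_features`, the non-overlapping framing, and the synthetic test signal are all assumptions chosen for illustration. Only the 50 millisecond frame length comes from the text.

```python
import math

def frame_features(samples, sample_rate, frame_ms=50):
    """Split samples into non-overlapping frames and return one
    [rms, zero_crossing_rate] vector per complete frame.

    Hypothetical helper: the feature choice is an assumption,
    not the dissertation's feature set.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    features = []
    # Step through the signal one full frame at a time; a trailing
    # partial frame is dropped.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Root-mean-square energy of the frame.
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        # Fraction of consecutive sample pairs whose sign differs.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
        )
        zcr = crossings / (frame_len - 1) if frame_len > 1 else 0.0
        features.append([rms, zcr])
    return features

# Synthetic signal (assumed for illustration): one second of silence
# followed by one second of a 440 Hz tone, sampled at 8 kHz.
sample_rate = 8000
silence = [0.0] * sample_rate
tone = [math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(sample_rate)]
features = frame_features(silence + tone, sample_rate)
```

In this toy signal the feature vectors change abruptly at the frame where the tone starts. A change of that kind is what a segmentation model would monitor in order to place an event boundary, although the sketch itself performs no segmentation.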

Keywords/Search Tags:

Sound, Retrieval, Environmental, Segmentation, Network

Related items

1	Research On Environmental Sound Recognition Method Based On Deep Learning
2	Research On Complex Environmental Sound Recognition Based On Deep Learning
3	Application Research Of Environmental Sound Classification And Voiceprint Identification Based On Deep Learning
4	Research On Environmental Sound Recognition Algorithm Based On Neural Network
5	Environmental Sound Recognition Based On Deep Learning
6	Research On Environment Sound Event Localization System And Algorithm Based On WASN
7	Research On Environmental Sound Classification Method Based On Deep Learning
8	A Method Of Environmental Sound Classification Based On Residual Networks And Data Augmentation
9	Research On Environmental Sound Recognition Technology Based On Feature Fusion And Soft Attention Mechanism
10	Emotional Analysis System Based On HRV And DSV Of Heart Sound Signals