Feature extraction for audio classification

Posted on:2004-08-04

Degree:M.A.Sc

Type:Thesis

University:Carleton University (Canada)

Candidate:Abu-El-Quran, Ahmad Rami

GTID:2468390011460539

Subject:Engineering

Abstract/Summary:

This thesis proposes a new algorithm to discriminate between speech and non-speech audio segments. It is intended for security applications as well as for talker location identification in audio conferencing systems equipped with microphone arrays.

The proposed method splits the audio segment into small frames and detects the presence of pitch in each of them. The ratio of the number of frames in which pitch is detected to the total number of frames is defined as the pitch ratio and is used as the main feature for classifying segments as speech or non-speech. The performance of the proposed method is evaluated using a library of audio segments containing female and male speech, and non-speech segments such as cocktail noise, footsteps, and traffic noise.

Two major contributions are proposed for speech/non-speech discrimination. The pitch ratio algorithm is proposed to give a robust decision for each class, and limiting the pitch search to the human pitch range is proposed to enhance its performance. In addition, non-speech audio type classification using neural networks is proposed.

It is shown that, for segments 0.5 seconds long, the proposed algorithm achieves a correct decision rate of 97% for speech and 98% for non-speech.
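
The pitch ratio feature described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how pitch is detected in each frame, so the normalized autocorrelation test used here is an assumption, not the thesis's method. The frame length (30 ms), the pitch search range (60 to 400 Hz), the peak threshold (0.5), and the decision threshold (0.3) are also illustrative values, and the function names are hypothetical.

```python
import math
import random


def frame_has_pitch(frame, sample_rate, f_min=60.0, f_max=400.0, peak_threshold=0.5):
    """Assumed pitch test: look for a strong normalized autocorrelation
    peak at lags that correspond to the range [f_min, f_max] in Hz."""
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]
    energy = sum(s * s for s in x)
    if energy == 0.0:
        return False  # silent or constant frame: no pitch
    # Restrict the search to lags inside the assumed human pitch range.
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(n - 1, int(sample_rate / f_min))
    best = 0.0
    for lag in range(lag_min, lag_max + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        best = max(best, r / energy)
    return best >= peak_threshold


def pitch_ratio(samples, sample_rate, frame_ms=30.0):
    """Fraction of non-overlapping frames in which pitch is detected."""
    frame_len = int(sample_rate * frame_ms / 1000.0)
    starts = range(0, len(samples) - frame_len + 1, frame_len)
    frames = [samples[s:s + frame_len] for s in starts]
    if not frames:
        return 0.0
    pitched = sum(1 for f in frames if frame_has_pitch(f, sample_rate))
    return pitched / len(frames)


def classify_segment(samples, sample_rate, decision_threshold=0.3):
    """Label a segment by thresholding its pitch ratio (threshold is assumed)."""
    if pitch_ratio(samples, sample_rate) >= decision_threshold:
        return "speech"
    return "non-speech"


if __name__ == "__main__":
    rate = 8000
    n = rate // 2  # a 0.5-second segment
    # Synthetic stand-ins: a 150 Hz tone (pitched) and uniform noise (unpitched).
    tone = [math.sin(2 * math.pi * 150 * i / rate) for i in range(n)]
    rng = random.Random(0)
    noise = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    print("tone:", pitch_ratio(tone, rate), classify_segment(tone, rate))
    print("noise:", pitch_ratio(noise, rate), classify_segment(noise, rate))
```

The synthetic tone and noise only exercise the mechanics. A steady tone would be labeled as speech by this simple threshold, so a real system would need recorded speech and non-speech material, such as the library described above, to choose thresholds and to measure accuracy.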

Keywords/Search Tags:

Audio, Non-speech, Segments, Algorithm

Related items

1	Research On Special Speech Retrieval Technology Through Short-Speech Segments
2	Based On An Audio Match Of The Smart Broadcast Advertisements
3	Research On Unified Speech And Audio Coding Algorithm
4	Key Technology Research On Audio Information Hiding And Information Security Application For Speech Recognition
5	Researching An Algorithm On ESP Problem Of Visiting Disjoint Segments In The Plane
6	Digital Speech Rhythm Analysis Based On Segments Of Speech
7	Research On Automatic Speech-Text Alignment For Mongolian Long Audio
8	Robust and efficient techniques for audio-visual speech recognition
9	Research And Implementation Of Audio Quality Evaluation And Speech Recognition Preprocessing Technology
10	Research On Two Typical Speech Processing Applications Based On Deep Learning