Detection of irregular phonation in speech

Posted on: 2008-03-30
Degree: M.S.
Type: Thesis
University: University of Maryland, College Park
Candidate: Vishnubhotla, Srikanth
Full Text: PDF
GTID: 2448390005475026
Subject: Engineering
Abstract/Summary:
This work addresses the detection and characterization of irregular phonation in spontaneous speech. Published work treats this as a two-hypothesis problem restricted to regions of speech where phonation occurs; this work also aims to distinguish aperiodicity due to frication from aperiodicity due to irregular voicing. In addition, it corrects an existing pitch tracking algorithm in regions of irregular phonation, where, as the literature shows, most pitch trackers perform poorly. Building on the detection of such regions, an acoustic parameter is then developed to characterize them for speaker identification applications. The algorithm builds upon the Aperiodicity, Periodicity and Pitch (APP) detector, a system designed to measure the amount of aperiodic and periodic energy in a speech signal on a frame-by-frame basis.
The detection performance of the algorithm was tested on a clean speech corpus, the TIMIT database, and on a telephone speech corpus, the NIST 98 database, in both of which regions of irregular phonation were labeled by hand. The detection rate is 91.8% for the TIMIT database, with 17.42% false detections, and 89.2% for the NIST 98 database, with 12.8% false detections. The corresponding pitch detection accuracy, measured on a frame basis against reference pitch from the ESPS pitch tracker, increased from 95.4% to 98.3% for the TIMIT database and from 94.8% to 97.4% for the NIST 98 database. When the creakiness parameter was added to a set of seven acoustic parameters for speaker identification on the NIST 98 database, performance improved by 1.5% for female speakers and 0.4% for male speakers for a population of 250 speakers.
These results lead to the conclusion that the creakiness parameter is useful in speech technology applications such as speaker identification. This work also has potential applications in the non-intrusive diagnosis of pathological voices.
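The frame-level figures quoted above compare detector output with hand labels. As a minimal sketch of how such scores could be computed, the Python function below takes two per-frame boolean label sequences and returns a detection rate and a false-detection percentage. The function name, the input format, and in particular the definition of the false-detection percentage (the share of detected frames that the hand labels mark as not irregular) are assumptions made for illustration; the abstract does not state how the thesis defines these quantities.

```python
from typing import Sequence, Tuple


def frame_detection_scores(
    reference: Sequence[bool], detected: Sequence[bool]
) -> Tuple[float, float]:
    """Return (detection_rate, false_detection_rate), both in percent.

    reference: hand labels, True where a frame is irregular phonation.
    detected:  detector output, True where a frame was flagged.

    Assumed definitions (hypothetical, not taken from the thesis):
      detection_rate       = flagged irregular frames / all irregular frames
      false_detection_rate = flagged non-irregular frames / all flagged frames
    """
    if len(reference) != len(detected):
        raise ValueError("label sequences must have the same number of frames")

    # Frames that are both hand-labeled irregular and flagged by the detector.
    hits = sum(1 for r, d in zip(reference, detected) if r and d)
    # Frames flagged by the detector but not hand-labeled irregular.
    false_alarms = sum(1 for r, d in zip(reference, detected) if d and not r)

    n_irregular = sum(1 for r in reference if r)
    n_flagged = hits + false_alarms

    detection_rate = 100.0 * hits / n_irregular if n_irregular else 0.0
    false_detection_rate = 100.0 * false_alarms / n_flagged if n_flagged else 0.0
    return detection_rate, false_detection_rate


if __name__ == "__main__":
    # Toy labels for eight frames, invented purely for illustration.
    ref = [True, True, True, True, False, False, False, False]
    det = [True, True, True, False, True, False, False, False]
    print(frame_detection_scores(ref, det))
```

Other conventions are equally plausible, for example normalizing false detections by the number of non-irregular frames, so reported percentages are only comparable once the definition is fixed.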
Keywords/Search Tags:Speech, Irregular phonation, Detection, Work, TIMIT database, NIST