The automatic prediction of prosodic prominence from text

Posted on: 2009-06-09
Degree: Ph.D.
Type: Thesis
University: University of Colorado at Boulder
Candidate: Brenier, Jason M
Full Text: PDF
GTID: 2448390002493621
Subject: Language
Abstract/Summary:
Speakers of English can express critical discourse-pragmatic information in an utterance by marking certain words with intonational prominence. These pitch-accented words are realized acoustically with exaggerated fundamental frequency extrema and increased duration and intensity. In this dissertation, I present research on the automatic identification of prominent words using only information that can be automatically extracted from written texts or the transcripts of utterances.

Current computational models that classify prosodic prominence focus primarily on the automatic identification of pitch accents using a combination of acoustic and text features. Although some of these models have achieved reasonable performance, several challenges remain. First, most models require direct access to the speech signal and are therefore not applicable to text-based speech processing systems such as text-to-speech synthesizers. Second, the majority of pitch accent classifiers that do use text input are trained on automatically extractable features that capture only low-level linguistic knowledge. These classifiers ignore abstract linguistic features that are directly linked to the discourse meaning conveyed by prosodic prominence. Third, most existing prominence classification systems are designed for a single speech genre and are not robust to variation in input data. Last, prominence classifiers do not capture relative differences in pitch accent prominence and cannot adequately model discourse functions expressed through variation in accent strength.

In this research, I address these issues by developing a series of automatic prominence prediction models that are robust to variable input across genres, sensitive to abstract linguistic features, and optimized for making fine-grained distinctions in prominence without the need for acoustic input.
I first show that prominence variation across genres can be attributed to a combination of factors including lexical choice, pitch accent ratio, word predictability, and intonational phrasing. I then identify accent ratio as the single most effective feature in a pitch accent predictor for spontaneous conversational speech and show that the performance of this predictor increases with the addition of a rich set of features that capture semantic contrast, information status, conversational topicality, and phonological form. Last, I describe an automatic prominence prediction model optimized for text-to-speech synthesis that is capable of determining the location and relative prominence level of pitch accents using only a set of linguistic text features. The model outperforms a baseline model of orthographic features by more than 40%. This research identifies important linguistic features that can be used to improve computational models of prosodic prominence and contributes to our understanding of how speakers assign prominence in speech.
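
The abstract names accent ratio without defining it. As a rough illustration only, the sketch below assumes accent ratio is the fraction of a word's occurrences in a prosodically labeled corpus that carry a pitch accent, and that a word is predicted as accented when its ratio exceeds a threshold. The function names, the `(word, is_accented)` input format, the 0.5 threshold, and the default for unseen words are all hypothetical choices for this example, not the dissertation's actual formulation.

```python
from collections import Counter


def accent_ratios(tokens):
    """Estimate a per-word accent ratio from labeled tokens.

    tokens: iterable of (word, is_accented) pairs, where is_accented is a
    bool taken from a prosodically annotated corpus (assumed input format).
    Returns a dict mapping each lowercased word to the fraction of its
    occurrences that were accented.
    """
    totals = Counter()
    accented = Counter()
    for word, is_accented in tokens:
        key = word.lower()
        totals[key] += 1
        if is_accented:
            accented[key] += 1
    return {word: accented[word] / count for word, count in totals.items()}


def predict_accent(word, ratios, threshold=0.5, default=0.5):
    """Predict a pitch accent when the word's ratio exceeds the threshold.

    Unseen words fall back to `default` (an assumption of this sketch),
    so with the default settings they are predicted as unaccented.
    """
    return ratios.get(word.lower(), default) > threshold
```

With a toy corpus such as `[("the", False), ("the", False), ("the", True), ("really", True)]`, the word "the" gets a ratio of one third and is predicted unaccented, while "really" gets a ratio of 1.0 and is predicted accented. A predictor of this kind uses only text-side lookups, which is consistent with the abstract's goal of working without acoustic input.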
Keywords/Search Tags:Prominence, Automatic, Models, Speech, Prediction, Text, Pitch accent, Features