Font Size: a A A

Intonation recognition models based on convex optimizations

Posted on:2007-08-30Degree:Ph.DType:Thesis
University:Stanford UniversityCandidate:Maghbouleh, ArmanFull Text:PDF
GTID:2458390005981221Subject:Language
Abstract/Summary:
This thesis presents a set of computational models for automatic recognition of ToBI intonation labels. We used the Boston University Radio News Corpus to train and test models for predicting vowel durations, recognizing prominent syllables (*-labeled ToBI accents), isolating prosodic fundamental frequency effects, and for distinguishing ToBI accent types. The developed models rival human performance in distinguishing H*, L* and !H* ToBI accents. In addition, we show that neither our models nor expert human labelers could consistently distinguish bitonal accents, such as L+H*, from their single-toned counter parts, such as H*.; The models presented here embody a new approach to model selection in computational linguistics. Traditional methods use the number of predictors in the model as a measure of model simplicity (e.g., "find the model with the highest accuracy using combinations of eight predictors"). Here we use alternative measures of model simplicity that result in convex optimization problems (e.g., "find the highest accuracy linear regression model with limited sum of squares of coefficients"). The complexity of traditional combinatorial formulations grows exponentially with regard to the number of predictors, and hence optimal solutions are possible only in very special cases. The complexity of the convex formulations used here grows cubically, at worse, and all of the problems approached here can be practically solved with today's computational resources.; The traditional approaches are also theoretically problematic because they force distinctions between predictors when in reality there may be no evidence or no requirement for choosing one predictor over another. For example, it may not be possible to choose between two highly correlated predictors such as power and duration in recognition of prominence. Instead of forcing a binary decision, our models allow for the quantification of the relevance of each predictor. In the case of Ridge and Lasso regression models presented here, the quantification has attractive probabilistic interpretations in terms of prior beliefs.; The regression methods presented here are directly applicable to modeling language variation where regression models are currently used. The approach of using convex optimizations is more generally applicable to areas, such as syntactic parsing, where combinatorial search methods are prevalent.
Keywords/Search Tags:Models, Convex, Recognition, Tobi
Related items