A study on the use of conditional random fields for automatic speech recognition

Posted on:2011-03-18

Degree:Ph.D.

Type:Dissertation

University:The Ohio State University

Candidate:Morris, Jeremy J

GTID:1468390011472073

Subject:Computer Science

Abstract/Summary:

Current state-of-the-art systems for Automatic Speech Recognition (ASR) use statistical modeling techniques such as Hidden Markov Models (HMMs) and Gaussian Mixture Models (GMMs) to recognize spoken language. These techniques make use of statistics derived from the acoustic frequencies of the speech signal. In recent years, interest has been rising in the use of phonological features derived from these acoustic frequency features in addition to, or in place of, the acoustic frequency features themselves. These phonological features are derived from the manner in which speech is physically produced in the vocal tract of the speaker, rather than from models of how speech is heard by the listener.

Integrating phonological features into ASR models presents new challenges. The mathematical assumptions made to build current models may work well for features derived from acoustic frequencies, but do not necessarily fit phonological features as well. How to alter the mathematical models to allow for this new type of input feature is an ongoing area of ASR research. This dissertation examines the use of the statistical model known as a Conditional Random Field (CRF) for ASR using phonological features. CRFs are statistical models of sequences that are similar to HMMs, but CRF models do not make any assumptions about the independence or interdependence of the data being modeled.

This dissertation provides (1) a CRF-based pilot system that achieves superior performance on a phonetic recognition task compared to a comparably configured HMM model, and does so with many fewer parameters, (2) an extension of this model to create new features for an HMM-based system for word recognition, and (3) a fully developed system for word recognition using CRFs.

Keywords/Search Tags:

Recognition, Speech, Features, Models, ASR, System

Related items

1	Speech Emotional Recognition Research Based On Features Extraction And Multi-modal Combination
2	Projection On Speech Features Space Improves The Performance Of Speaker Identification
3	Research On Speech Phoneme Recognition Based On Deep Learning
4	The Research Of Dimensional Speech Emotion Recognition Based On Neural Network And Fusion Features
5	Research On The Acoustic Models And Implemention On The Keyword Recognition System
6	Research For Algorithm Of Speech Recognition Based On WD/HMM
7	Research On Lightweight Speech Recognition Technology In Noise Environment
8	Research On The Performance Of Speech Features In Gender-based Speaker Recognition
9	Research On Emotion Recognition From Speech-Features And Models
10	Exploring deep learning methods for discovering features in speech signals