Mining Linguistic Tone Patterns Using Fundamental Frequency Time-Series Data

Posted on:2018-10-05

Degree:Ph.D

Type:Dissertation

University:Georgetown University

Candidate:Zhang, Shuo

GTID:1478390020457729

Subject:Linguistics

Abstract/Summary:

With the rapid advancement of computing power, recent years have seen the availability of large-scale corpora of speech audio data and, within them, fundamental frequency (ƒ0) time-series data of speech prosody. However, this wealth of ƒ0 data has yet to be mined for knowledge that has many potential theoretical implications and practical applications in prosody-related tasks. Due to the nature of speech prosody data, Speech Prosody Mining (SPM) in a large prosody corpus faces classic time-series data mining challenges such as high dimensionality and high time complexity in distance computation (e.g., Dynamic Time Warping). Meanwhile, the analysis and understanding of speech prosody subsequence patterns demand novel analytical methods that leverage a variety of algorithms and data structures from the computational linguistics and computer science toolkits, prompting us to develop creative solutions for extracting meaning from large prosody databases.

In this dissertation, we conceptualize SPM in a time-series data mining framework by focusing on a specific task in speech prosody: the analysis and machine learning of Mandarin tones. The dissertation is divided into five parts, each further divided into several chapters. In Part I, we review the necessary background and previous work related to the production, perception, and modeling of Mandarin tones. In Part II, we report the data collection used in this work and describe the speech processing and data preprocessing steps in detail.

Parts III and IV comprise the core of the dissertation, where we develop novel methods for mining tone N-gram data. In Part III, we investigate the use of time-series symbolic representation for computing time-series similarity in the speech prosody domain. In Part IV, we first show how to improve a state-of-the-art motif discovery algorithm to produce more meaningful rankings in the retrieval of previously unknown tone N-gram patterns. In the next chapter, we investigate the most exciting problem at the heart of tone modeling: how well can we predict tone N-gram contour shape types in spontaneous speech using a variety of features from various linguistic domains, such as syntax, morphology, discourse, and phonology? The results shed light, from an information-theoretic perspective, on how these factors contribute to the realization of speech prosody in tone production.

In the final part, we describe applications of these methods, including generalization to other tone languages and the development of software for the retrieval and analysis of speech prosody. Finally, we discuss the extension of the current work to a general framework for corpus-based, large-scale intonation analysis.
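
The abstract cites Dynamic Time Warping (DTW) as an example of costly distance computation. As a minimal sketch of why, the textbook DTW recurrence below fills one cell for every pair of points in the two contours, so comparing sequences of lengths n and m takes O(n·m) time. This is the generic algorithm, not necessarily the variant used in the dissertation; the function name `dtw_distance` and the sample ƒ0 values (in Hz) are illustrative assumptions.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW distance between two numeric sequences.

    Uses absolute difference as the local cost and keeps only two rows
    of the cost matrix, so memory is O(m) while time stays O(n * m).
    """
    n, m = len(a), len(b)
    # prev[j] holds the best cumulative cost of aligning a[:i-1] with b[:j].
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # aligning two empty prefixes costs nothing
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


# Hypothetical f0 contours (Hz): the same rising shape at two durations.
short_rise = [180.0, 200.0, 220.0]
long_rise = [180.0, 200.0, 200.0, 220.0]
print(dtw_distance(short_rise, long_rise))
```

Because warping lets one sample align with several, the two contours above are treated as the same shape despite their different lengths. That tolerance is what makes DTW attractive for ƒ0 data, and the quadratic cost per pair is what makes it expensive across a large corpus.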

Keywords/Search Tags:

Speech, Tone, Time-series, Mining, Large, Data, Patterns, Dissertation

Related items

1	Time Series Data Mining Based On Large Margin Theory
2	Research On The Structure Patterns Of The Multiple Time Series
3	Research On Time Series Data Mining Algorithm Of Partial Periodic Patterns
4	Research On Time Series Data Mining Of Partial Periodic Patterns
5	A Research And Application Of Mining Frequent Patterns Based On Time Series
6	Study On Water Quality Time Series Data Mining And Application Integration
7	Key Technologies Research On Time Series Data Mining For Large Scale Network Security Situation Analysis
8	Research On Data Mining And Forecasting Methods Over Time Series Data With Complex Structure
9	Research And Improvement Of An Algorithm For Time-Series Patterns Discovery
10	Time series retrieval: Indexing and mining large datasets