Research On Chinese Newscast Speech Summarization

Posted on: 2008-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: T K Wang
GTID: 2178360245998022
Subject: Computer Science and Technology
Abstract/Summary:
Large volumes of speech data are now available, and the information contained in these recordings should be explored and put to use. Speech summarization is a useful technique for applications such as information retrieval, browsing, and record keeping. The central problem is how to extract from a speech file the important sentences that represent its main idea. The main goal of this thesis is to study several speech summarization techniques, to identify the features most relevant to a document's topic, and to find more effective ways of summarizing speech.

The first part of the thesis discusses how well several techniques perform when applied to speech summarization. Statistical and prosodic features are then analyzed, yielding a set of reliable features that includes Term Frequency, Inverse Document Frequency, F0, and mean power. To show what statistical features contribute to speech summarization, a method is proposed that combines Large Vocabulary Continuous Speech Recognition (LVCSR) with Latent Semantic Analysis (LSA)-based text summarization, and it gives satisfactory results. In addition, the F0 contour is used to extract prosodic phrases, and regression lines are used to extract the summary. Finally, the dependency structure of the sentences is analyzed, and three selection methods are used to form the final summary.
Keywords/Search Tags: Speech summarization, Vector Space Model, Latent Semantic Analysis, prosodic features
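The abstract names Term Frequency and Inverse Document Frequency among its reliable features. The sketch below is one way such statistical features could rank sentences of a recognized transcript for extraction. It is an illustration only and is not the thesis's LSA-based method: it assumes the transcript is already split into sentences and tokenized into words (Chinese text would need a word segmentation step first), it treats each sentence as a "document" when computing IDF, and the function names `tfidf_sentence_scores` and `summarize` are hypothetical.

```python
import math
from collections import Counter


def tfidf_sentence_scores(sentences):
    """Score each tokenized sentence by the mean TF-IDF weight of its tokens.

    Assumption: each sentence counts as one document for the IDF statistic.
    """
    n = len(sentences)
    # Document frequency: number of sentences in which each term appears.
    df = Counter(term for sent in sentences for term in set(sent))
    scores = []
    for sent in sentences:
        if not sent:
            scores.append(0.0)
            continue
        tf = Counter(sent)
        # Sum of tf * idf over the distinct terms of the sentence.
        total = sum(tf[t] * math.log(n / df[t]) for t in tf)
        # Normalize by sentence length so long sentences are not favored.
        scores.append(total / len(sent))
    return scores


def summarize(sentences, k):
    """Return the indices of the k highest-scoring sentences, in original order."""
    scores = tfidf_sentence_scores(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])
```

Calling `summarize(transcript, k)` on a list of token lists returns the positions of the selected sentences, so the summary can be read back in spoken order. An LSA-based variant would instead build a term-by-sentence matrix from these weights and select sentences from its singular value decomposition.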