
Non-negative Sparse Signal Decomposition And Monaural Sound Separation

Posted on: 2007-01-13
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhao
GTID: 2208360185455767
Subject: Computer software and theory
Abstract/Summary:
In the information era, advances in speech recognition technology are gradually making machine understanding of human speech a reality. Speech signals carry a large amount of information and are complex, non-stationary, and time-varying. Speech recognition is both a theoretical problem and an engineering problem. It integrates theoretical achievements from many disciplines, including acoustics, phonetics, linguistics, physiology, digital processing, information engineering, communication theory, electronic technology, computer science, pattern recognition, and artificial intelligence.
At the beginning of this thesis, we introduce the fundamentals of acoustics and the perceptual mechanism. Next, we present several speech processing methods, covering both time-domain processing and time-frequency analysis: short-time average energy, short-time zero-crossing analysis, short-time autocorrelation analysis, and the FFT. Finally, we focus on sound separation, especially single-channel sound separation.
Real-world audio signals usually contain a mixture of several sound sources. Estimating the individual sources from the mixture signal is called sound separation. The human ear can efficiently separate the speech signals it needs from a plethora of other auditory signals, even when these signals have similar overall frequency characteristics and coincide perfectly in time. Modeling this ability computationally is very difficult. The problem of source separation, that is, separating one or more desired signals from mixed recordings of multiple signals, has traditionally been approached with multiple microphones, in order to obtain enough information about the incoming speech signals to separate them effectively.
Typically, no prior information about the speech signals is assumed, other than that the combined signals are statistically independent or uncorrelated with each other. The problem is then treated as one of Blind Source Separation (BSS), which can be performed by techniques such as Independent Component Analysis (ICA). In this thesis, a data-adaptive technique very similar to ICA called SNMF (Non-negative...
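To illustrate the general idea of a non-negative decomposition, here is a minimal sketch of plain non-negative matrix factorization (NMF) using the well-known multiplicative updates for the squared Euclidean cost. This is not the thesis's SNMF method, whose details are not given in this abstract, and it includes no sparsity term. The toy matrix `V` stands in for a magnitude spectrogram (rows as frequency bins, columns as time frames), and every name in the code (`nmf`, `spec_a`, `act_a`, and so on) is hypothetical.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def frobenius_error(V, W, H):
    """Squared Frobenius distance between V and the product W H."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for vr, rr in zip(V, WH) for v, r in zip(vr, rr))

def update(X, num, den, eps):
    """Element-wise multiplicative update: X * num / (den + eps)."""
    return [[x * a / (b + eps) for x, a, b in zip(xr, ar, br)]
            for xr, ar, br in zip(X, num, den)]

def nmf(V, rank, iters=200, eps=1e-12, seed=0):
    """Factor non-negative V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    # Strictly positive random start; multiplicative updates keep entries >= 0.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in V]
    H = [[rng.random() + 0.1 for _ in V[0]] for _ in range(rank)]
    for _ in range(iters):
        Wt = transpose(W)
        H = update(H, matmul(Wt, V), matmul(matmul(Wt, W), H), eps)
        Ht = transpose(H)
        W = update(W, matmul(V, Ht), matmul(W, matmul(H, Ht)), eps)
    return W, H

# Toy "spectrogram" (an assumption for illustration): two sources, each with
# a fixed spectrum and its own activation over five time frames.
spec_a, spec_b = [1, 0, 2, 0], [0, 3, 0, 1]
act_a, act_b = [1, 0, 1, 1, 0], [0, 2, 1, 0, 1]
V = [[sa * a + sb * b for a, b in zip(act_a, act_b)]
     for sa, sb in zip(spec_a, spec_b)]

W, H = nmf(V, rank=2)
error = frobenius_error(V, W, H)
```

In this reading, the columns of `W` play the role of spectral basis vectors and the rows of `H` their activations over time. A single-channel separation system would then have to group the components by source, a step this sketch does not attempt.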
Keywords/Search Tags: Single channel sound separation, SNMF, Sound separation, FFT, Neural Network
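The time-domain measures listed in the abstract (short-time average energy, zero-crossing analysis, and autocorrelation) can be sketched per frame as follows. This is a minimal illustration under assumed conventions, not the thesis's implementation: the frame length, hop size, rectangular window, unnormalized autocorrelation, and the 100 Hz test tone are all choices made for this example, and the function names are hypothetical.

```python
import math

def frames(signal, frame_len, hop):
    """Split a sequence into overlapping frames (a trailing partial frame is dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Average energy of a frame: the mean of the squared samples."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def autocorrelation(frame, lag):
    """Unnormalized short-time autocorrelation of a frame at the given lag."""
    return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))

# Hypothetical test tone: a 100 Hz sine sampled at 8 kHz (period = 80 samples).
rate = 8000
signal = [math.sin(2 * math.pi * 100 * n / rate) for n in range(1600)]
frame = frames(signal, frame_len=400, hop=200)[0]

energy = short_time_energy(frame)          # close to 0.5 for a unit-amplitude sine
one_period = autocorrelation(frame, 80)    # positive: lag matches the period
half_period = autocorrelation(frame, 40)   # negative: lag is half a period
```

The autocorrelation peaks when the lag equals the signal's period, which is why it is commonly used for pitch estimation, while energy and zero-crossing rate are commonly used to tell voiced speech from unvoiced speech and silence.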