Research on Mixed Speech-Music Separation Based on Blind Source Separation Algorithm

Posted on: 2015-12-18
Degree: Master
Type: Thesis
Country: China
Candidate: W Guo
Full Text: PDF
GTID: 2298330431490451
Subject: Signal and Information Processing
Abstract/Summary:
Speech-music separation aims to separate speech and music from mixed signals. The separated signals can be used for audio processing tasks such as speech recognition, musical instrument recognition, music melody extraction and music genre classification. Blind source separation can effectively extract individual sources from mixed signals. This thesis studies speech-music separation under linear instantaneous mixing, covering negentropy maximization, blind source separation based on time-frequency ratios, and information maximization. The specific research work is as follows:

First, a determined speech-music separation algorithm based on improved negentropy maximization is studied. The initial matrix has a great influence on separation performance. The Newton downhill method is used instead of Newton iteration to find the optimal matrix. It reduces the dependence on the initial value by adjusting the downhill factor so that the objective function keeps decreasing. Simulation results show that the proposed method separates mixed speech and music signals better under different initial values. The method reduces running time by 26.44% and the number of iterations by 69.15%. At the same time, the number of iterations and the running time vary only within a small range. Thus the Newton downhill method effectively solves the initial value sensitivity problem.

Second, a determined speech-music separation algorithm based on an improved time-frequency ratio is studied. The original algorithm has a large computational cost and a small number of effective time-frequency windows. Analysis zones covering one repeating period are used to detect single-source points instead of the whole time-frequency domain. Time-frequency bins that make up the repeating patterns have similar values in each period. Inverting the matrix composed of the corresponding time-frequency ratios, which are obtained from single-source points in a repeating period, yields an estimate of the source signals. Simulation results show that the proposed method reduces running time by 51.90% and the number of detected time-frequency windows by 56.72% with the same separation accuracy.

Third, an underdetermined speech-music separation algorithm combining empirical mode decomposition with information maximization is studied. Information maximization on its own can only be applied when the number of observations is not less than the number of sources, so it is combined with empirical mode decomposition to handle the underdetermined case. The new observations are composed of the original signal and a sum of intrinsic mode functions, formed according to the similarity between the reconstructed signal and the original signal. In this way the underdetermined blind source separation problem is transformed into a determined one. The subsequent separation step takes mutual information as the objective function and the natural gradient method as the optimization algorithm. Simulation results show that the proposed algorithm can effectively solve underdetermined blind source separation.
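The Newton downhill idea mentioned in the first contribution can be illustrated on a scalar equation. The abstract does not give the thesis's update rule for the separation matrix, so the following is only a generic sketch under stated assumptions: the function name `newton_downhill`, its parameters, and the cubic test equation are all hypothetical and chosen for illustration. The downhill factor starts at 1 (a plain Newton step) and is halved until the residual magnitude decreases.

```python
def newton_downhill(f, df, x0, tol=1e-10, max_iter=100, min_factor=1e-10):
    """Find a root of f with Newton steps scaled by a downhill factor.

    Hypothetical helper for illustration; not the thesis's implementation.
    Returns (approximate root, number of accepted steps).
    """
    x = x0
    for steps in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x, steps
        step = fx / df(x)  # full Newton step
        factor = 1.0       # downhill factor; 1.0 is plain Newton
        # Halve the factor until |f| strictly decreases.
        while abs(f(x - factor * step)) >= abs(fx):
            factor /= 2.0
            if factor < min_factor:
                raise RuntimeError("no downhill step found")
        x = x - factor * step
    return x, max_iter


# Example equation (an assumption, not from the thesis): x^3 - x - 1 = 0.
# From x0 = 0.6 a plain Newton step jumps far from the root, while the
# downhill factor shrinks the first step until the residual decreases.
cubic = lambda x: x**3 - x - 1
cubic_prime = lambda x: 3 * x**2 - 1
root, steps = newton_downhill(cubic, cubic_prime, x0=0.6)
print(root, steps)
```

In this sketch a poor starting point costs only a few extra halvings of the factor rather than causing divergence, which is the sense in which a downhill factor reduces sensitivity to the initial value.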
Keywords/Search Tags: speech-music separation, Newton downhill method, time-frequency ratio, musical repeating structure, empirical mode decomposition