Non-negative Matrix Factorization Algorithm And Its Application In Voice Conversion

Posted on:2017-02-28

Degree:Master

Type:Thesis

Country:China

Candidate:Q M Zhang

Full Text:PDF

GTID:2308330485464132

Subject:Computer application technology

Abstract/Summary:

Non-negative matrix factorization (NMF) decomposes a non-negative matrix into the product of a non-negative coefficient matrix and a non-negative basis matrix. It represents each data sample as a non-negative linear combination of non-negative components, which captures the subspace of the data and reduces its dimensionality. Compared with PCA, the non-negative representation is physically meaningful. As an effective data processing technique, NMF has been widely used in speech recognition, voice conversion, face detection and recognition, text analysis and clustering, network security, digital watermarking, image processing, biomedical engineering, and other applications.

This thesis focuses on a sparse convolutive non-negative matrix factorization algorithm and its application to voice conversion. The major contributions are as follows:

(1) A convolutive NMF algorithm based on the Itakura-Saito distance and a sparsity constraint is proposed. Unlike existing NMF algorithms, which are based on the Euclidean distance or the K-L divergence, the proposed algorithm adopts the Itakura-Saito distance as the cost function that measures the error between the original matrix and its reconstruction. The Itakura-Saito distance is scale-invariant, so the smaller elements of the matrix have a smaller reconstruction error. Multiplicative update rules are derived from this cost function with a sparsity constraint on the coefficient matrix. Experimental results show that speech reconstructed with the proposed NMF algorithm has higher intelligibility.

(2) The proposed convolutive NMF algorithm is applied to voice conversion. Convolutive NMF can characterize the delay information in data, which makes it better suited to processing speech signals. During the training phase, the convolutive NMF algorithm based on the Itakura-Saito distance and the sparsity constraint is used to obtain the source and target speakers’ time-frequency bases, respectively. During the conversion phase, the time-frequency spectrum matrix of the source speech is decomposed on the source basis matrix to obtain the source coefficient matrix, and the target speech is then reconstructed from the source coefficient matrix and the target basis matrix. Experimental results show that the source speech is effectively converted to the target speech with higher intelligibility.
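
The following is a minimal Python sketch of the ideas described in the abstract, not the thesis's actual algorithm. It makes several assumptions that the abstract does not confirm: it uses ordinary (non-convolutive) NMF for brevity, it uses the commonly cited heuristic multiplicative updates for the Itakura-Saito divergence, and it models the sparsity constraint as an L1 penalty with a hypothetical weight `lam` on the coefficient matrix. The names `sparse_is_nmf`, `convert`, `W_src`, and `W_tgt` are illustrative. The basis-swap step also assumes that the k-th source and target basis columns correspond to each other; the abstract does not say how that correspondence is obtained.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log(0)


def is_divergence(V, V_hat):
    """Itakura-Saito divergence summed over all matrix entries."""
    R = (V + EPS) / (V_hat + EPS)
    return float(np.sum(R - np.log(R) - 1.0))


def sparse_is_nmf(V, W, H, lam=0.1, n_iter=200, update_W=True):
    """Multiplicative updates for V ~= W @ H under the IS divergence.

    W is the basis matrix, H the coefficient matrix.
    lam is an assumed L1 penalty weight on H (not taken from the thesis).
    With update_W=False the basis stays fixed and only H is estimated.
    """
    W, H = W.copy(), H.copy()
    for _ in range(n_iter):
        V_hat = W @ H + EPS
        # The L1 penalty adds the constant lam to the denominator.
        H *= (W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat) + lam)
        if update_W:
            V_hat = W @ H + EPS
            W *= ((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T + EPS)
            # Remove the scale ambiguity: unit-sum basis columns,
            # with the rows of H rescaled so that W @ H is unchanged.
            scale = W.sum(axis=0, keepdims=True) + EPS
            W /= scale
            H *= scale.T
    return W, H


def convert(V_src, W_src, W_tgt, lam=0.1, n_iter=200, seed=0):
    """Decompose the source spectrum on W_src, then rebuild it with W_tgt."""
    rng = np.random.default_rng(seed)
    H0 = rng.random((W_src.shape[1], V_src.shape[1])) + 0.1
    _, H_src = sparse_is_nmf(V_src, W_src, H0, lam, n_iter, update_W=False)
    return W_tgt @ H_src
```

In use, `V_src` would be a non-negative magnitude or power spectrogram (frequency bins by frames), `W_src` and `W_tgt` would come from calling `sparse_is_nmf` on each speaker's training spectrograms, and the output of `convert` would still need a phase estimate before it could be turned back into a waveform. A convolutive variant would replace the single product `W @ H` with a sum of products over time-shifted copies of `H`.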

Keywords/Search Tags:

Convolutive non-negative matrix factorization, Sparse constraints, Itakura-Saito distance, Voice conversion, Speech intelligibility

Related items

1	Research On Non-negative Matrix Factorization Algorithm
2	Research On Underdetermined Convolutive Speech Signal Separation Methods
3	Non-negative Low-rank And Group-Sparse Matrix Factorization And Application In Image Retrieval
4	Research On Speech Enhancement Algorithm Based On Non-Negative Matrix Factorization
5	Research On Two-channel Speech Enhancement Method Based On Non-negative Matrix Factorization
6	EEG Feature Extraction Based On Non-negative Matrix Factorization
7	A Research Of Voice And Complicated Background Noise Based On CNMF
8	A Non-negative Matrix Factorization Clustering Algorithm Based On L_2,1/2 Sparse Constraint And Cosine Similarity
9	Personalized Recommendation Algorithm Based On Sparse Constrained Non-negative Matrix Factorization
10	Research On Two Methods Of Single Channel Speech Separation