
Research On Sound Compression Method And Technology Based On Incremental Learning

Posted on: 2019-11-13
Degree: Master
Type: Thesis
Country: China
Candidate: H Song
Full Text: PDF
GTID: 2428330566983450
Subject: Computer Science and Technology
Abstract/Summary:
Large-scale speech data must be transferred and stored under scarce bandwidth resources, which puts heavy pressure on applications. To relieve this pressure, this thesis combines signal sparsity, sparse representation, and incremental learning, and analyzes the sparsity of the spectral modulus envelope of speech data so that voice can be compressed and decompressed efficiently during transmission and storage. The main idea is to build an efficient model of the voice amplitude spectrum using sparse dictionary learning theory, and then to propose an incremental dictionary learning method that compresses the streaming voice amplitude spectrum incrementally. The main tasks are:

1) Incremental dictionary learning. The proposed dictionary and learning method draw on incremental learning theory to handle speech data that is generated continuously. The training envelopes are learned one by one, gradually building a complete envelope dictionary. The learned dictionary is then used to represent the original envelopes sparsely: each envelope is expressed as a linear combination of a very small number of dictionary atoms. Compression is achieved by storing the indices of those atoms and their corresponding coefficients instead of the original envelopes.

2) An optimization strategy for the proposed method. Based on Hilbert transform theory, this thesis proposes a strategy that uses the properties of the Hilbert transform to extend the diversity of the dictionary's atoms. Each atom added to the dictionary is also Hilbert-transformed to produce a new atom that can be used in the sparse representation process. In other words, the original envelopes and their Hilbert transforms are used together to fit the other envelopes. The Hilbert-transformed atoms do not need to be stored in the compressed file; only their ids and coefficients are stored, and these are sufficient to represent the original amplitude spectrum. This strategy improves the expressive power of the dictionary, reduces its size, and speeds up compression, although it requires a small amount of extra compression space.

The proposed method is compared with the ODL algorithm on the publicly available speech data set PTDB-TUG. The results show that the proposed method is superior to ODL in compression rate, reconstruction quality, and SSNR, and that the optimization strategy achieves its goals of improving the expressiveness of the dictionary, reducing dictionary capacity, and speeding up compression. The proposed method handles the streaming voice amplitude spectrum flexibly and offers a range of choices and guidance for specific applications.
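The two tasks described in the abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: the greedy matching-pursuit coder, the rule "add the envelope as a new atom when the residual exceeds `tol`", the id scheme (even ids for stored atoms, odd ids for their Hilbert transforms), and all names below are assumptions made for illustration. It uses only the Python standard library, with a naive DFT standing in for a real FFT.

```python
import cmath
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def normalize(a):
    s = norm(a)
    return [x / s for x in a]

def hilbert_transform(x):
    """Discrete Hilbert transform of a real sequence via a naive O(n^2) DFT."""
    n = len(x)
    spec = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            spec[k] = 0          # DC and Nyquist bins carry no quadrature part
        elif k < n / 2:
            spec[k] *= -1j       # positive frequencies: multiply by -j
        else:
            spec[k] *= 1j        # negative frequencies: multiply by +j
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]

class IncrementalDictionary:
    """Hypothetical sketch: only base atoms are stored; their Hilbert
    transforms are recomputed on demand and addressed by odd ids."""

    def __init__(self, max_atoms=2, tol=0.1):
        self.max_atoms = max_atoms   # sparsity limit per envelope (assumed)
        self.tol = tol               # relative residual threshold (assumed)
        self.atoms = []              # stored unit-norm base atoms
        self.expanded = {}           # id -> unit-norm atom (base or Hilbert)

    def _add_atom(self, x):
        base = normalize(x)
        idx = len(self.atoms)
        self.atoms.append(base)
        self.expanded[2 * idx] = base
        h = hilbert_transform(base)
        if norm(h) > 1e-9:           # skip degenerate (e.g. constant) atoms
            self.expanded[2 * idx + 1] = normalize(h)
        return 2 * idx

    def compress(self, x):
        """Return a list of (atom_id, coefficient) pairs for envelope x."""
        residual, code = list(x), []
        for _ in range(self.max_atoms):   # greedy matching pursuit
            best = max(self.expanded.items(),
                       key=lambda kv: abs(dot(residual, kv[1])), default=None)
            if best is None:
                break
            c = dot(residual, best[1])
            if abs(c) < 1e-12:
                break
            code.append((best[0], c))
            residual = [r - c * v for r, v in zip(residual, best[1])]
        if norm(residual) > self.tol * norm(x):
            # Poor fit: learn this envelope as a new atom (assumed rule).
            code = [(self._add_atom(x), norm(x))]
        return code

    def decompress(self, code):
        out = [0.0] * (len(self.atoms[0]) if self.atoms else 0)
        for atom_id, c in code:
            out = [o + c * v for o, v in zip(out, self.expanded[atom_id])]
        return out

n = 16
x1 = [math.cos(2 * math.pi * t / n) for t in range(n)]
x2 = [math.sin(2 * math.pi * t / n) for t in range(n)]
x3 = [0.5 * a + 2.0 * b for a, b in zip(x1, x2)]

d = IncrementalDictionary(max_atoms=2, tol=1e-6)
c1, c2, c3 = d.compress(x1), d.compress(x2), d.compress(x3)
```

In this toy run the first envelope becomes the only stored atom, and the second and third are fitted through its Hilbert-transformed copy, so the dictionary does not grow. A real system would replace the naive DFT with an FFT and would likely use a more careful sparse coder and atom-admission rule.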
Keywords/Search Tags: voice compression, voice decompression, real-time processing, streaming data, incremental learning, sparse dictionary learning