
Improvement Of Unsupervised Speech Retrieval Based On Acoustic Segmentation Model

Posted on: 2021-08-03
Degree: Master
Type: Thesis
Country: China
Candidate: Z J Wei
GTID: 2518306308959709
Subject: Computer Software and Application of Computer
Abstract/Summary:
With the advent of the 5G era, large volumes of multimedia data must be processed, and the demand for audio retrieval technology grows day by day. Search engines such as Baidu retrieve information mainly through text retrieval. Although Baidu can return relevant videos for a keyword query, this works only because the audio files carry text abstracts or have been transcribed by an automatic speech recognition (ASR) engine. ASR still has many problems, however: zero-resource speech, inherent limitations of the recognizers themselves, and the fact that most audio files are not well indexed or tagged. We therefore need to study speech-to-speech retrieval, which finds audio-visual files containing the spoken query without any speech recognition. Such retrieval is not bound by the problems of ASR and does not require spending large resources on speech annotation. It can also be applied to languages with few speech resources and to languages that have no written form.

This thesis studies and improves unsupervised speech retrieval based on an acoustic segmentation model, together with the preparatory steps that retrieval depends on. For the pre-processing stage, two improved methods are proposed to address the efficiency problems of earlier speech segmentation and segment clustering. To improve the computational efficiency of unsupervised speech segmentation, an improved method based on a fast hierarchical agglomerative clustering (HAC) algorithm is proposed. To deal with the varying length of speech segments, a combination of a Siamese network and spatial pyramid pooling (SPP) is designed that maps each segment to a fixed dimension and reduces the data dimensionality, which lightens the later computation. For the unsupervised retrieval stage, an improved method based on dynamic time warping (DTW) is proposed to address the speaking-rate and computational-efficiency problems of the traditional segmental DTW method. The retrieval stage also explores transferring text retrieval technology to speech retrieval.

In the experiments, the unsupervised speech segmentation step using the improved fast-HAC method is 65 times faster than the traditional method. The combination of Siamese network and SPP, by fixing the dimension of speech segments and reducing the data dimensionality, improves clustering speed. Once the speech database and the voice instructions have been decoded, text-based retrieval proves feasible.
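For readers unfamiliar with DTW, the following is a minimal sketch of the classic full-sequence DTW distance, which aligns two feature sequences of different lengths and so tolerates differences in speaking rate. It is not the improved method proposed in the thesis, nor segmental DTW; the function names, the Euclidean frame distance, and the toy one-dimensional "frames" are assumptions made for illustration only.

```python
from math import inf, sqrt


def euclidean(a, b):
    """Euclidean distance between two feature frames of equal dimension."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(query, reference):
    """Classic DTW cost between two sequences of feature frames.

    Illustrative baseline only: fills the full (n x m) cost table,
    so time and memory are O(n * m).
    """
    n, m = len(query), len(reference)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # cost[i][j] = minimal accumulated cost of aligning
    # query[:i] with reference[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(query[i - 1], reference[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # query frame repeated against reference
                cost[i][j - 1],      # reference frame repeated against query
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# Toy data (assumed): the same "utterance" spoken slowly and quickly,
# plus an unrelated one.
slow = [[0.0], [0.0], [1.0], [1.0], [2.0], [2.0]]
fast = [[0.0], [1.0], [2.0]]
other = [[5.0], [5.0], [5.0]]

same_cost = dtw_distance(slow, fast)
diff_cost = dtw_distance(other, fast)
```

In this toy case the slow and fast versions align at zero cost despite their different lengths, while the unrelated sequence does not. The quadratic cost table is the kind of computational burden that motivates faster DTW variants.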
Keywords/Search Tags: speech retrieval, unsupervised, speech segmentation, acoustic model