Improvement Of Unsupervised Speech Retrieval Based On Acoustic Segmentation Model

Posted on:2021-08-03

Degree:Master

Type:Thesis

Country:China

Candidate:Z J Wei

GTID:2518306308959709

Subject:Computer Software and Application of Computer

Abstract/Summary:

With the advent of the 5G era, large volumes of multimedia data must be processed, and demand for audio retrieval technology is growing steadily. Today, Baidu retrieves information mainly through text retrieval technology. Although Baidu can return relevant videos for a keyword query, this works only because the audio files have text summaries or have been transcribed by an Automatic Speech Recognition (ASR) engine. ASR still has many limitations: zero-resource speech, inherent weaknesses of ASR itself, and the fact that most audio files are not well indexed or tagged. It is therefore necessary to study speech-to-speech retrieval, which finds audio and video files containing the spoken query without speech recognition. Such retrieval is not constrained by the problems of speech recognition and does not require large resources for speech tagging. It can also be applied to languages with few speech resources or languages without a written form.

This thesis studies and improves unsupervised speech retrieval based on an acoustic segmentation model, together with the preprocessing that precedes retrieval. In the preprocessing stage, two improved methods are proposed to address the efficiency problems of earlier approaches to speech segmentation and segment clustering. To address the computational cost of unsupervised speech segmentation, an improved method based on a fast hierarchical agglomerative clustering (HAC) algorithm is proposed. To address the problem that speech segments do not have a fixed length, a combination of a Siamese network and spatial pyramid pooling (SPP) is designed that maps each segment to a fixed dimension and reduces the data dimension, which lightens later computation. In the unsupervised retrieval stage, an improved method based on dynamic time warping (DTW) is proposed to address the speaking-rate and computational-efficiency problems of the traditional segmental DTW method. The retrieval stage also explores transferring text retrieval techniques to speech retrieval.

In the experiments, the unsupervised speech segmentation step using the improved fast-HAC-based method is 65 times faster than the traditional method. The combination of the Siamese network and SPP, by fixing the dimension of speech segments and reducing the data dimension, improves clustering speed. Once the speech database and the voice instructions have been decoded, the text-based retrieval method is shown to be feasible.
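To illustrate why DTW helps with differences in speaking rate, the following is a minimal sketch of plain (textbook) DTW between two feature sequences. It is not the improved or segmental DTW method of the thesis, whose details are not given in this abstract. The names `frame_distance` and `dtw_distance` and the toy one-dimensional "features" are assumptions made for illustration only.

```python
from math import inf, sqrt


def frame_distance(a, b):
    """Euclidean distance between two feature frames (hypothetical helper)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(query, reference):
    """Textbook DTW cost between two sequences of feature frames.

    This is the basic O(n * m) dynamic program, shown only as an
    illustration; it is not the thesis's improved method.
    """
    n, m = len(query), len(reference)
    # cost[i][j]: minimal accumulated cost of aligning query[:i] with reference[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(query[i - 1], reference[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Toy data: the "slow" utterance repeats every frame of the "fast" one.
fast = [[0.0], [1.0], [2.0]]
slow = [[0.0], [0.0], [1.0], [1.0], [2.0], [2.0]]
other = [[2.0], [1.0], [0.0]]

same_content_cost = dtw_distance(fast, slow)
different_content_cost = dtw_distance(fast, other)
```

In this toy case the two renditions of the same content align at zero cost despite their different lengths, while the reversed sequence does not. The quadratic cost of filling the full table is one reason faster variants are of interest for large speech databases.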

Keywords/Search Tags:

Speech Retrieval, Unsupervised, Speech Segmentation, Acoustic Model

Related items

1	Design And Implementation Of Korean Speech Retrieval System
2	Research On Encrypted Speech Retrieval Method Based On Unsupervised Learning Hashing In Cloud Environment
3	Research On Speech Automatic Retrieval Technology For Broadcast News
4	Research On BN Feature Based Acoustic Modeling And Its Application In Keyword Retrieval
5	Research On Encrypted Speech Biological Hash Retrieval Algorithm Based On Content Protection
6	Research On Continuous Speech Recognition Based On Deep Learning
7	Research On Sequence-to-sequence Acoustic Modeling For Speech Generation
8	The Establishment Of Mandarin Speech Emotion Acoustic Characteristic Database
9	A Study On The Extraction Of Speech Depth In Tibetan Language And Its Speech Recognition
10	Design And Implementation Of Intelligent Speech Interaction