Research on End-to-End Voice Wake-up

Posted on: 2020-07-23

Degree: Master

Type: Thesis

Country: China

Candidate: N Zhang

GTID: 2428330602468133

Subject: Computer technology

Abstract/Summary:

With the rapid development of artificial intelligence and the growing demand for human-computer interaction, intelligent speech technology has achieved major breakthroughs. Research results in this field not only advance cutting-edge technology but also create substantial market value, so speech technology is of considerable practical significance. Voice wake-up is an important research direction in intelligent speech: its task is to detect a given set of wake-up words in a continuous stream of speech. For the voice wake-up task with enrollment utterances, this study builds a wake-up system based on deep supervectors. For the fixed wake-up word task, this study focuses on end-to-end (E2E) technology and implements an end-to-end voice wake-up system. It also tunes the system's parameter configuration and improves performance by applying several deep learning models. The main work of this study includes:

1. Surveying the development history of speech recognition, together with a detailed review of prior work, the current state of research, and the latest developments in voice wake-up and end-to-end technology.

2. Building a voice wake-up system based on deep supervectors for the task with enrollment utterances. The system uses a DNN as a feature extractor to obtain deep supervectors from speech, then computes the cosine similarity between the deep supervector of the test speech and the deep supervectors of the templates. Experimental results show that the deep supervector systems comprehensively outperform systems based on segmental DTW (S-DTW).

3. Implementing an end-to-end voice wake-up system. The system needs only a pre-trained neural network as its acoustic model. Given the acoustic features, a forward pass through the network followed by a posterior probability post-processing module outputs the confidence score of the wake-up word, so the end-to-end framework requires no complicated decoding process. The thesis also introduces several deep learning models as acoustic models, including TDNN, LSTM, GRU and TDNN-F. Multiple experiments compare the performance of each model and verify the effectiveness of the end-to-end wake-up system.
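
The two scoring steps described above can be illustrated with a small Python sketch. It is not the thesis implementation. The first part scores a test supervector against enrolled templates by cosine similarity, as the abstract states; taking the maximum over templates is an assumption. The second part shows one post-processing scheme that is common in the keyword-spotting literature (moving-average smoothing of frame posteriors, then the geometric mean of each keyword unit's peak within a sliding window). The abstract does not say which post-processing the thesis uses, so this scheme, the function names, and the window sizes are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def template_score(test_vec, templates):
    """Best cosine similarity between the test supervector and any template.

    Taking the maximum over templates is an assumption of this sketch.
    """
    return max(cosine_similarity(test_vec, t) for t in templates)


def smooth(posteriors, window):
    """Moving average of each output unit over the last `window` frames."""
    smoothed = []
    for t in range(len(posteriors)):
        frames = posteriors[max(0, t - window + 1):t + 1]
        smoothed.append([sum(col) / len(frames) for col in zip(*frames)])
    return smoothed


def confidence(posteriors, keyword_units, smooth_win=3, search_win=10):
    """Per-frame wake-word confidence (assumed scheme, not from the thesis).

    posteriors: list of frames, each a list of per-unit posteriors.
    keyword_units: indices of the output units that make up the wake-up word.
    """
    sm = smooth(posteriors, smooth_win)
    scores = []
    for t in range(len(sm)):
        recent = sm[max(0, t - search_win + 1):t + 1]
        product = 1.0
        for unit in keyword_units:
            product *= max(frame[unit] for frame in recent)
        # Geometric mean keeps the score in [0, 1] for any number of units.
        scores.append(product ** (1.0 / len(keyword_units)))
    return scores
```

In either case the system would wake up when the score exceeds a threshold tuned on held-out data; the threshold value is a design parameter that the abstract does not report.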

Keywords/Search Tags:

Voice Wake-up, Deep Supervector, End-to-End, TDNN, GRU, TDNN-F

Related items

1	The Research And Application Of Voice Wake-up With Deep Learning
2	Study On Wake Up Word Recognition Based On Deep Learning
3	Application And Implementation Of Voice Wake-up Technology In Voice Assistant System
4	Speaker Recognition System Based On Deep Learning
5	Design And Implementation Of Deep Learning-based Open Speech System For Innovative Enterprises
6	Yi Language Speech Recognition Using Deep Learning Methods
7	Research On Shortest Path Model And Algorithm Of Fuzzy And Time-varying Neural Network
8	The Research On Voice Wakeup Technology Based On Transfer Learning
9	Research And Implementation Of Anti-spoofing Voiceprint System Based On Android
10	Research On Chinese Speech Recognition Based On Kaldi