
Research on Speech Recognition Methods Based on GPU and Deep Belief Networks (DBN)

Posted on: 2019-12-20  Degree: Master  Type: Thesis
Country: China  Candidate: T Jiang  Full Text: PDF
GTID: 2428330548476858  Subject: Computer system architecture
Abstract/Summary:
Deep learning, a new trend in machine learning, simulates the hierarchical processing mechanism of the human brain, extracting features of the input data layer by layer, from low-level to high-level. Owing to its powerful modeling and representation ability, deep learning has been widely applied to speech recognition. However, training a deep-learning-based acoustic model for speech recognition is very time-consuming. Compared with the traditional Central Processing Unit (CPU), the Graphics Processing Unit (GPU) offers powerful parallel computing capability, which makes it better suited to training deep learning models. To address the problem that the Deep Belief Network (DBN) ignores timing dynamics, this thesis presents a DBN with a Memory Model (DBNMM) to improve speech recognition accuracy. Furthermore, to improve the training efficiency of the DBNMM-based acoustic model, the training of the DBNMM is accelerated from both the single-GPU and the multi-GPU perspective. The major contributions of this thesis are summarized as follows:

(1) Existing deep learning models and their applications in speech recognition are analyzed. Because the DBN emphasizes network depth while generally ignoring the long-term timing dependencies of the speech signal, the loop function of the Recurrent Neural Network (RNN) is analyzed from an information-processing perspective, and the DBNMM model is proposed: a FIR filter is added to the hidden layers of the DBN as a memory model that stores the historical information of the speech signal. Since different hidden layers of the DBNMM extract features at different levels of abstraction (higher layers extract more abstract features), different types of memory model are used at different layers. Meanwhile, to reduce the complexity of the memory model, low-rank decomposition is applied to the weight matrices between hidden layers. On this basis, a stride-based DBNMM model that exploits the locally correlated characteristics of speech signals is proposed. Experimental results show that the DBNMM model achieves better speech recognition performance than RNN and LSTM models on the test data sets.

(2) Because the DBNMM has so many weight-matrix parameters that a single GPU cannot store them all at once, this thesis presents a sliced-weight-matrix method for the DBNMM, in which the weight matrix is divided into several sub-weight matrices. The connection between hidden units and visible units is taken as the per-thread unit of computation, so that the weight parameters of each hidden unit can be staged in the GPU's shared memory; this improves the training efficiency of the model. At the same time, to make full use of the GPU's computing resources, CUDA stream parallelism is used so that data transfer and kernel execution proceed concurrently in different streams. To overcome the parameter-transfer bottleneck across multiple GPUs, data parallelism is combined with the idea of delayed updating, yielding an asynchronous SGD algorithm based on a dividing-and-merging rule (DM-ASGD): each data subset is trained for a certain number of iterations; the fastest GPU is selected, and parameter information such as gradients is transferred to it; the parameters trained on the fastest GPU for a further number of iterations are then transferred back to the other GPUs for the next round of training. This process is repeated until all data subsets have been trained, and finally the parameters from all GPUs are merged to obtain the final model. Experimental results demonstrate that the proposed GPU-based training of the DBNMM model significantly improves training efficiency while preserving speech recognition accuracy.
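To make the memory-model idea in contribution (1) concrete, the following is a minimal NumPy sketch of a hidden layer augmented with a FIR filter over its own past activations. All names, dimensions, and the exact way the filter output enters the activation are assumptions for illustration, not the thesis's actual formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class FIRMemoryLayer:
    """A DBN-style hidden layer with a FIR-filter memory model (sketch).

    The layer's pre-activation mixes the current input with a weighted
    sum of its last `taps` hidden activations, giving it access to
    short-term timing dynamics that a plain DBN layer would ignore.
    """

    def __init__(self, in_dim, hid_dim, taps=3, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 0.1, (in_dim, hid_dim))   # input weights
        self.b = np.zeros(hid_dim)                         # bias
        # one coefficient vector per FIR tap (hypothetical parameterization)
        self.fir = rng.normal(0.0, 0.1, (taps, hid_dim))
        # sliding window of the most recent hidden activations
        self.history = [np.zeros(hid_dim) for _ in range(taps)]

    def step(self, x):
        # FIR term: weighted sum of the last `taps` hidden activations
        memory = sum(c * h for c, h in zip(self.fir, self.history))
        h = sigmoid(x @ self.W + self.b + memory)
        # shift the history window by one frame
        self.history = [h] + self.history[:-1]
        return h

# Example: feed 5 frames of 39-dim features (e.g. MFCC-like) through the layer
layer = FIRMemoryLayer(in_dim=39, hid_dim=16)
frames = np.random.default_rng(1).normal(size=(5, 39))
outs = [layer.step(f) for f in frames]
print(outs[-1].shape)  # (16,)
```

Because the filter only stores a fixed window of past activations, the memory cost stays constant per layer, unlike an unbounded recurrence.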
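The low-rank decomposition mentioned in contribution (1) can be sketched with a truncated SVD: a weight matrix between hidden layers is replaced by the product of two thin factors, cutting the parameter count. The sizes and rank below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(1024, 1024))  # full weight matrix between two hidden layers

def low_rank_factor(W, r):
    """Factor W into A @ B with rank r via truncated SVD (best rank-r fit)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :r] * s[:r]   # shape (m, r)
    B = Vt[:r]             # shape (r, n)
    return A, B

r = 64
A, B = low_rank_factor(W, r)

# Parameter count drops from m*n to r*(m + n)
full = W.size                  # 1024 * 1024 = 1048576
factored = A.size + B.size     # 64 * (1024 + 1024) = 131072
print(full, factored)
```

A forward pass then computes `(x @ A) @ B` instead of `x @ W`, so both storage and multiply cost scale with the rank `r` rather than with the full matrix.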
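The weight-matrix slicing in contribution (2) amounts to processing one sub-weight matrix at a time, each small enough to stage in the GPU's shared memory. The NumPy sketch below mimics only the arithmetic of that tiling; matrix sizes and the tile width are made up for illustration.

```python
import numpy as np

def sliced_matvec(W, x, tile=256):
    """Compute W @ x one column block (sub-weight matrix) at a time.

    Each tile stands in for a sub-weight matrix that a real CUDA kernel
    would load into shared memory before its threads accumulate into y.
    """
    y = np.zeros(W.shape[0])
    for j in range(0, W.shape[1], tile):
        y += W[:, j:j + tile] @ x[j:j + tile]
    return y

rng = np.random.default_rng(0)
W = rng.normal(size=(512, 1000))   # hidden-to-visible weights (illustrative sizes)
x = rng.normal(size=1000)
print(np.allclose(sliced_matvec(W, x), W @ x))  # True
```

On an actual GPU the benefit comes from memory locality, not the arithmetic: each sub-matrix is reused from fast shared memory instead of being re-read from global memory, and independent tiles can overlap with host-to-device transfers via CUDA streams.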
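The DM-ASGD scheme in contribution (2) can be simulated on a toy problem: data is divided into subsets (one per GPU), each worker trains locally for a few iterations with delayed updates, and the results are merged and broadcast for the next round. The sketch below runs the workers serially in NumPy on a linear regression; the merge rule (simple averaging) and all hyperparameters are assumptions, not the thesis's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: noiseless linear regression, loss = mean squared error
X = rng.normal(size=(800, 10))
w_true = rng.normal(size=10)
y = X @ w_true

def grad(w, Xb, yb):
    """Gradient of MSE loss on a data shard."""
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)

n_workers = 4                                    # stand-ins for GPUs
shards = np.array_split(np.arange(len(X)), n_workers)
w = np.zeros(10)                                 # shared model parameters
lr = 0.1

for _round in range(20):
    # Divide: each "GPU" trains on its own data subset for a few
    # local iterations from the last broadcast parameters (delayed update).
    local = []
    for s in shards:
        wi = w.copy()
        for _ in range(5):
            wi -= lr * grad(wi, X[s], y[s])
        local.append(wi)
    # Merge: in DM-ASGD the results are gathered on the fastest GPU and
    # redistributed; here we simply average the local copies and broadcast.
    w = np.mean(local, axis=0)

print(np.linalg.norm(w - w_true))  # error shrinks toward 0 over the rounds
```

Because workers only synchronize at the end of each round, communication cost is amortized over the local iterations, which is the point of combining data parallelism with delayed updating.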
Keywords/Search Tags:Speech Recognition, Deep Belief Network, Memory Model, GPU, Weight Matrix