Research Of Speech Recognition Model Based On Convolutional Neural Network And Its Training Optimization

Posted on:2022-07-23

Degree:Master

Type:Thesis

Country:China

Candidate:Y Q Wang

GTID:2518306575464004

Subject:Industrial Engineering

Abstract/Summary:

In recent years, with the development of robotics and artificial intelligence, speech recognition technology has been widely adopted. Many researchers in China and abroad have applied convolutional neural networks (CNNs) to speech recognition with good results. However, CNN-based speech recognition models still suffer from large parameter counts, low recognition efficiency, and difficult training. While maintaining the recognition rate, this thesis improves the recognition speed of the model and the convergence speed of its training by improving the acoustic model and optimizing the training method. The improved algorithm is also validated on a robot platform.

First, to address the loss of low-level features, the large number of parameters, and the low recognition efficiency caused by deepening the network in CNN acoustic models, an improved asymmetric densely connected convolutional neural network (IV3-DenseNet) acoustic model is proposed. The model uses Dense Blocks to connect different layers, which preserves low-level features and strengthens feature propagation, and the range covered by the convolution kernels is expanded and integrated. In addition, asymmetric convolution is used to decompose the convolution kernels and reduce the number of parameters. On the THCHS30 dataset, the ASR performance of the model is 2.76% higher than that of the classical deep residual CNN model. Compared with the classical DenseNet model, it further reduces the number of network parameters and improves recognition speed.

Second, to address the difficulty of training CNN speech recognition models, a batch back-propagation (BP) parallel training method based on Hadoop is proposed to accelerate training. In a fully distributed environment, the method uses the MapReduce parallel computing framework to divide the training data into several small datasets that serve as input to the sub-nodes, and uses the batch BP algorithm to update the CNN model parameters. The Reduce phase is also improved: the trimmed mean of the local parameter solutions from the sub-nodes is taken as the global parameter solution on the primary node, and tests determine whether to proceed to the next iteration. The experimental results show that the improved parallel training method improves the recognition rate by 4.35 times compared with the serial training method, and improves both the speedup and the recognition rate compared with the standard BP parallel training method.

Finally, the proposed batch BP parallel training method is used to train the IV3-DenseNet acoustic model, and a speech recognition system is built and validated on a sweeper robot platform by measuring how reliably the robot recognizes and carries out the corresponding voice instructions. The experimental results demonstrate the feasibility of the proposed model.
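The improved Reduce step described in the abstract, in which the primary node takes a trimmed mean of the sub-nodes' local parameter solutions, might look roughly like the following. This is only a minimal sketch under stated assumptions, not the thesis's implementation: the abstract does not give the trim fraction or say whether trimming is applied per parameter, so the coordinate-wise trimming, the default `trim_fraction` of 0.2, and the function names `trimmed_mean` and `reduce_parameters` are all hypothetical. Plain Python lists stand in for the model parameters, and no Hadoop API is used.

```python
from typing import List, Sequence


def trimmed_mean(values: Sequence[float], trim_fraction: float = 0.2) -> float:
    """Mean of `values` after dropping the lowest and highest entries.

    `trim_fraction` is the share removed from EACH end (assumed value;
    the abstract does not specify it).
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0.0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # entries dropped from each end
    kept = ordered[k:len(ordered) - k]     # never empty because trim_fraction < 0.5
    return sum(kept) / len(kept)


def reduce_parameters(local_solutions: List[List[float]],
                      trim_fraction: float = 0.2) -> List[float]:
    """Combine per-node parameter vectors into one global vector.

    Each inner list is one sub-node's flattened parameter solution.
    The trimmed mean is taken coordinate-wise (an assumption).
    """
    if not local_solutions:
        raise ValueError("need at least one local solution")
    size = len(local_solutions[0])
    if any(len(vec) != size for vec in local_solutions):
        raise ValueError("all local solutions must have the same length")
    # zip(*...) groups the i-th parameter from every node together.
    return [trimmed_mean(column, trim_fraction) for column in zip(*local_solutions)]


# Hypothetical example: five sub-nodes, two parameters each.
# The last node returns an outlying solution.
local = [
    [0.10, 1.0],
    [0.12, 1.1],
    [0.11, 0.9],
    [0.09, 1.0],
    [5.00, -3.0],
]
global_params = reduce_parameters(local)
print(global_params)
```

In this example the outlying node's values are the ones discarded at the extremes of each coordinate, so they do not pull the global solution the way a plain average would. That robustness is one plausible motivation for replacing the mean in the Reduce phase.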

Keywords/Search Tags:

speech recognition, convolutional neural network, DenseNet, Hadoop, parallel training

Related items

1	Research On Speech Enhancement Algorithm Based On Full Convolutional Neural Network And DenseNet
2	The Algorithm Study For Continuous Speech Recognition Based On Convolutional Neural Network
3	Palmprint Recognition Method Based On Deep Convolutional Neural Network
4	Research On Dynamic Gesture Recognition Method Based On Convolutional Neural Network
5	Research On Acoustic Modeling For Speech Recognition Based On Deep Neural Networks
6	Research Of Deep Learning Based Low-resource Speech Recognition
7	Research On Speech Synthesis Vocoders Using Convolutional Neural Networks
8	Gesture Recognition Method Based On Convolutional Neural Network
9	Research On End-to-end Speech Recognition Based On Convolutional Neural Networks
10	End-to-End Speech Recognition Based On Convolutional Neural Network And Gated Recurrent Unit