Research Of Speech Recognition Model Based On Convolutional Neural Network And Its Training Optimization

Posted on: 2022-07-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Wang
Full Text: PDF
GTID: 2518306575464004
Subject: Industrial Engineering
Abstract/Summary:
In recent years, with the development of robotics and artificial intelligence, speech recognition technology has come into wide use. Many researchers in China and abroad have applied convolutional neural networks (CNNs) to speech recognition and achieved good results. However, CNN-based speech recognition models still suffer from large parameter counts, low recognition efficiency, and difficult training. While maintaining the recognition rate, this thesis improves the recognition speed of the model and the convergence speed of its training by improving the acoustic model and optimizing the training method. The improved algorithm is also validated on a robot platform.

First, to address the disappearance of low-level features, the large parameter count, and the low recognition efficiency caused by deepening the network in CNN acoustic models, an improved asymmetric densely connected convolutional neural network (IV3-DenseNet) acoustic model is proposed. The model uses Dense Blocks to establish connections among different layers, which preserves low-level features and enhances feature propagation, and the scope of the convolution kernels is expanded and integrated. In addition, asymmetric convolution is used to decompose the convolution kernels and reduce the number of parameters. On the THCHS30 dataset, the ASR performance of the model is 2.76% higher than that of the classical deep residual CNN model. Compared with the classical DenseNet model, the proposed model further reduces the number of network parameters and improves recognition speed.

Second, to ease the difficulty of training the CNN speech recognition model, a batch back propagation (BP) parallel training method based on Hadoop is proposed to accelerate training. In a fully distributed environment, the method uses the MapReduce parallel computing framework to divide the training data into several small datasets that serve as the input of several sub-nodes, and uses the batch BP algorithm to update the CNN model parameters. The Reduce phase is also improved in this thesis: the trimmed mean of the local parameter solutions from the sub-nodes is taken as the global parameter solution on the primary node, and the next iteration is determined by testing. The experimental results show that the improved parallel training method improves the recognition rate by 4.35 times compared with the serial training method, and improves both the speedup and the recognition rate compared with the standard BP parallel training method.

Finally, the batch BP parallel training method proposed in this thesis is used to train the IV3-DenseNet acoustic model, a speech recognition system is built, and the system is validated on a sweeper robot platform by testing the recognition rate with which the robot completes the corresponding voice instructions. The experimental results demonstrate the feasibility of the proposed model.
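
The improved Reduce phase described above, in which the primary node takes a trimmed mean of the sub-nodes' local parameter solutions, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function name `trimmed_mean_reduce`, the `trim` parameter, the coordinate-wise trimming, and the sample values are all assumptions made for illustration, since the abstract does not state how many values are trimmed or how the parameters are laid out.

```python
from typing import List, Sequence


def trimmed_mean_reduce(local_params: Sequence[Sequence[float]],
                        trim: int = 1) -> List[float]:
    """Combine per-node parameter vectors with a coordinate-wise trimmed mean.

    local_params: one flat parameter vector per sub-node (all the same length).
    trim: number of smallest and largest values dropped per coordinate
          (hypothetical knob; the abstract does not give the trimming amount).
    """
    n_nodes = len(local_params)
    if n_nodes == 0:
        raise ValueError("need at least one local parameter vector")
    if trim < 0 or 2 * trim >= n_nodes:
        raise ValueError("trim must leave at least one value per coordinate")
    dim = len(local_params[0])
    if any(len(p) != dim for p in local_params):
        raise ValueError("all parameter vectors must have the same length")

    global_params = []
    for j in range(dim):
        # Sort the j-th parameter across nodes and drop the extremes.
        column = sorted(p[j] for p in local_params)
        kept = column[trim:n_nodes - trim]
        global_params.append(sum(kept) / len(kept))
    return global_params


if __name__ == "__main__":
    # Four hypothetical sub-nodes; the last one returned an outlying solution.
    local = [
        [1.0, 2.0],
        [2.0, 4.0],
        [3.0, 6.0],
        [100.0, -50.0],
    ]
    print(trimmed_mean_reduce(local, trim=1))  # [2.5, 3.0]
```

With `trim=0` the function reduces to a plain average, so the sketch also shows why trimming might help: a single sub-node with an outlying local solution no longer pulls the global parameters toward it.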
Keywords/Search Tags: speech recognition, convolutional neural network, DenseNet, Hadoop, parallel training