Design And Implementation Of Embedded Speech Recognition System Based On Deep Learning

Posted on:2022-09-26

Degree:Master

Type:Thesis

Country:China

Candidate:X Zhuang

GTID:2518306524480264

Subject:Computer Science and Technology

Abstract/Summary:

With the development of artificial intelligence technology, speech is not only a medium of human communication but also a means of human-computer interaction. With the rapid progress of speech recognition technology, it is gradually being applied in a variety of fields. Deep learning has brought a qualitative leap in speech recognition accuracy, but increasingly complex network models are difficult to deploy on embedded devices. Moreover, real speech scenes always contain noise, such as environmental noise, equipment noise, and engine noise, which degrades recognition performance. How to compress a model so that it suits embedded devices while preserving recognition accuracy has therefore become a problem studied by many researchers. This thesis designs a lightweight, end-to-end Chinese speech recognition model based on deep learning and ports it to embedded devices for testing. The specific work is as follows:

1. To address the scarcity of open source speech datasets and the presence of noise in real speech environments, this thesis collects and organizes Chinese open source datasets into Large-Dataset and designs a noise suppression algorithm that integrates deep learning methods into traditional signal processing methods. The algorithm reduces the character error rate by 1.48% when tested on noisy datasets.

2. To address the generally large size of speech recognition models, this thesis investigates an end-to-end speech recognition scheme built on a convolutional backbone network. It uses GCN to handle long-distance dependencies, designs a fully convolutional lightweight neural network, and uses CTC for automatic alignment, which resolves the unequal lengths of the input and output sequences.

3. To address the extremely unbalanced distribution of Chinese character samples, this thesis combines the idea of Focal Loss with CTC Loss so that the loss pays different amounts of attention to Chinese character samples with different distributions. This reduces the impact of unbalanced samples on recognition accuracy and lowers the character error rate by 0.85%.

4. To address the small memory and limited computational power of embedded environments, this thesis uses 8-bit weight quantization to compress the model to nearly one-fourth of its original size. In addition, a shift quantization acceleration scheme is designed: a suitable codebook is used to optimize the 8-bit quantized weights so that a large number of convolutional multiplications are converted into shift-and-add operations. This increases the inference speed of the model on the embedded system by 40%, at the cost of a 0.6% increase in character error rate.
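
The abstract does not give the quantizer or the codebook used in item 4, so the following is only a minimal sketch of the general idea under stated assumptions: symmetric per-tensor 8-bit quantization, followed by a hypothetical power-of-two codebook {0, ±1, ±2, ..., ±128} that lets each multiplication be replaced by a bit shift. The function names `quantize_int8`, `to_shift_code`, and `shift_dot` are illustrative and do not come from the thesis. If the original weights are stored as 32-bit floats (an assumption), keeping 8 bits per weight is consistent with the reported compression to roughly one-fourth of the original size.

```python
# Illustrative sketch only: the thesis's actual quantizer and codebook
# are not described in the abstract. All names here are hypothetical.

def quantize_int8(weights):
    """Symmetric per-tensor 8-bit quantization: w ~= scale * q, q in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def to_shift_code(q):
    """Map an integer weight to (sign, exponent) so that |q| ~= 2**exponent.

    Assumed codebook: {0, +-1, +-2, +-4, ..., +-128}. Zero is encoded as (0, 0).
    On a tie between two powers of two, the smaller exponent is chosen.
    """
    if q == 0:
        return 0, 0
    sign = 1 if q > 0 else -1
    mag = abs(q)
    exp = min(range(8), key=lambda e: abs(mag - (1 << e)))
    return sign, exp

def shift_dot(activations, codes):
    """Dot product of integer activations with shift-coded weights,
    using only shifts, additions, and subtractions (no multiplications)."""
    acc = 0
    for a, (sign, exp) in zip(activations, codes):
        if sign > 0:
            acc += a << exp
        elif sign < 0:
            acc -= a << exp
    return acc

# Usage: quantize a few float weights, encode them, and run a shift-only dot product.
weights = [0.5, -0.25, 0.0, 1.0]
q, scale = quantize_int8(weights)
codes = [to_shift_code(v) for v in q]
result = shift_dot([3, -2, 5, 1], codes)
# Multiply result by scale (and by the activation scale, if any) to return to real units.
```

Rounding each weight to the nearest power of two is what trades accuracy for speed in this sketch; the abstract reports the corresponding trade-off for the thesis's own scheme as a 40% speedup for a 0.6% increase in character error rate.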

Keywords/Search Tags:

speech recognition, noise suppression, lightweight neural network, shift quantization, weight quantization

Related items

1	Research On New Method Of Robust Speech Recognition
2	Neural Network-Based Speech Keyword Recognition Algorithm And Circuit Design For Low Signal-To-Noise Ratio
3	Research And Design Of Lightweight Anti-noise Speech Recognition Algorithm
4	A Research On Signal Recognition Based On Lightweight Network
5	Research On Iris Recognition Method Based On Lightweight Neural Network
6	The Application Of SOFM And Direct Vector Quantization To LD-CELP Speech Coding Algorithm
7	Research On Quantization Methods Of Weight And Gate Parameters In LSTM Neural Network Model
8	Research On Convolutional Neural Network Lightweight Method Based On Dilated Convolution And Piecewise Quantization
9	Research On Binary Quantization Methods Of Deep Learning Models
10	Research On Vector Quantization In Speech Recognition