
Design And Implementation Of Speech Transmission And Recognition System Based On Raspberry Pi

Posted on: 2020-11-26
Degree: Master
Type: Thesis
Country: China
Candidate: X H Qin
GTID: 2428330578973038
Subject: Control Engineering
Abstract/Summary:
Automatic speech recognition (ASR), which aims to enable human-computer interaction through natural language, has been a research hotspot in recent decades. Before 2000, many core technologies related to speech recognition emerged, such as Mel-frequency cepstral coefficients (MFCC), the Gaussian mixture model (GMM), and the hidden Markov model (HMM). These theories and technologies created a good opportunity for the development of ASR. In the second decade of the 21st century, with the popularity of mobile terminals, ASR research reached a peak, and various new technologies and models were proposed and applied in practice.

To reduce the cost of the system and the size of the terminal, and to facilitate portability, installation, and remote configuration, this paper designs a voice transmission and recognition system based on Raspberry Pi that collects, transmits, and recognizes the broadcast voice in a railway station. The system avoids the shortcomings of traditional voice acquisition and transmission equipment, such as large volume, high cost, and heavy workload. It also allows terminals to be connected remotely and the system configuration to be modified more flexibly and conveniently.

The system consists of hardware and software. The hardware mainly implements voice acquisition. The software is divided into two modules. The first is the forwarding service module, developed in Python, which forwards the audio stream collected by the hardware to the recognition module and other related modules. The second is the recognition module, which receives, recognizes, and stores audio data. It uses an RNN + CTC model structure (recurrent neural network, RNN; connectionist temporal classification, CTC) and is developed in Python using TensorFlow and other packages.

The requirement of the system is real-time transmission, storage, and speech recognition of multi-point audio data. First, the overall design scheme of the system is proposed. Then, the working principles and related algorithms of each module of the voice transmission system and the ASR system are elaborated in detail.
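
The abstract names an RNN + CTC recognition model but does not say how the network's per-frame outputs are turned into a label sequence. As an illustration only, the following minimal sketch shows greedy (best-path) CTC decoding, one common option: take the most likely label in each frame, merge consecutive repeats, and drop the blank symbol. The function name `ctc_greedy_decode`, the blank index `0`, and the toy scores are assumptions made for this example and are not taken from the thesis.

```python
from typing import List, Optional, Sequence

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical choice)


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]], blank: int = BLANK
) -> List[int]:
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    decoded: List[int] = []
    previous: Optional[int] = None
    for scores in frame_scores:
        # Pick the most likely label for this frame.
        label = max(range(len(scores)), key=scores.__getitem__)
        # Emit a label only if it differs from the previous frame's label
        # and is not the blank symbol.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# Toy input: 3 labels (0 = blank, 1 = "a", 2 = "b") over 6 frames.
frames = [
    [0.10, 0.80, 0.10],  # a
    [0.10, 0.70, 0.20],  # a again (merged with the previous frame)
    [0.90, 0.05, 0.05],  # blank
    [0.20, 0.70, 0.10],  # a (kept: a blank separates it from the first a)
    [0.10, 0.20, 0.70],  # b
    [0.80, 0.10, 0.10],  # blank
]
print(ctc_greedy_decode(frames))  # [1, 1, 2]
```

The blank frame between the two runs of label 1 is what lets CTC represent a repeated symbol. Without it, the two runs would merge into a single output label.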
Keywords/Search Tags:Raspberry Pi, ASR, Voice transmission, RNN, CTC