Research On Key Technology Of Speech Recognition Software

Posted on:2018-09-19

Degree:Master

Type:Thesis

Country:China

Candidate:J S Zhang

Full Text:PDF

GTID:2348330542467848

Subject:Engineering

Abstract/Summary:

Speech recognition, taken literally, means enabling a computer or other machine to hear what people say and act on it. In essence it is a pattern-matching problem, and its fundamental aim is to build devices with an auditory function. Given simple voice input, a successful device can understand the speaker's intention and respond appropriately to the command. Speech recognition is a highly complex interdisciplinary field, drawing on linguistics, acoustics, computer science, physiology, digital signal processing, and more. Based on an analysis of the principles of open-source speech recognition, speech signal analysis, and the dynamic time warping (DTW) algorithm, this thesis builds a speech recognition system with simple functionality that performs pattern matching for specific speakers (speaker-dependent recognition).

The main contents are as follows. First, the basic concepts and algorithms of speech recognition and some of the system's processing modules are introduced, the history of speech recognition research in China and abroad is summarized, and the research background and significance of this work are explained. Second, the thesis analyzes the characteristics, structure, and types of speech recognition systems and the problems encountered in applying them, and introduces and compares several widely used algorithms. Third, in line with how the speech recognition system itself reads its input, audio formats are introduced, with the storage principle and file layout of WAV audio files covered in detail. The next part explains the speech signal production model, digitization, and processing, and extracts and analyzes the characteristic parameters of the audio signal in the time and frequency domains. The DTW algorithm and its optimizations are also described in detail. Finally, building on this work, a series of programming tests is carried out and the results are saved. Each test adds new features on top of the previous one. Although some test results were not as expected, they may suggest new perspectives for developing other feasible functions. Further work, such as clustering-based recognition of accents and dialects, the addition of a graphical user interface (GUI), and embedded applications, could be placed on the agenda next.

DTW-based speech recognition has a clear advantage in endpoint detection: it can quickly locate the effective speech segment and improve both recognition accuracy and recognition speed. In this thesis, speech preprocessing, endpoint detection, feature parameter extraction, model training, model matching, and recognition are simulated in the MATLAB environment. To verify that the recognition function behaves reasonably, a number of recognition tests are run against multiple self-built sets of sound templates. Finally, the outlook for speech recognition is discussed.
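
The template-matching step described in the abstract can be illustrated with a minimal sketch of textbook DTW. This is not the thesis's MATLAB code: the language (Python), the function names `dtw_distance` and `recognize`, the absolute-difference local cost, and the toy scalar "features" are all assumptions made for illustration. A real system would compare frames of extracted feature parameters rather than single numbers.

```python
from math import inf


def dtw_distance(a, b, dist=None):
    """Accumulated DTW cost between two non-empty sequences.

    `dist` is the local cost between two frames; the default
    (absolute difference of scalars) is an illustrative assumption.
    """
    if dist is None:
        dist = lambda x, y: abs(x - y)
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            # Allowed steps: advance in a only, in b only, or in both.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


def recognize(sample, templates):
    """Return the label of the template closest to `sample` under DTW.

    `templates` is a hypothetical dict mapping a label to a reference
    sequence, standing in for a self-built set of sound templates.
    """
    return min(templates,
               key=lambda label: dtw_distance(sample, templates[label]))


if __name__ == "__main__":
    templates = {"yes": [1, 2, 3, 2, 1], "no": [5, 5, 5, 5]}
    # The sample is a time-stretched version of the "yes" template.
    print(recognize([1, 2, 2, 3, 2, 1], templates))
```

Because DTW lets one frame align with several frames of the other sequence, a slower or faster utterance of the same word can still match its template at low cost. That is why it suits speaker-dependent, template-based recognition of the kind described here.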

Keywords/Search Tags:

Speech Recognition, DTW Algorithm, Audio Processing

Related items

1	Research On Two Typical Speech Processing Applications Based On Deep Learning
2	Key Technology Research On Audio Information Hiding And Information Security Application For Speech Recognition
3	Research And Implementation Of Audio Quality Evaluation And Speech Recognition Preprocessing Technology
4	Speech Endpoint Detection Based On Audio And Visual Features
5	Robust speech processing based on microphone array, audio-visual, and frame selection for in-vehicle speech recognition and in-set speaker recognition
6	Bimodal Speech Recognition Technology Research Based On Audio And Video
7	Research On Noise Treatment Of Speech Recognition With Lip-movement Information
8	External/internal data fusion testbed: History, components, and experimental analysis (speech processing, audio processing functions)
9	Research On Audio-Video Information Processing Based On Lip-Changing
10	Audio-Visual Speech Recognition And Its Applications