Design Of Speech Recognition Algorithm For Human Computer Interaction In Machine Operation Environment

Posted on: 2022-10-25
Degree: Master
Type: Thesis
Country: China
Candidate: H C Luo
GTID: 2518306524487594
Subject: Master of Engineering
Abstract/Summary:
With the rapid development of artificial intelligence, factory production is becoming increasingly intelligent, and human-computer interaction is used ever more widely in the production process. Voice is an important mode of human-computer interaction. Speech recognition technology has gradually matured and can accurately recognize most voice instructions in noise-free scenes. A factory, however, is not quiet: speech is mixed with complex and changeable noise, which seriously interferes with human-computer interaction and reduces both speech recognition accuracy and production efficiency. Speech enhancement technology separates clean speech from noisy speech and improves the clarity and intelligibility of the target speech, which keeps human-computer interaction efficient. Most traditional speech enhancement and recognition methods are based on statistical estimation. They are simple and easy to implement, but they rest on narrow assumptions that do not reflect real conditions. Speech recognition and speech enhancement methods based on deep learning need no assumptions about the speech; they directly learn the mapping between input and output, and after sufficient training they can infer the target sequence well from speech features.

In this thesis, a mechanical arm used for human-computer interaction in a noisy environment is taken as the object of study. Speech enhancement and speech recognition algorithms are analyzed and studied, and a mechanical arm assembly system based on speech recognition is designed to realize voice interaction. The main work and innovations are as follows:

(1) A speech enhancement model based on a Fourier gated convolutional neural network is proposed. A temporal convolution module, modules similar to the Fourier transform and inverse Fourier transform, and a gated convolution unit are used to extract features of the noisy speech in the frequency domain and to model its sequential structure in the time domain. Cleaner speech is obtained, and the mixed speech is effectively denoised in both the time domain and the frequency domain.

(2) A speech recognition model is proposed that combines a feed-forward sequential memory network with gating units and residual connections and a Transformer with an interactive algorithm. The acoustic model adopts the feed-forward sequential memory network structure with residual connections and gating units, which makes full use of the context information of speech frames to obtain the target phonemes or Pinyin. The language model adopts a Transformer model with an interactive algorithm, which improves inference from phonemes to text.

(3) The speech enhancement model and the speech recognition model are applied to the manipulator to realize speech recognition interaction in its operating environment, so that voice commands direct it to perform the corresponding actions, such as grabbing parts, changing orientation, and putting down parts.
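As an illustration of the gated convolution unit mentioned in (1), the following is a minimal sketch only. The abstract does not specify the unit's exact form, so this assumes a common GLU-style gate, in which one convolution produces features and a second convolution, passed through a sigmoid, decides how much of each feature to keep. The function names, tensor shapes, and kernel size are hypothetical and are not taken from the thesis.

```python
import numpy as np


def conv1d_same(x, w, b):
    """1-D convolution over time with zero padding, output length equals input length.

    x: (T, C_in) frames, w: (K, C_in, C_out) kernel, b: (C_out,) bias.
    """
    k = w.shape[0]
    pad_left = (k - 1) // 2
    pad_right = k - 1 - pad_left
    xp = np.pad(x, ((pad_left, pad_right), (0, 0)))
    out = np.empty((x.shape[0], w.shape[2]))
    for t in range(x.shape[0]):
        window = xp[t:t + k]  # (K, C_in) slice of frames around step t
        out[t] = np.tensordot(window, w, axes=([0, 1], [0, 1])) + b
    return out


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def gated_conv1d(x, w_feat, b_feat, w_gate, b_gate):
    """Assumed GLU-style gate: features scaled element-wise by a sigmoid gate in (0, 1)."""
    features = conv1d_same(x, w_feat, b_feat)
    gate = sigmoid(conv1d_same(x, w_gate, b_gate))
    return features * gate


# Hypothetical sizes: 50 frames, 8 input channels, 16 output channels, kernel width 3.
rng = np.random.default_rng(0)
x = rng.standard_normal((50, 8))
w_feat = rng.standard_normal((3, 8, 16))
b_feat = np.zeros(16)
w_gate = rng.standard_normal((3, 8, 16))
b_gate = np.zeros(16)
y = gated_conv1d(x, w_feat, b_feat, w_gate, b_gate)
```

In this sketch the gate can only attenuate a feature, never amplify it, which is one plausible way for such a unit to suppress noise-dominated components while passing speech-dominated ones.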
Keywords/Search Tags: Human-computer interaction, Speech recognition, Speech enhancement, Denoising, Transformer, Manipulator