Research And Design Of Speech Recognition And Speech Synthesis In Human-Computer Interaction Of Intelligent Pump Station Platform

Posted on: 2022-01-05
Degree: Master
Type: Thesis
Country: China
Candidate: J W Yao
GTID: 2492306323979229
Subject: Control Science and Engineering
Abstract/Summary:
Human-computer interaction can reduce manual intervention and free the operator's hands, which makes it an indispensable part of an intelligent pump station platform. Although current voice response systems rely on relatively mature technologies and products, some problems remain to be resolved when applying them to an actual pump station platform. First, actual pump station control instructions are changeable and existing speech recognition models generalize poorly, so they cannot be used in an actual pump station. Second, existing Chinese speech synthesis has low inference efficiency and low quality. To solve these two problems, this dissertation researches and designs voice response technology for the intelligent pump station platform and realizes voice-based human-computer interaction for the platform. The main research contents are as follows:

(1) This dissertation records a data set of pump-station-specific control instructions and improves the data preprocessing method. When phonemes and text are misaligned, the quality of the synthesized speech drops. The traditional method for aligning phonemes and text is soft alignment, which is very prone to alignment errors, so the MFA (Montreal Forced Aligner) method is used to align phonemes and text instead. In addition, extracting the durations of words and phonemes alleviates the problem of skipped or repeated words in speech synthesis.

(2) This dissertation designs and implements an end-to-end speech recognition model consisting of an acoustic model and a language model. The acoustic model treats the speech signal jointly in space and time: a CNN extracts the spatial features of the speech signal, a BiLSTM models its temporal sequence, and CTC aligns the speech frames with the ground-truth transcription to obtain better recognition results. Transfer learning is also applied: the acoustic model trained on a public data set is fine-tuned, which improves the recognition rate for the pump station's specific control-instruction vocabulary using only a small amount of data.

(3) This dissertation designs and implements a Chinese speech synthesis model consisting of an acoustic model and a vocoder. Existing acoustic models for Chinese speech synthesis are prone to prosodic errors, missing words, and repeated words. This dissertation therefore designs the FastSpeech 2x acoustic model, introduces the variable information adapter to improve the prosody of the synthesized speech, and adds the limited attention mechanism PNCA to reduce skipping and backtracking during synthesis and to improve speech quality.

(4) A voice response system is built on the intelligent pump station platform. It implements four modules: speech acquisition, speech recognition, pump station control and status monitoring, and speech synthesis, meeting the needs of the system.
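As an illustration of the CTC step mentioned in item (2), the following is a minimal sketch of greedy CTC decoding: pick the best label for each frame, merge consecutive repeats, then drop the blank symbol. It is not the dissertation's implementation. The function name `ctc_greedy_decode`, the three-symbol vocabulary, the choice of index 0 as the blank, and the per-frame scores are all hypothetical values chosen for the example.

```python
from itertools import groupby

BLANK = 0  # assumption: index 0 is the CTC blank symbol


def ctc_greedy_decode(frame_scores, vocab, blank=BLANK):
    """Greedy (best-path) CTC decoding.

    frame_scores: one list of per-label scores for each speech frame.
    vocab: list mapping label index -> output token.
    """
    # 1. Take the highest-scoring label in every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Merge runs of identical consecutive labels into one label.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Remove blanks; a blank between two equal labels keeps them distinct.
    return [vocab[label] for label in collapsed if label != blank]


# Hypothetical vocabulary: blank plus two pinyin syllables.
vocab = ["<blank>", "kai", "beng"]

# Hypothetical scores for six frames (columns follow the order of `vocab`).
frame_scores = [
    [0.10, 0.80, 0.10],  # kai
    [0.20, 0.70, 0.10],  # kai (repeat, merged)
    [0.90, 0.05, 0.05],  # blank
    [0.10, 0.20, 0.70],  # beng
    [0.10, 0.10, 0.80],  # beng (repeat, merged)
    [0.60, 0.20, 0.20],  # blank
]

print(ctc_greedy_decode(frame_scores, vocab))  # prints ['kai', 'beng']
```

Greedy decoding is the simplest way to read out a CTC-trained acoustic model. A system that also uses a language model, as described in item (2), would typically combine the acoustic scores with language model scores rather than take the per-frame maximum alone.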
Keywords/Search Tags: Human-computer Interaction, Offline Voice Interaction, Offline Speech Recognition, Offline Speech Synthesis, Pump Station Control