
Application Research Of Small-scale Vocabulary Speech Recognition In Coal Shipment System

Posted on: 2021-01-09
Degree: Master
Type: Thesis
Country: China
Candidate: J J Chang
Full Text: PDF
GTID: 2392330614471750
Subject: Electronic and communication engineering
Abstract/Summary:
During coal loading, the position of the large shovel (the coal exit) must be adjusted promptly to avoid accumulation. In the current workflow, Observer A watches the coal accumulating at the bottom of the cabin and reports the situation in advance to Operator B by walkie-talkie. B responds from the control room and adjusts the position of the large shovel. This method has two shortcomings: (1) the work of A and B is not well coordinated, so adjustments are sometimes inaccurate or late; (2) it wastes the enterprise's human resources.

To solve these problems, this study designs a dedicated small-scale vocabulary speech recognition system to replace the work of B. In other words, after A sends an adjustment instruction to the control room, the system recognizes the voice instruction in time and directs the movement of the shovel. The core task is to accurately identify the voice commands used by A. Drawing on continuous speech recognition methods, this study investigates the theory, methods and applications of small-scale vocabulary speech recognition and applies them to this problem. The main work is summarized as follows:

(1) A Bidirectional Long Short-Term Memory (BLSTM) network model is built. Based on the advantages of BLSTM in processing time-series data, a five-layer deep learning network composed of fully connected layers and BLSTM layers is constructed.

(2) A dedicated voice command corpus is collected and preprocessed. In this application, the system does not need to recognize large-scale continuous speech; instead, a small number of designated words, phrases, fixed collocations and similar utterances are used at high frequency. The study therefore collects data by category (a basic category, a keyword category, and a category of sentences containing keywords), with each category corresponding to a different training stage of the model. In the preprocessing step, Mel-frequency cepstral coefficients (MFCC) are extracted from the speech, converting the speech data into a feature matrix.

(3) The training and prediction methods are optimized. For training, in line with the custom data set, model training is divided into three stages: the basic model stage, the keyword training stage and the final model stage. For prediction, connectionist temporal classification (CTC) decoding based on the greedy algorithm is compared with decoding based on the beam search algorithm. Experimental results show that the greedy algorithm takes less time than beam search, but its accuracy is slightly lower. On balance, the beam search algorithm is selected. After the comparison and optimization of these stages, in the experimental environment of this study the model reaches an accuracy of 98% for the specific person (the speaker it was trained on) and 94% for non-specific persons, meeting the application requirements.

(4) A graphical user interface (GUI) program is designed to call the model and perform recognition in real time. Graphics serve as the GUI front end to simulate the coal exit, and the trained model serves as the GUI back end for computation and recognition. The program collects a user's voice in real time and controls the graphics to perform the corresponding operations, which demonstrates both the application scenario and how the model is used. In addition, to support further development and extension of the application, methods and precautions for customizing the model and the corpus are given, improving the portability of the application.
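The trade-off between greedy and beam search CTC decoding described in item (3) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the choice of index 0 as the blank symbol, and the toy probabilities are all assumptions made for illustration. Greedy decoding keeps only the single most likely symbol per frame, whereas a prefix beam search sums probability over every alignment that collapses to the same label sequence, which costs more time but can recover an output that greedy decoding misses.

```python
from collections import defaultdict

BLANK = 0  # assumption: index 0 is the CTC blank symbol


def ctc_greedy_decode(probs):
    """Take the best symbol per frame, collapse repeats, then drop blanks."""
    best_path = [max(range(len(frame)), key=frame.__getitem__) for frame in probs]
    decoded = []
    previous = None
    for symbol in best_path:
        if symbol != previous and symbol != BLANK:
            decoded.append(symbol)
        previous = symbol
    return tuple(decoded)


def ctc_beam_search_decode(probs, beam_width=4):
    """Prefix beam search: sum probability over all alignments of a prefix."""
    # Each prefix maps to (prob. of ending in blank, prob. of ending in non-blank).
    beams = {(): (1.0, 0.0)}
    for frame in probs:
        next_beams = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_blank, p_non_blank) in beams.items():
            total = p_blank + p_non_blank
            for symbol, p in enumerate(frame):
                if symbol == BLANK:
                    # A blank leaves the prefix unchanged.
                    next_beams[prefix][0] += total * p
                elif prefix and symbol == prefix[-1]:
                    # Repeat with no blank in between: same prefix.
                    next_beams[prefix][1] += p_non_blank * p
                    # Repeat after a blank: the prefix grows by one symbol.
                    next_beams[prefix + (symbol,)][1] += p_blank * p
                else:
                    next_beams[prefix + (symbol,)][1] += total * p
        ranked = sorted(next_beams.items(), key=lambda kv: sum(kv[1]), reverse=True)
        beams = {prefix: tuple(scores) for prefix, scores in ranked[:beam_width]}
    return max(beams.items(), key=lambda kv: sum(kv[1]))[0]


# Toy input (hypothetical): 2 frames, vocabulary {0: blank, 1: one command label}.
toy_probs = [[0.6, 0.4], [0.6, 0.4]]
print(ctc_greedy_decode(toy_probs))
print(ctc_beam_search_decode(toy_probs))
```

In this toy input the blank is the most likely symbol in both frames, so greedy decoding yields an empty sequence with path probability 0.36. The label sequence containing symbol 1 is reached by three alignments with total probability 0.24 + 0.24 + 0.16 = 0.64, so the beam search returns it instead.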
Keywords/Search Tags: Speech Recognition, RNN, GUI, Informatization