
Research And Implementation Of Service Speech Template Extraction And Dialogue-based Quality-of-service Evaluation In The Restaurant Scene

Posted on: 2021-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liu
Full Text: PDF
GTID: 2428330626458934
Subject: Software engineering
Abstract/Summary:
In recent years, the management of service industries such as restaurants has become increasingly standardized, and managers are paying growing attention to the intelligent management of waiters' service quality and service processes. Establishing a standardized process for restaurant service dialogue can improve both the service level and customer satisfaction. Because the quality of waiters' speech plays a vital role in service, managers have proposed quantifying it: waiters' speech is collected, monitored, and recognized in real time, their performance is scored according to the monitoring results, and the quality of their service is ultimately tracked and evaluated.

In a conventional environment, free of noise and similar constraints, existing speech recognition models can achieve high word accuracy. A restaurant, however, is a noisy environment with many uncontrollable factors. The noise comes not only from speakers' voices but also from tableware and from the background music played by the restaurant. These factors significantly affect how accurately waiters' conversations with customers can be converted into text. When the waiter and the customer are in a vertical (domain-specific) dialogue scenario, speech recognition technology cannot identify the content of their conversation accurately and effectively, so existing solutions fail to achieve good recognition results in a restaurant scene. Low recognition rates in noisy scenes remain the main problem facing current speech recognition tasks. In high-noise restaurants in particular, general intelligent template extraction methods cannot effectively improve recognition accuracy.

To address service dialogue recognition in the high-noise restaurant scene, this paper proposes an intelligent template extraction and waiter speech quality evaluation model for restaurant service dialogue. The work comprises two major modules.

First, this paper designs and optimizes the service dialogue intelligent template extraction module. (1) It proposes a data augmentation method for restaurant-specific scenarios, which collects restaurant noise data, compares different signal-to-noise ratios, and augments the audio data in three ways: time shifting, speed adjustment, and mixing in white noise. The recognition rate of service dialogue intelligent template extraction is compared experimentally before and after augmentation. (2) It builds an intelligent template extraction model. The template modeling as a whole uses the Hidden Markov Model-Deep Neural Network (HMM-DNN) acoustic model training method. For the acoustic model network structure, two structures are proposed: Time Delay Neural Network-Recurrent Neural Network (TDNN-RNN) and Time Delay Neural Network Recurrent-Long Short Term Memory (TDNNR-LSTM). Their training results are compared with the traditional Time Delay Neural Network (TDNN) acoustic model structure. Experiments show that acoustic model training based on the proposed TDNN-RNN and TDNNR-LSTM structures reaches a best word accuracy of 91% in the restaurant noise scene, which is better than the traditional TDNN structure.

Second, this paper investigates and implements the service speech quality evaluation module. (1) It proposes weight transfer learning from a chat dialogue acoustic model. The method takes a model pre-trained on a chat dialogue corpus and transfers the acoustic model weights, addressing the low speech recognition accuracy caused by insufficient waiter-customer dialogue data in noisy environments. Experiments show that the word error rate of speech quality recognition differs by about 3% before and after weight transfer, so speech quality recognition is improved to a certain extent. (2) It proposes a keyword retrieval model based on weighted finite-state transducers (WFST): a keyword index is generated, a keyword retrieval framework is built on a finite state machine, and reverse lookup over the index is used to address the poor evaluation of service keywords under restaurant noise.

In summary, this paper presents a service dialogue intelligent template extraction and speech quality evaluation model designed to quantify and evaluate the service quality of restaurant waiters. The model addresses the low recognition rate of waiter-customer dialogue in high-noise restaurant scenes, raises the level of intelligent restaurant management, and helps managers quantify and evaluate waiters' service quality. It also addresses the poor speech evaluation quality caused by mismatches in the precise detection of speech keywords.
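
The three augmentation operations named in the abstract (time shifting, speed adjustment, and mixing noise at a chosen signal-to-noise ratio) can be illustrated with a minimal pure-Python sketch. This is not the thesis's implementation: the function names, the circular shift, the linear-interpolation resampling, and the Gaussian white noise generator are all assumptions made for illustration, and real audio pipelines would typically operate on arrays loaded from recorded restaurant noise.

```python
import math
import random


def mean_power(signal):
    """Average power (mean of squared samples) of a sequence."""
    return sum(x * x for x in signal) / len(signal)


def time_shift(signal, shift):
    """Circularly shift samples by `shift` positions (positive = delay).

    Assumption: a wrap-around shift; padding with silence is another option.
    """
    signal = list(signal)
    shift %= len(signal)
    return signal[-shift:] + signal[:-shift] if shift else signal


def change_speed(signal, rate):
    """Resample by linear interpolation; rate > 1 shortens (speeds up) the audio."""
    n_out = max(1, int(len(signal) / rate))
    last = len(signal) - 1
    out = []
    for i in range(n_out):
        pos = i * rate
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append((1 - frac) * signal[lo] + frac * signal[hi])
    return out


def white_noise(n, seed=0):
    """Hypothetical helper: n samples of zero-mean Gaussian white noise."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so that speech power / noise power = snr_db (in dB)."""
    # Repeat the noise clip so it covers the whole utterance.
    tiled = [noise[i % len(noise)] for i in range(len(speech))]
    target_noise_power = mean_power(speech) / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / mean_power(tiled))
    return [s + scale * n for s, n in zip(speech, tiled)]
```

For example, `mix_at_snr(change_speed(time_shift(utterance, 160), 1.1), white_noise(8000), 10.0)` would chain the three operations on a hypothetical list of samples `utterance`; sweeping `snr_db` over several values mirrors the comparison of signal-to-noise ratios described above.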
Keywords/Search Tags: Data Augmentation, Time Delay Neural Network, Acoustic Model, Weight Migration, Keyword Search