The Extension and Selection of Training Samples for Speech Keyword Recognition and the Implementation of the System

Posted on: 2021-05-12  Degree: Master  Type: Thesis
Country: China  Candidate: X Wang  Full Text: PDF
GTID: 2428330611965961  Subject: Communication and Information System
Abstract/Summary:
Recordings of conversations between customer service agents and customers in the call centers of financial companies are a valuable data resource. Accurately extracting keywords from this speech data makes it possible to capture customer opinions, potential needs, and the quality of customer service. Traditional manual quality inspection suffers from a low sampling rate and a high time cost per sample. This thesis takes call-center dialogue between agents and customers as its object of analysis, studies speech keyword recognition in the financial domain, and proposes effective solutions. The main work and contributions are as follows:

(1) To address insufficient training samples for individual keywords and gender imbalance among the speakers in the training samples, a training sample expansion method based on voice conversion is proposed. The method expands the speaker diversity of the training samples mainly by modifying the speech spectrum, thereby improving the robustness of recognition. Experiments on 10 keywords from the AISHELL data set show that the method improves the detection rate by 3.44%-5.95% when training samples are insufficient, and by 0.81%-3.08% when speaker gender is imbalanced.

(2) Once a large number of training samples has been obtained with the expansion method of (1), a training sample screening method based on an improved silhouette coefficient is proposed to further improve classifier performance. The method models the original speech with a UBM-GMM and uses the improved silhouette coefficient to screen the expanded training samples. Experiments on 10 keywords from the AISHELL data set show that, compared with randomly selected training samples, training on samples screened by this method raises the keyword detection rate of the model by 0.26%-2.51%.

(3) Because the system requires extremely high accuracy, this thesis adopts a keyword recognition method based on RNN-CTC. The method does not require frame-level labeling or alignment of samples, which simplifies engineering implementation, and RNNs are a preferred network type for time-series data that can achieve very high accuracy. Experiments on the AISHELL data set and a Hakka dialect data set show that, compared with the DNN method, recognition accuracy improves by 11.70%-12.72%.

In summary, this thesis proposes a training sample expansion method based on voice conversion and a training sample screening method, and designs an RNN-CTC keyword recognition model that is applied in a quality inspection system for financial customer service. Experiments analyze the performance of the proposed methods and the designed system and verify their effectiveness.
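The screening step in (2) can be illustrated with a minimal sketch. The abstract does not give the details of the improved coefficient or of the UBM-GMM representation, so the code below makes several assumptions: the coefficient is taken to be the standard silhouette coefficient, each sample is a hypothetical fixed-length feature vector, distance is Euclidean, and augmented samples are kept only when their score reaches a threshold. The function names and the threshold are illustrative, not taken from the thesis.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def silhouette_scores(features, labels):
    """Standard silhouette coefficient s(i) = (b - a) / max(a, b) per sample.

    a: mean distance to the other samples with the same label.
    b: smallest mean distance to the samples of any other label.
    """
    n = len(features)
    scores = []
    for i in range(n):
        # Distances from sample i to every other sample, grouped by label.
        dists = {}
        for j in range(n):
            if j != i:
                d = euclidean(features[i], features[j])
                dists.setdefault(labels[j], []).append(d)
        own = dists.pop(labels[i], [])
        if not own or not dists:
            # Convention: score 0 for a singleton class or a single-class set.
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in dists.values())
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return scores


def screen_samples(features, labels, is_augmented, threshold=0.0):
    """Keep every original sample; keep an augmented one only if s(i) >= threshold."""
    scores = silhouette_scores(features, labels)
    return [i for i, s in enumerate(scores)
            if not is_augmented[i] or s >= threshold]


# Toy data (hypothetical): two keywords, four original and two augmented samples.
features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [4.9, 5.0], [0.2, 0.1]]
labels = ["kw_a", "kw_a", "kw_b", "kw_b", "kw_a", "kw_a"]
is_augmented = [False, False, False, False, True, True]

kept = screen_samples(features, labels, is_augmented)
print(kept)  # [0, 1, 2, 3, 5]
```

In this toy data, the augmented sample at index 4 is labeled `kw_a` but lies inside the `kw_b` cluster, so its silhouette score is negative and it is dropped. The augmented sample at index 5 sits close to the other `kw_a` samples and is kept.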
Keywords/Search Tags: Speech keyword recognition, Voice conversion, Training sample expansion, Sample screening
Related items