
Research On Semi-supervised Learning-based Automatic Speech Annotation

Posted on: 2020-04-27
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Liu
GTID: 2428330575474155
Subject: Software engineering
Abstract/Summary:
As a mature application field of artificial intelligence, speech recognition has been pursued by major vendors and research institutions. From the fierce competition among smart speakers to the steady stream of voice interaction applications, it can be said that speech recognition is ahead of other areas of artificial intelligence in both commercialization and technology. However, the development of intelligent speech depends on a large amount of voice training data, and acquiring that data raises two problems. First, source data is difficult to obtain, so the amount available is often insufficient for training. Second, the source data is of low quality and requires a great deal of manual labeling and screening. The need for manually labeled, high-quality samples raises the barrier to entry for speech recognition and severely restricts the development of speech artificial intelligence in small-sample domains. Compared with traditional fully supervised training and manual labeling, an automatic annotation method based on semi-supervised learning therefore has clear advantages in small-sample domains and in labeling cost, and it is worth an innovative attempt.

This thesis focuses on the classification and annotation of speech data in small-sample professional fields. Its innovation is to apply semi-supervised learning methods and ideas to the speech annotation model, and then to apply an on-demand weighting method, driven by actual business needs, in the subsequent classification model. The on-demand weighting method greatly improves the efficiency of voice keyword classification. The research comprises two models: a semi-supervised speech keyword annotation model and an on-demand weighted decision tree classification optimization model. In the first step, the traditional conditional random field model is optimized by applying the co-training technique from semi-supervised learning. An initial classifier is trained on a small, high-quality annotated set of original speech samples. Then, following the semi-supervised approach, the high-confidence intermediate results produced by each new round of training are added to the initial label set until the label classification result converges, which yields the keyword label data. In the second step, the on-demand weighted decision tree model identifies keywords belonging to the relevant label field (such as medical, traffic, or sanitation) and weights them when selecting the root nodes to split on, which completes the classification and labeling of the sample audio.

Multi-round experiments show that, compared with the traditional supervised approach, the method obtains better accuracy from small-sample training data, which makes it more practical for companies or fields where training data is difficult to obtain. Compared with traditional manual labeling, the semi-supervised model is far less accurate, but because it saves considerable cost, at the current stage it can be used for preliminary labeling before manual labeling.
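The iterative labeling step described in the abstract (train on a small labeled set, add high-confidence predictions to that set, repeat until nothing changes) can be illustrated with a minimal sketch. This is not the thesis's implementation: a nearest-centroid classifier stands in for the conditional random field, plain self-training with a single classifier stands in for co-training, and the function names, the `threshold` value, and the distance-based confidence score are all assumptions made for illustration. The sketch also assumes at least two classes in the labeled set.

```python
from math import dist


def centroids(labeled):
    """Compute the mean feature vector for each label."""
    sums, counts = {}, {}
    for x, y in labeled:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(cents, x):
    """Return (label, confidence) for x; needs at least two classes.

    Confidence is an illustrative score in [0.5, 1]: it is 1 when x sits
    on the nearest centroid and 0.5 when the two nearest are equidistant.
    """
    ranked = sorted((dist(x, c), y) for y, c in cents.items())
    (d1, y1), (d2, _) = ranked[0], ranked[1]
    conf = 1.0 if d1 + d2 == 0 else 1.0 - d1 / (d1 + d2)
    return y1, conf


def self_train(labeled, unlabeled, threshold=0.8, max_rounds=10):
    """Grow the labeled set with high-confidence predictions.

    Stops when a round adds no new samples (treated here as convergence)
    or after max_rounds. Returns (model, labeled set, leftover pool).
    """
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids(labeled)
        confident, rest = [], []
        for x in pool:
            y, conf = predict(cents, x)
            (confident if conf >= threshold else rest).append((x, y))
        if not confident:
            break  # nothing new was labeled, so the label set is stable
        labeled.extend(confident)
        pool = [x for x, _ in rest]
    return centroids(labeled), labeled, pool
```

Samples that never reach the confidence threshold stay in the returned pool, which fits the abstract's suggestion that automatic annotation serve as a preliminary pass before manual labeling.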
Keywords/Search Tags: Semi-supervised learning, Voice annotation, Weighting on demand, Conditional random field