Research On Punctuation Prediction Method For Speech Transcription

Posted on: 2022-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: N P Ye
Full Text: PDF
GTID: 2518306542463014
Subject: Computer technology
Abstract/Summary:
The main task of punctuation prediction for speech transcription is to add punctuation to the output of an automatic speech recognition (ASR) system, chiefly to improve the readability and intelligibility of the text. Punctuation prediction is a basic post-processing step in ASR and plays an indispensable role in many applications, such as intelligent speech conferencing, in-vehicle speech systems, and intelligent speech customer service. In recent years, methods based on text information have been applied successfully to punctuation prediction, and with the development of Chinese speech recognition technology, punctuation prediction for Chinese speech recognition has also received much attention. This thesis therefore focuses on text-based punctuation prediction. Its main contributions can be summarized as follows:

(1) In real ASR systems, transcription errors are common. However, deep learning methods assume that training and test data are independent and identically distributed (IID), so punctuation prediction models whose parameters are learned on standard clean training data perform poorly on noisy test data containing many transcription errors. In addition, manually restoring punctuation in speech transcripts to create training data is a large and time-consuming undertaking. To this end, on the public IWSLT dataset, we propose three data augmentation techniques that simulate erroneous test data from clean training data: random deletion, random substitution, and random homonym substitution. Extensive experiments on the IWSLT test set show that the proposed data augmentation method simulates noisy data effectively to a certain extent and achieves higher accuracy than advanced text-based punctuation prediction algorithms. This indicates that the proposed method is suitable for common English punctuation prediction tasks.

(2) In punctuation prediction for Chinese speech transcription, a sentence can take on different meanings depending on how it is segmented into words, because Chinese sentences have no explicit word boundaries; this leads to ambiguity and related problems. Meanwhile, the training data contain more non-punctuation labels than punctuation labels, so the labels are imbalanced and the trained model develops a "preference" for some labels. To address these problems, we present a punctuation prediction method based on multiple features and multi-task learning. Within a sequence labeling framework, character-level information, word type, and word boundary information are also taken into account to improve the model's ability to capture contextual information. Then, every label that marks a punctuation symbol is mapped to a single "punctuation" label, which turns the original task into a binary classification task that is trained alongside the punctuation prediction task. The two tasks learn from each other and alleviate the label imbalance problem. Experimental results on the People's Daily dataset show that the proposed multi-feature, multi-task model exchanges information between the two tasks and thus learns better features.
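
As a rough illustration of the two ideas above, the following Python sketch applies the three noise operations from (1) to a labeled token sequence and derives the binary auxiliary labels described in (2). It is not the thesis implementation: the function names, the label names ("O", "COMMA", "PERIOD"), the probabilities, the small HOMONYMS table, and the rule for what happens to a deleted token's punctuation label are all assumptions made for the example.

```python
import random

# Hypothetical table of similar-sounding English words (an assumption for
# this sketch; a real system would build it from a pronunciation lexicon).
HOMONYMS = {
    "their": ["there"],
    "there": ["their"],
    "to": ["too", "two"],
    "too": ["to", "two"],
    "two": ["to", "too"],
    "right": ["write"],
    "write": ["right"],
}


def augment(tokens, labels, vocab,
            p_delete=0.05, p_substitute=0.05, p_homonym=0.05, rng=None):
    """Return a noisy copy of (tokens, labels).

    labels[i] names the punctuation mark that follows tokens[i],
    or "O" when no mark follows it.
    """
    if rng is None:
        rng = random.Random()
    noisy_tokens, noisy_labels = [], []
    for token, label in zip(tokens, labels):
        r = rng.random()
        if r < p_delete:
            # Random deletion: drop the token. Assumption: a punctuation
            # label it carried moves to the previous kept token.
            if label != "O" and noisy_labels:
                noisy_labels[-1] = label
            continue
        if r < p_delete + p_substitute:
            # Random substitution: replace with an arbitrary vocabulary word.
            token = rng.choice(vocab)
        elif r < p_delete + p_substitute + p_homonym and token in HOMONYMS:
            # Random homonym substitution: replace with a similar-sounding word.
            token = rng.choice(HOMONYMS[token])
        noisy_tokens.append(token)
        noisy_labels.append(label)
    return noisy_tokens, noisy_labels


def to_binary_labels(labels):
    """Auxiliary-task labels: 1 if any punctuation follows the token, else 0."""
    return [0 if label == "O" else 1 for label in labels]
```

A call such as `augment("so we went there".split(), ["O", "O", "O", "PERIOD"], vocab=["apple", "river"], rng=random.Random(0))` yields one noisy training pair, and `to_binary_labels` can be applied to either the clean or the noisy label sequence to obtain targets for the binary task.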
Keywords/Search Tags: Punctuation prediction, Automatic speech recognition, Data augmentation, Multi-feature, Multi-task