Research On Punctuation Prediction Method For Speech Transcription

Posted on:2022-01-07

Degree:Master

Type:Thesis

Country:China

Candidate:N P Ye


GTID:2518306542463014

Subject:Computer technology

Abstract/Summary:


The main task of punctuation prediction for speech transcription is to add punctuation to the output of an automatic speech recognition (ASR) system, mainly to improve the readability and intelligibility of the text. Punctuation prediction is a basic post-processing task in ASR and plays an indispensable role in many applications, such as intelligent speech conferencing, in-vehicle speech systems, and intelligent speech customer service. In recent years, punctuation prediction based on text information has been applied successfully to this task. With the development of Chinese speech recognition technology, punctuation prediction for Chinese speech recognition has also received much attention. This thesis therefore focuses on text-based punctuation prediction. Its main contributions are as follows:

(1) In real ASR systems, transcription errors are common. However, deep learning methods assume that training and test data are independent and identically distributed (IID), so punctuation prediction models whose parameters are learned on standard clean training data perform poorly on noisy test data containing many transcription errors. In addition, manually restoring punctuation in speech transcripts to build training data is laborious and time-consuming. To this end, on the public IWSLT dataset, we propose three data augmentation techniques that simulate erroneous test data from clean training data: random deletion, random substitution, and random homonym substitution. Extensive experiments on the IWSLT test set show that the proposed data augmentation method simulates noisy data to a certain extent and achieves higher accuracy than advanced text-based punctuation prediction algorithms. This indicates that the proposed method is suitable for common English punctuation prediction tasks.

(2) In punctuation prediction for Chinese speech transcription, Chinese sentences have no explicit word boundaries, so the same sentence can be segmented into words in different ways and understood with different meanings, which leads to ambiguity. Meanwhile, the training data contain more non-punctuation labels than punctuation labels, so the labels are unbalanced and the trained model develops a "preference" for some labels. To solve these problems, we present a punctuation prediction method based on multiple features and multi-task learning. Within a sequence labeling framework, character-level information, word type, and word boundary information are all considered to improve the model's ability to capture contextual information. Then, every label that carries punctuation is mapped to a single punctuation label, which turns the original task into a binary classification task that is trained together with the punctuation prediction task. The two tasks can learn from each other and alleviate the label imbalance problem. Experimental results on the People's Daily dataset show that the proposed multi-feature, multi-task model exchanges information between the two tasks and thus learns better features.
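
As a rough illustration of the three augmentation techniques named in contribution (1), the following Python sketch applies random deletion, random substitution, and random homonym substitution to a list of (token, label) pairs. It is not the thesis implementation: the `HOMONYMS` table, the label names (`O`, `PERIOD`), the probability `p`, and the rule for what happens to the label of a deleted token are all assumptions made for the example.

```python
import random

# Hypothetical homonym table; a real system would build one from a
# pronunciation lexicon rather than hard-code it.
HOMONYMS = {"their": ["there"], "there": ["their"], "see": ["sea"], "sea": ["see"]}


def random_deletion(pairs, p, rng):
    """Drop each token with probability p to mimic ASR deletion errors."""
    kept = []
    for tok, lab in pairs:
        if rng.random() < p:
            # Assumption: a deleted token hands its punctuation label to the
            # previous kept token when that token has no punctuation yet.
            if lab != "O" and kept and kept[-1][1] == "O":
                kept[-1] = (kept[-1][0], lab)
            continue
        kept.append((tok, lab))
    return kept


def random_substitution(pairs, p, vocab, rng):
    """Replace each token with a random vocabulary word with probability p."""
    return [(rng.choice(vocab), lab) if rng.random() < p else (tok, lab)
            for tok, lab in pairs]


def random_homonym_substitution(pairs, p, rng):
    """Replace a token with a similar-sounding word with probability p."""
    out = []
    for tok, lab in pairs:
        alternatives = HOMONYMS.get(tok)
        if alternatives and rng.random() < p:
            tok = rng.choice(alternatives)
        out.append((tok, lab))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [("i", "O"), ("see", "O"), ("their", "O"), ("house", "PERIOD")]
    print(random_deletion(clean, 0.2, rng))
    print(random_substitution(clean, 0.2, ["the", "a", "of"], rng))
    print(random_homonym_substitution(clean, 0.5, rng))
```

Substitution leaves the labels untouched, so the noisy copies can be mixed with the clean training data without any re-annotation.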
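
The label transformation described in contribution (2) can be sketched in a similar way. The Python example below turns punctuated Chinese text into character-level sequence labels and then derives the binary "punctuation / no punctuation" labels for the auxiliary task. The label names, the `PUNCT` mapping, and the convention that a punctuation mark labels the character it follows are assumptions for illustration; the abstract does not specify them.

```python
# Hypothetical mapping from punctuation marks to label names.
PUNCT = {"，": "COMMA", "。": "PERIOD", "？": "QUESTION"}


def to_sequence_labels(text):
    """Split punctuated text into characters and per-character labels.

    Assumption: each punctuation mark becomes the label of the character
    it follows; characters not followed by punctuation get the label "O".
    """
    chars, labels = [], []
    for ch in text:
        if ch in PUNCT:
            if labels:  # ignore punctuation at the very start of the text
                labels[-1] = PUNCT[ch]
        else:
            chars.append(ch)
            labels.append("O")
    return chars, labels


def to_binary_labels(labels):
    """Collapse every punctuation label into 1 and "O" into 0."""
    return [int(lab != "O") for lab in labels]


if __name__ == "__main__":
    chars, labels = to_sequence_labels("今天好，走吧。")
    print(list(zip(chars, labels, to_binary_labels(labels))))
```

In the binary view all punctuation classes are pooled into one positive class, so it is less skewed than any single punctuation label, which is one plausible reading of why the auxiliary task helps with label imbalance.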

Keywords/Search Tags:

Punctuation prediction, Automatic speech recognition, Data augmentation, Multi-feature, Multi-task


Related items

1	Research On The Author Recognition Of Fine-Art Paintings Based On Multi-Task Multi-Layer Feature Fusion Densenet
2	Research On Multi-dimensional Speech Recognition Technology Based On Multi-task Neural Network
3	Research On Speech Emotion Recognition Based On Multi-Attention Mechanism And Multi-Task Learning
4	Research On Application Of Data Augmentation Based On Different Speech Habits In Speech Recognition In Telephone Scene
5	Research On Speech Emotion Recognition Method Based On Multi-feature Fusion
6	Research On Chinese Speech Transcription Punctuation Prediction Based On Deep Learning
7	Research And Implementation Of Data Augmentation And Multi-Task Learning Algorithms In Finger Vein Recognition
8	Research On Data Augmentation Technology For Speech Recognition Application
9	Research On Human Behaviors Representation And Recognition Based On Multi-feature
10	Research On Speech Emotion Recognition Based On Multi-scale Feature Fusion And Decision Tree CNN