In recent years, as Automatic Speech Recognition (ASR) technology has improved, the processing of spoken transcripts produced by ASR systems has received widespread attention. Because a spoken transcript is not written text, it raises a series of serious problems. On the one hand, the transcript is a stream of characters output by the ASR system and lacks punctuation and sentence boundary information. Readers therefore struggle to find where a sentence begins and ends, which makes the semantics of each sentence much harder to understand. On the other hand, spoken transcripts contain many disfluencies. These disfluencies neither conform to the grammar of the sentence nor carry actual semantic information; they greatly hinder downstream Natural Language Processing tasks and make the transcribed text difficult to read smoothly. It is therefore necessary to predict punctuation and detect disfluencies in spoken Chinese transcribed text.

This thesis studies Punctuation Prediction and Disfluency Detection for spoken Chinese text. The specific research work includes:

1. For punctuation prediction on spoken Chinese transcribed text, this thesis proposes a method that combines deep pre-training with a traditional recurrent network. The method formulates punctuation prediction as a sequence labeling problem. Specifically, the BLSTM model is extended with a bidirectional Transformer encoder, whose strong ability to extract contextual features is used to predict punctuation. Experimental results show that the method is effective: it achieves higher prediction accuracy than the previous best method and also reduces manual costs.

2. Because spoken Chinese transcribed text contains many disfluencies, which reduce the performance of punctuation prediction, we propose a method for joint Disfluency Detection and Punctuation Prediction based on a deep pre-trained language network. The method trains and predicts both tasks on the same deep pre-trained sequence labeling model, using the features of punctuation and disfluency so that each task improves the other. We have also annotated a dataset with joint disfluency and punctuation labels. The results show that our method is effective.
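
As an illustration of the sequence labeling formulation described above, the following minimal Python sketch shows how punctuated text might be converted into one label per character, and how a disfluency marker could be merged into the same label for joint training. The tag names (COMMA, PERIOD, QUESTION, O), the D/F disfluency markers, the "|" separator, and the function names are assumptions made for illustration only; the thesis does not specify its label set here, and the sketch covers only label construction, not the BLSTM or Transformer model.

```python
# Hypothetical tag set: an assumption for illustration, not the thesis's scheme.
PUNCT_TAGS = {"，": "COMMA", "。": "PERIOD", "？": "QUESTION"}


def punctuation_labels(text):
    """Turn punctuated text into (character, tag) pairs.

    Each character is tagged with the punctuation mark that directly
    follows it, or "O" when no mark follows.
    """
    pairs = []
    for ch in text:
        if ch in PUNCT_TAGS:
            if pairs:  # attach the mark to the preceding character
                pairs[-1] = (pairs[-1][0], PUNCT_TAGS[ch])
        else:
            pairs.append((ch, "O"))
    return pairs


def joint_labels(text, disfluent_spans):
    """Merge a disfluency marker and a punctuation tag into one label.

    disfluent_spans holds (start, end) index pairs, end exclusive,
    counted over the character stream with punctuation removed.
    "D" marks a disfluent character and "F" a fluent one (assumed names).
    """
    pairs = punctuation_labels(text)
    disfluent = set()
    for start, end in disfluent_spans:
        disfluent.update(range(start, end))
    return [
        (ch, ("D" if i in disfluent else "F") + "|" + tag)
        for i, (ch, tag) in enumerate(pairs)
    ]


if __name__ == "__main__":
    # The repeated first character is treated as a disfluency in this example.
    for ch, label in joint_labels("我我想去北京，你呢？", [(0, 1)]):
        print(ch, label)
```

Under this scheme a single sequence labeling model predicts one combined label per character, so both tasks share the same encoder and training signal; whether the thesis combines the labels in this way is not stated in the abstract.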