
Study On Automatic Generation Of Chinese Course Video Subtitles

Posted on: 2017-03-20 | Degree: Master | Type: Thesis
Country: China | Candidate: Y L Hui | Full Text: PDF
GTID: 2348330488969857 | Subject: Agricultural Extension
Abstract/Summary:
Video subtitles are an auxiliary tool for understanding video content, and with the development of the Internet they are playing an increasingly important role. This paper studies the automatic generation of video subtitles and the technical principles behind each stage: extracting the audio stream from a course video, segmenting the audio stream, speech recognition, and generating text-format files. Chinese speech recognition technology is discussed with particular emphasis. The process of Chinese speech recognition comprises four parts: feature extraction, the acoustic model, the language model, and pattern matching. The technologies used in these four parts are compared and analyzed; MFCC, HMM, N-gram, and related algorithms are then chosen for studying Chinese speech recognition, and the MFCC feature extraction method, the HMM acoustic model and its related algorithms, and the N-gram language model and its smoothing methods are described in detail. In light of the rules of Chinese pronunciation, this paper takes initials and finals as the phonemes and uses the Sphinx speech recognition system, developed by Carnegie Mellon University, to build the acoustic model, the language model, and the dictionary. An HMM is used for acoustic modeling and an N-gram statistical model for language modeling; in the dictionary, each entry corresponds to a set of phones. The modeling used nearly 30 thousand audio files in total, with nearly 30 thousand corresponding entries.
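The N-gram language model and its smoothing step can be illustrated with a small sketch. The following assumes a bigram model with add-one (Laplace) smoothing over pre-segmented words; the abstract does not state which N-gram order or smoothing method the thesis used, and the function names and toy corpus are hypothetical.

```python
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences.

    Each sentence is padded with <s> and </s> boundary markers.
    """
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + list(tokens) + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams):
    """Return the add-one smoothed estimate of P(word | prev)."""
    vocab_size = len(unigrams)  # includes the boundary markers
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


# Toy corpus (hypothetical), already segmented into words.
corpus = [["我", "爱", "学习"], ["我", "爱", "中国"]]
unigrams, bigrams = train_bigram(corpus)
seen = bigram_prob("我", "爱", unigrams, bigrams)
unseen = bigram_prob("我", "中国", unigrams, bigrams)
```

Smoothing gives the unseen pair a small nonzero probability instead of zero, which keeps a single unobserved word sequence from ruling out an otherwise plausible recognition hypothesis.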
This paper also describes the acoustic and language modeling processes in detail. For acoustic modeling, the emphasis is on the data preparation before modeling and on the training process; for language modeling, the emphasis is on the model training process. Through establishing a corpus, studying the Sphinx speech recognition system, and designing and developing the subtitle generation system, an automatic subtitle generation system was finally built. Test and comparison experiments show that the Chinese recognition rate of the automatic subtitle generation system in this study is about 51%. Analysis and summary indicate that the corpus is the most important factor limiting the recognition rate in this study.
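The final stage of the pipeline, writing recognized text to a subtitle file, might look like the sketch below. It assumes the SubRip (.srt) format and that each audio segment carries start and end times in seconds; the abstract only mentions "text format files", so the format, the function names, and the sample segment are illustrative assumptions.

```python
def format_timestamp(seconds):
    """Convert a time in seconds to the SubRip form HH:MM:SS,mmm."""
    ms_total = int(round(seconds * 1000))
    hours, rem = divmod(ms_total, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Render (start, end, text) tuples as SubRip subtitle text.

    Cues are numbered from 1 and separated by a blank line.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text}\n"
        )
    return "\n".join(blocks)


# Hypothetical recognizer output for one audio segment.
srt_text = to_srt([(0.0, 2.5, "大家好")])
```

Here the segment boundaries produced by audio segmentation double as the subtitle display times, so no separate alignment step is needed in this simplified view.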
Keywords/Search Tags: subtitles, speech recognition, feature parameter extraction, acoustic model, language model