Design And Implementation Of Korean Speech Retrieval System

Posted on: 2022-04-14
Degree: Master
Type: Thesis
Country: China
Candidate: B W Xu
GTID: 2518306338956159
Subject: Computer technology
Abstract/Summary:
In the era of big data, speech retrieval has become a research hot spot, driven by a massive and urgent demand for retrieving speech data and by the need to improve the performance of speech retrieval systems. Inspired by the development of large-vocabulary speech recognition methods and technologies, speech retrieval has gradually shifted from traditional matching of low-level speech features to methods based on speech recognition, which has brought the field into a new era. At present, speech retrieval research in China focuses mainly on spoken Chinese, whereas retrieval of spoken Korean is far less developed. This dissertation therefore addresses Korean speech retrieval: it proposes a Korean speech retrieval method and designs and implements a prototype Korean speech retrieval system.

First, to build a Korean acoustic model, an improved network structure is proposed in which the bidirectional and unidirectional LSTM layers of the KoSpeech framework are replaced with GRU layers. Compared with LSTM, GRU has the advantages of a simpler structure and better performance. Fine-tuning experiments show that the improved model's CER degrades only slightly, while its parameter count and training efficiency are clearly better than those of KoSpeech. After data cleaning and preprocessing, a total of 400 hours of training data was retained, and the improved model reached a CER of 24.5% on the test set.

Second, a method for building an index library of Korean speech documents based on speech segmentation is proposed. Because the improved Korean acoustic model cannot transcribe a whole speech document at once, each document must be segmented into a sequence of clips. With our segmentation approach, every clip is about 5 seconds long, which preserves the semantic integrity of each clip as far as possible. Using clips as the entries of the index library also allows the retrieval system to locate results in time more accurately.

Third, Korean speech retrieval is achieved with a text retrieval method based on Levenshtein distance. In this method, the transcription of the query speech is scored by fuzzy matching against the transcriptions of the entries in the index library. The results are evaluated by top-k mAP and recall rate. When k=1, mAP=81.75% and recall rate=81.75%. The best performance is achieved when k=9, with mAP and recall rate reaching 86.74% and 95.25% respectively.

Finally, the prototype Korean speech retrieval system is designed and implemented, and all core modules have passed the system test. The system provides the core functions of Korean speech retrieval and of maintaining and managing the speech index library. It is developed with the Flask framework and runs in Browser/Server mode.

The retrieval method and prototype system presented in this dissertation adopt mainstream speech recognition approaches and software development technology, take the characteristics of Korean script and pronunciation into account, and propose an improved network structure for the Korean acoustic model together with a segmentation method for Korean speech documents. The experimental and system test results show that the proposed method has academic significance and that the prototype system has practical value.
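The parameter saving from replacing LSTM with GRU can be illustrated with a rough count. The sketch below assumes the textbook formulation with one bias vector per gate block (some libraries keep two). The layer sizes are hypothetical values chosen only for illustration, not the sizes used in KoSpeech or in this dissertation.

```python
def recurrent_layer_params(input_size: int, hidden_size: int, gate_blocks: int) -> int:
    """Count the weights of one unidirectional recurrent layer.

    Each gate block owns an input weight matrix, a recurrent weight
    matrix and one bias vector (textbook formulation, an assumption).
    """
    per_block = hidden_size * input_size + hidden_size * hidden_size + hidden_size
    return gate_blocks * per_block


def lstm_params(input_size: int, hidden_size: int) -> int:
    # LSTM: input gate, forget gate, output gate, candidate cell state.
    return recurrent_layer_params(input_size, hidden_size, gate_blocks=4)


def gru_params(input_size: int, hidden_size: int) -> int:
    # GRU: reset gate, update gate, candidate hidden state.
    return recurrent_layer_params(input_size, hidden_size, gate_blocks=3)


if __name__ == "__main__":
    # Hypothetical sizes: 80 input features, 512 hidden units.
    input_size, hidden_size = 80, 512
    lstm = lstm_params(input_size, hidden_size)
    gru = gru_params(input_size, hidden_size)
    print(f"LSTM layer: {lstm} parameters")
    print(f"GRU layer:  {gru} parameters ({gru / lstm:.0%} of LSTM)")
```

Under this formulation a GRU layer always has three quarters of the parameters of an LSTM layer of the same size, whatever the input and hidden dimensions.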
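The Levenshtein-based retrieval step can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: the similarity normalization, the function names, the clip identifiers and the sample transcriptions are all assumptions, and the abstract does not say which fuzzy-matching variant was actually used.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings, computed row by row."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed scoring: 1 minus the distance normalized by the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def retrieve(query_text: str, index: dict, k: int) -> list:
    """Return the k best (clip_id, score) pairs for a transcribed query."""
    scored = [(similarity(query_text, text), clip_id)
              for clip_id, text in index.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(clip_id, score) for score, clip_id in scored[:k]]


if __name__ == "__main__":
    # Hypothetical index: clip id (document and start second) -> transcription.
    index = {
        "doc1#0.0": "안녕하세요 반갑습니다",
        "doc1#5.0": "오늘 날씨가 좋습니다",
        "doc2#0.0": "내일 날씨가 흐립니다",
    }
    # Query transcription containing one recognition error.
    for clip_id, score in retrieve("오늘 날씨가 좁습니다", index, k=2):
        print(clip_id, round(score, 3))
```

Because each index entry is a short clip with a known start time, the identifier of a matching entry directly gives the position of the hit inside the original speech document.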
Keywords/Search Tags: Korean speech retrieval, speech retrieval, speech recognition, acoustic model, text retrieval