
Research And Application Of Phoneme Recognition Technology

Posted on: 2021-08-04
Degree: Master
Type: Thesis
Country: China
Candidate: X Feng
Full Text: PDF
GTID: 2518306338485904
Subject: Information and Communication Engineering
Abstract/Summary:
Phonemes are the smallest units of speech, divided according to the natural attributes of speech, and phoneme recognition is a pattern recognition technology that recovers the phoneme sequence from speech audio. Over decades of development, from the traditional hidden Markov model to current encoder-decoder models based on the attention mechanism, phoneme recognition has matured and its recognition performance has reached a high level. It should also be made clear that phoneme recognition can serve as a supporting technology, and that applying it appropriately can effectively improve the performance of other systems. This thesis pursues two lines of research: the improvement and optimization of end-to-end phoneme recognition, and the application of phoneme recognition to query-by-example keyword detection. The main work can be summarized in the following three aspects:

1. This thesis studies and optimizes end-to-end phoneme recognition based on the encoder-decoder model. An encoder-decoder model incorporating the attention mechanism is systematically implemented, and Word2vec is used to improve the embedding mechanism of the original system. In addition, to compensate for the shortage of training data, a data supplementation method based on the idea of inverse mapping is designed. A corrective training step is also introduced during system development, which effectively improves the phoneme recognition system.

2. Phoneme recognition is used to generate the features of a keyword detection system, and a query-by-example keyword detection system is developed on the basis of image recognition. The phoneme recognition system above extracts phoneme vector features, and a correlation calculation converts these features into feature images. The feature images are then processed with deep-learning-based image recognition to complete the query-by-example keyword search. System performance experiments demonstrate the feasibility of this system. To evaluate its performance, the thesis also designs a comparison scheme that generates feature images from the phoneme posterior probability spectrum. This comparison system uses phoneme recognition based on a multi-layer perceptron to generate the phoneme posterior probability spectrum required for keyword detection, and then applies image recognition to perform keyword detection. In system performance experiments, the comparison system achieves the expected results. Comparing the two systems shows that the keyword detection system based on phoneme vectors outperforms the one based on the posterior probability spectrum.

3. On the basis of the phoneme recognition system completed in this thesis, a query-by-example keyword detection system is further designed and implemented using template matching. Borrowing ideas from the D-vector speaker recognition algorithm, the system relies on the phoneme recognition system above to generate summary features, and then uses a sliding-window template matching method to determine whether a keyword is present in the test data and to locate where it occurs. In system performance experiments, the system achieves the expected performance. The thesis also compares the template matching system with the two image-recognition-based keyword detection systems above and analyzes the differences between them.
Keywords/Search Tags: Word2vec, Inverse mapping, Correction training, Phoneme vector, Keyword detection
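
The sliding-window template matching mentioned in the third item of the abstract can be illustrated with a minimal sketch. The abstract does not state the actual features, similarity measure, or decision threshold, so everything named here is an assumption made for illustration: frame-level phoneme vectors are averaged into one summary vector (in the spirit of D-vector averaging), windows are compared by cosine similarity, and the names `mean_vector`, `cosine`, `sliding_window_match`, and the `threshold` value are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def mean_vector(frames: Sequence[Vector]) -> List[float]:
    """Average frame-level vectors into one summary vector (assumed D-vector-style pooling)."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def sliding_window_match(
    query: Sequence[Vector],
    test: Sequence[Vector],
    threshold: float = 0.9,  # hypothetical decision threshold, not from the thesis
) -> Tuple[bool, int, float]:
    """Slide a window of len(query) frames over `test`.

    Returns (found, best_start, best_score), where best_start is the frame
    index of the best-matching window, or -1 if `test` is shorter than `query`.
    """
    template = mean_vector(query)
    width = len(query)
    best_start, best_score = -1, -1.0
    for start in range(len(test) - width + 1):
        score = cosine(template, mean_vector(test[start:start + width]))
        if score > best_score:
            best_start, best_score = start, score
    return best_score >= threshold, best_start, best_score


# Toy 2-dimensional "phoneme vectors": the query pattern occurs at frame 2.
query = [[1.0, 0.0], [0.0, 1.0]]
test = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
found, start, score = sliding_window_match(query, test)
print(found, start)
```

Averaging discards frame order inside a window, so this sketch can only detect and roughly locate a keyword; a real system would need features and a threshold tuned on data.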