
Low-delay Target Person Reconnaissance Based On Voiceprint Recognition In Complex Background

Posted on: 2024-08-28  Degree: Master  Type: Thesis
Country: China  Candidate: H Ai  Full Text: PDF
GTID: 2568307136494834  Subject: Software engineering
Abstract/Summary:
Speaker recognition has become an increasingly important option for surveillance equipment used in target person reconnaissance. Compared with traditional image recognition, speaker recognition has several advantages: it does not require the cooperation of the target person, it works at long range, and it is not affected by the visual environment. These properties make it a better solution for target person reconnaissance. This thesis addresses low-latency target person reconnaissance based on speaker recognition in complex backgrounds. First, a target person reconnaissance algorithm based on speaker recognition is designed to improve speaker recognition accuracy. Next, a speech noise processing method for the target person in complex backgrounds is designed to reduce the interference of noise with target person reconnaissance. Finally, a low-latency target person reconnaissance system is designed to integrate these algorithms into a practical application. The innovations of this thesis lie mainly in the following three aspects:

(1) For the target person reconnaissance task, a speaker-recognition-based target person reconnaissance algorithm, SR-TPR, is designed. The algorithm uses neural architecture search to automatically find the best network structure and hyperparameter settings. To address information overload while improving the efficiency and accuracy of task processing, an attention mechanism is introduced and the loss function is improved. Speaker recognition experiments on the VoxCeleb1 dataset show that SR-TPR effectively improves speaker recognition accuracy.

(2) For target person speech noise processing, a speech noise processing method for the target person in complex backgrounds, SNP-TPR, is designed. Because the Transformer is weak at extracting fine-grained local feature patterns, the Conformer encoder is improved to suit acoustic sequences, and a two-stage I-Conformer is designed based on this block to extract local and global contextual information. In addition, the loss function is improved. Speech noise reduction experiments on the Voice Bank corpus and the DEMAND dataset show that SNP-TPR effectively improves speech noise processing.

(3) A low-latency target person reconnaissance system is designed and implemented. Following requirements analysis and design, the system uses the SR-TPR algorithm and the SNP-TPR method as the methods behind the corresponding functional modules. The system consists mainly of five modules: login and registration, target person registration, target person reconnaissance, system user management, and target person management. System testing shows that the system maintains high accuracy while keeping latency low.
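
The interplay of the target person registration and reconnaissance modules can be illustrated with a minimal sketch. The abstract does not say how the system scores a voice sample against registered target persons, so everything here is an assumption for illustration only: the cosine-similarity scoring, the 0.7 threshold, the toy three-dimensional embeddings, and the hypothetical names `cosine_similarity` and `identify_target`. In a real system the embeddings would come from a speaker recognition model such as SR-TPR, applied after noise processing.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction, so treat it as no match
    return dot / (norm_a * norm_b)


def identify_target(probe, enrolled, threshold=0.7):
    """Return (name, score) of the best-matching registered target person.

    `enrolled` maps a target person's name to a reference embedding
    (hypothetical output of the registration step). If the best score
    is below `threshold` (an assumed value), the name is None.
    """
    best_name, best_score = None, -1.0
    for name, reference in enrolled.items():
        score = cosine_similarity(probe, reference)
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score


# Toy embeddings standing in for model outputs (assumed, not from the thesis).
enrolled = {"person_a": [1.0, 0.0, 0.0], "person_b": [0.0, 1.0, 0.0]}
name, score = identify_target([0.9, 0.1, 0.0], enrolled)
print(name, round(score, 3))
```

Scoring against stored reference embeddings keeps the per-query cost to one model forward pass plus a cheap comparison per registered person, which is one plausible way to keep latency low; the thesis may use a different approach.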
Keywords/Search Tags:Target Person Reconnaissance, Speaker Recognition, Speech Noise Processing, Neural Architecture Search, Conformer