Research On Fake Speech Recognition Method Based On Acoustic Characteristics

Posted on: 2023-09-09  Degree: Master  Type: Thesis
Country: China  Candidate: J L Zhou  Full Text: PDF
GTID: 2568306827953059  Subject: Electronic information
Abstract/Summary:
Fake speech recognition is an important area of current intelligent speech technology research, an applied direction at the intersection of information security, phonetics, and artificial intelligence. With telecommunication fraud cases occurring frequently among new types of social crime, there is an urgent need for a method that can automatically and effectively determine whether speech is authentic. Based on the principles of phonetics, this thesis combines acoustic analysis, speech signal processing, deep learning, image recognition, and other techniques to analyze in depth the differences in acoustic characteristics between fake speech and real speech. It proposes the Root Mean Square Energy Angle (RMSA) feature and fuses it with Fundamental Frequency Variation (FFV) features and Speech Narrowband Spectrogram (SNS) features to quantify acoustic characteristics, then combines the proposed Max Dense Convolutional Neural Network (MDCN) with the Spec-Attention Block (Spec-Attention) to achieve accurate recognition of fake speech. The specific innovations and research work are as follows.

(1) We studied the differences in acoustic characteristics between fake and real speech. By comparing fake speech and real speech in terms of fundamental frequency, sound intensity, and spectrogram, we analyzed the differences and drew general conclusions that explain the acoustic principle by which fake speech can be recognized, providing a theoretical basis for subsequent automatic recognition.

(2) We designed an acoustic feature, RMSA, which characterizes the degree of dispersion of sound intensity and quantifies the difference in the rate of change of sound intensity between fake speech and real speech. RMSA is fused with FFV features and SNS features into high-dimensional features that serve as input to the recognition model, providing a new feature design approach for fake speech recognition.

(3) We designed a Max Dense Convolutional Neural Network model, MDCN. The max feature mapping function is used in constructing the dense blocks of a dense convolutional neural network. This preserves the model's dense connections and reduces information forgetting, while also reinforcing the valid information in the content learned by the convolutional neurons, yielding a model well suited to improving classification and recognition.

(4) We designed an attention module called the Spec-Attention Block. The narrowband spectrogram is sliced according to the distribution of speech harmonic patterns and the spectrum of individual phonemes, and the resulting segments are selectively attended to along the spatial and channel dimensions. This focuses the model on the harmonic positions and spectral breadth that distinguish fake speech from real speech, enhancing the model's perception of speech acoustic properties and further improving its recognition ability.
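The abstract does not give the layer layout of MDCN, so the following is only a minimal sketch of the max feature mapping operation mentioned in item (3). It assumes the commonly used max-feature-map definition, in which the channels of a feature map are split into two halves and the element-wise maximum is kept. The function name `max_feature_map` and the (batch, channels, height, width) layout are illustrative assumptions, not taken from the thesis.

```python
import numpy as np


def max_feature_map(x: np.ndarray) -> np.ndarray:
    """Element-wise max over two channel halves (assumed MFM definition).

    x is assumed to have shape (batch, channels, height, width) with an
    even number of channels; the output has half as many channels.
    """
    if x.ndim != 4:
        raise ValueError("expected a 4-D array (batch, channels, height, width)")
    if x.shape[1] % 2 != 0:
        raise ValueError("number of channels must be even")
    # Split the channel axis into two equal halves.
    first_half, second_half = np.split(x, 2, axis=1)
    # Keep the larger activation at every position, halving the channel count.
    return np.maximum(first_half, second_half)


if __name__ == "__main__":
    # Hypothetical feature map: 1 sample, 4 channels, 1x1 spatial size.
    features = np.array([[[[1.0]], [[-2.0]], [[0.5]], [[3.0]]]])
    print(max_feature_map(features).shape)
```

Unlike a ReLU, this operation selects between two competing channels rather than thresholding at zero, so a negative activation can survive if it is the larger of its pair. How it is placed inside the dense blocks of MDCN is not specified in this abstract.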
Keywords/Search Tags: fake speech recognition, acoustic properties, neural networks, attention mechanisms