
Research On Speech Deception Detection Algorithm Based On Deep Neural Network

Posted on: 2021-03-24
Degree: Master
Type: Thesis
Country: China
Candidate: P Z Lei
Full Text: PDF
GTID: 2428330605952052
Subject: Signal and Information Processing
Abstract/Summary:
Deception detection is of great significance in assisting public security investigations and preventing telecommunication fraud. In the past, deception detection often relied on professional and expensive instruments that measured changes in the subject's physiological signals, such as pulse, blood pressure, and brain waves, as the basis for detection. By contrast, using speech to detect deception is more convenient and faster. Because it is non-contact and non-compulsory, it is less likely to provoke resistance or fear in the subject, so it has the advantages of concealment and objectivity. However, research on speech-based deception detection is still at an early stage, and many problems remain to be solved. In recent years, deep neural networks have shown excellent performance in image recognition, speech processing, and other fields, which offers a new approach to speech-based deception detection. This thesis therefore applies deep neural networks to speech-based deception detection and studies several open problems in the field. The main work covers the following three aspects:

(1) A fully Chinese deception corpus is constructed. Existing deception corpora are few, and Chinese databases in particular are lacking. To supplement the available corpora, especially in Chinese, this thesis draws on existing experience in collecting deceptive speech: the "werewolf game" and the "killer game" were chosen as the source scenarios, and high-quality videos were selected from the Internet. Professional audio processing software was then used to extract and cut speech segments of several seconds from the videos to build a Chinese deception corpus. Features were then extracted from these speech segments and recognized by a classifier as a preliminary exploration of speech-based deception detection.

(2) An improved semi-supervised denoising autoencoder network (SS-DAE) is proposed and applied to speech-based deception detection. Current research on speech-based deception detection relies on a sufficient amount of labeled data; however, owing to the particular nature of deceptive speech, labeling data is very costly. This thesis therefore applies semi-supervised learning to speech-based deception detection for the first time, aiming to detect deception with only a small amount of labeled data. Building on the existing semi-supervised autoencoder network (SS-AE), an activation function with better performance is selected, dropout is used to prevent overfitting, and the network structure is simplified. During training, labeled and unlabeled data are used together, and supervised and unsupervised learning are carried out simultaneously, which avoids the conflict that arises when the two kinds of learning are performed in sequence. On the CSC (Columbia-SRI-Colorado) corpus, the accuracy with 1000 labeled samples is 62.78%, and on our own corpus, the accuracy with 200 labeled samples is 63.89%. The results show that the proposed model can achieve strong performance with only a small amount of labeled data.

(3) Fused features are used for speech-based deception detection. Because relying on a single type of feature discards some of the information in speech, which hinders deception detection, this thesis uses different types of features together. First, a parallel dual-channel structure consisting of a denoising autoencoder network (DAE) and a long short-term memory network (LSTM) is designed. Next, hand-crafted features are extracted from the speech and fed into the DAE to obtain more robust features; at the same time, Mel spectra are extracted after framing and windowing the speech and are fed into the LSTM frame by frame to learn frame-level deep features. Finally, the two types of features are fused through a fully connected layer and batch normalization and passed to the classifier for recognition. The accuracy is 65.18% on the CSC corpus and 68.04% on our own corpus. Experimental results show that the proposed feature-fusion algorithm achieves better recognition results.
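
The simultaneous supervised and unsupervised training described in point (2) can be illustrated with a small sketch. The abstract does not give the actual loss, so everything here is an assumption made for illustration: mean squared error for reconstruction, cross-entropy for classification, a hypothetical weight `alpha`, masking noise as the corruption, and the helper names themselves. The point is only that every sample contributes to the reconstruction term, while labeled samples additionally contribute to the classification term, so both objectives are optimized in one step rather than one after the other.

```python
import math
import random


def add_masking_noise(x, drop_prob, rng):
    """Corrupt an input vector by zeroing each feature with probability drop_prob.

    Assumed corruption scheme for the denoising autoencoder; the thesis
    abstract does not say which noise is used.
    """
    return [0.0 if rng.random() < drop_prob else v for v in x]


def reconstruction_loss(x, x_hat):
    """Mean squared error between the clean input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class (clamped to avoid log(0))."""
    return -math.log(max(probs[label], 1e-12))


def joint_loss(batch, alpha=1.0):
    """Combine both objectives over one mixed batch.

    Each sample is a dict with:
      "x"     : clean feature vector
      "x_hat" : decoder output (computed from the corrupted input)
      "probs" : classifier output probabilities
      "label" : class index, or None for an unlabeled sample
    alpha is a hypothetical weight balancing the two terms.
    """
    # Unsupervised term: all samples, labeled or not.
    recon = sum(reconstruction_loss(s["x"], s["x_hat"]) for s in batch) / len(batch)
    # Supervised term: labeled samples only.
    labeled = [s for s in batch if s["label"] is not None]
    sup = 0.0
    if labeled:
        sup = sum(cross_entropy(s["probs"], s["label"]) for s in labeled) / len(labeled)
    return recon + alpha * sup


if __name__ == "__main__":
    rng = random.Random(0)
    print(add_masking_noise([0.3, 0.7, 1.2, 0.5], drop_prob=0.5, rng=rng))
    batch = [
        {"x": [1.0, 2.0], "x_hat": [1.0, 2.0], "probs": [0.5, 0.5], "label": 0},
        {"x": [0.0, 0.0], "x_hat": [1.0, 1.0], "probs": [0.5, 0.5], "label": None},
    ]
    print(joint_loss(batch))
```

In a real network the `x_hat` and `probs` values would come from a shared encoder, so the gradient of this single loss updates the encoder with both signals at once.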
Keywords/Search Tags: Speech-based deception detection, Speech feature, Deep neural network, Semi-supervised learning, Feature fusion