Research On Deception Detection Based On Speech

Posted on: 2020-06-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Xie
GTID: 1368330611955314
Subject: Information and Communication Engineering
Abstract/Summary:
When a person lies, the resulting stress causes psychological changes that alter physiological parameters such as electrodermal activity, brain electrical activity, blood pressure, and the vocal cord system. These parameters are generally governed only by the autonomic nervous system and are difficult to control consciously. Evaluating the truthfulness of a speaker's statements from such physiological parameters is called speech confidence evaluation, popularly known as deception detection. Early deception detection relied mainly on multiple physiological parameters. However, that approach requires the subject to wear various kinds of professional equipment to measure changes in those parameters and demands a high degree of cooperation from the test subjects, so it is difficult to deploy in practice. Some scholars have therefore recently studied speech confidence evaluation based on non-contact indicators such as speech. Many problems in such methods still need further study: (1) the influence of acoustic feature parameters on speech confidence; (2) effective models and algorithms for speech confidence evaluation with acoustic features. To address these problems, this dissertation focuses on database construction, feature enhancement, and model design, and makes the following contributions:

1. Because deception corpora related to psychological pressure are lacking, this work designed experimental scenarios and recorded databases under different levels of psychological pressure. Under lower psychological pressure, the subjects know that the scene they take part in is experimental, and the lies told during the experiment have no great impact on the liars themselves. Under higher psychological pressure, the subjects are unaware that they are taking part in an experiment, and the lies they tell directly affect their actual interests. Based on these two databases, this dissertation analyzes the ability of various acoustic features to distinguish between lies and honest statements under different psychological pressures, and proposes replacing static features of fixed dimension with dynamic features of variable dimension, preserving the temporal information needed to mine the dynamics of lies.

2. To reduce the computational complexity of the model while preserving the recognition ability of the original speech confidence model, this dissertation uses a long short-term memory (LSTM) network to process the dynamic temporal speech features and proposes two attention gates to replace the traditional forget gate. In contrast to the forget gate, an attention gate focuses on the useful part of the historical information rather than forgetting the useless part. The self-attention gate applies weighting only to the historical cell state, attending to how relevant the historical information is at the current time step. The additive attention gate applies weighting to both the historical cell state and the candidate cell state to update the cell state. In the new algorithm, the forget gate and the input gate of the original LSTM network are removed and the dimension of the weight matrix is reduced, which lowers the computational complexity. Experiments show that, compared with the traditional LSTM, the computational complexity is reduced without sacrificing the accuracy of lie recognition.

3. The time dimension and the feature dimension of the LSTM output contribute differently to deception detection. This dissertation therefore proposes attention weighting methods in both dimensions, to distinguish how much lie-related information different time segments carry and how well different features support lie recognition. In the time dimension, because the LSTM can memorize information, its output at the last time step contains a wealth of task-related information. To ensure that this output can be assigned a larger weight, it is used as the reference information for weighting the different time segments. In the feature dimension, the attention scores in the new deep feature space are calculated first and then summed along the time dimension to obtain time-level statistics of the features. Experiments show that both methods effectively enhance the key information in the features and improve the performance of deception detection.

4. To reduce the influence of individual differences in the vocal cord system on deception detection, this dissertation proposes a speech confidence recognition model based on pseudo-speaker information. The method first performs unsupervised clustering on the input lie features to obtain pseudo-speaker labels, which implicitly pre-classifies the individual vocal cord characteristics. To make effective use of this information, the labels are used as switch inputs that determine where the output of the upper-layer network flows, and each type of speaker is trained separately in the upper-layer network, while the parameters of the underlying network are fixed through transfer learning to reduce training time. The experimental results show that the model improves the accuracy of speech confidence evaluation by exploiting the differences between pseudo-speakers.

These efforts advance the study of non-contact, speech-based speech confidence evaluation and lay the foundation for speech confidence detection tools that are practical and less dependent on devices and individuals.
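
As an illustration of the time-dimension weighting in item 3, the following is a minimal sketch and not the dissertation's implementation. It assumes that each time step is scored by its dot product with the last LSTM output and that the scores are normalized with a softmax; the abstract does not specify the scoring function, and the function names and toy values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def last_step_attention(outputs):
    """Weight each time step by its similarity to the last time step.

    outputs: list of T vectors (lists of floats), standing in for the
    LSTM outputs h_1..h_T.
    Returns (weights, context), where context is the weighted sum over time.
    """
    # Assumption: the last output is the reference (query) vector.
    reference = outputs[-1]
    # Assumption: dot-product scoring; the abstract does not name a scoring rule.
    scores = [sum(a * b for a, b in zip(h, reference)) for h in outputs]
    weights = softmax(scores)
    dim = len(reference)
    context = [
        sum(w * h[d] for w, h in zip(weights, outputs)) for d in range(dim)
    ]
    return weights, context


# Toy sequence of three 2-dimensional outputs (made-up values).
outputs = [[0.1, 0.0], [0.0, 0.2], [1.0, 1.0]]
weights, context = last_step_attention(outputs)
print(weights)
print(context)
```

In this toy sequence the last time step receives the largest weight because its dot product with itself is the largest score. Dot-product scoring does not guarantee this for arbitrary sequences.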
Keywords/Search Tags:Deception detection, Dynamic speech features, Long Short-term Memory, Attention Mechanism, Pseudo-speakers