
The Research Of Deceptive Speech Analysis And Detection

Posted on: 2017-12-30  Degree: Doctor  Type: Dissertation
Country: China  Candidate: X Y Pan
GTID: 1318330512957542  Subject: Signal and information processing
Abstract/Summary:
This dissertation applies digital speech processing to the analysis and detection of deceptive speech. Using digital signal processing to analyze the information carried by speech signals (such as semantic content, speaker identity, and emotion) is an important achievement of computer information processing technology. Building on these achievements, research on psychophysiological computing based on speech signal processing has been carried out in recent years. Such research integrates physiological science, psychological science, and information and computer science.
EEG signal processing (P300 signal analysis) and functional magnetic resonance imaging (fMRI) of the brain are the most widely used methods in lie detection. Owing to the support of neuroimaging mechanisms, they achieve good results to a certain extent. However, these methods may fail in applications that lack memory information, and the complex measurement process and the need for the participant's cooperation limit their wider adoption. At present, EEG- and neuroimaging-based polygraph results are used only as references in forensic and judicial fields.
In recent years, facial expression analysis and natural language processing have been applied successfully to lie detection, thanks to the development of video analysis theory and probabilistic graphical model theory. With advances in language acoustics, auditory phonetics, and the physiology of language, lie detection based on speech signal processing has again become a hot topic among researchers worldwide. Owing to the development of digital signal processing theory, psychological stress evaluator (PSE), voice stress analyzer (VSA), and layered voice analysis (LVA) technologies have taken on new meanings and connotations. However, the accuracy of most deception detection systems is only between 60% and 70%.
The limitations of voice-based polygraph technology become apparent as the research goes deeper. First, lie detection needs dedicated speech features that make the deceptive information more prominent. Second, there is a lack of time series models that take the temporal dynamics of deceptive speech fully into account. To overcome these shortcomings, this dissertation studies the computability of deceptive speech detection, matching feature representations, and time series modeling. The main contributions are as follows.
1. Differences in speech feature distributions are used to demonstrate the existence of deceptive information in speech signals. The distributions of traditional speech features are analyzed to find the differences between normal and deceptive speech, and a dissimilarity function is proposed to quantify them. The existence of deceptive information is thereby demonstrated, which provides a feasibility basis for deception detection based on speech signals.
2. The instantaneous frequency of auditory bands is proposed as a feature for deceptive speech detection. The manner of articulation may change under stress, and this is the main physiological basis of deception detectors. Related research shows that acoustic signal processing methods based on auditory mechanisms are suited to such problems. An auditory Gammatone filter bank is introduced to decompose the speech signal, and the instantaneous frequency of each frequency band is calculated by a lattice iterative algorithm. The differences between the instantaneous frequencies may accentuate the changes in the human vocal folds between the normal and lying states, so the deceptive information in the speech signal may also be strengthened and the deceptive state more easily identified by mathematical models.
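The auditory-band decomposition in item 2 can be illustrated with a minimal Python sketch. It is not the dissertation's method. The abstract names a lattice iterative algorithm for the instantaneous frequency, whereas this sketch swaps in a simpler estimator: each band is filtered with a complex-valued Gammatone impulse response, and the frequency is read from the phase step between consecutive output samples. The function names, the filter order of 4, the ERB bandwidth formula, and the 25 ms impulse-response length are all assumptions made for illustration.

```python
import cmath
import math


def erb(fc_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg-Moore approximation, assumed here)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def complex_gammatone_ir(fc_hz, fs_hz, order=4, duration_s=0.025):
    """Complex Gammatone impulse response: t^(order-1) * exp(-2*pi*b*t) * exp(j*2*pi*fc*t)."""
    b = 1.019 * erb(fc_hz)  # bandwidth parameter (assumed convention)
    taps = []
    for k in range(int(duration_s * fs_hz)):
        t = k / fs_hz
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        taps.append(envelope * cmath.exp(2j * math.pi * fc_hz * t))
    return taps


def filter_band(signal, taps):
    """Direct-form FIR convolution; the output is complex (approximately analytic)."""
    out = []
    for n in range(len(signal)):
        acc = 0j
        for k in range(min(n + 1, len(taps))):
            acc += taps[k] * signal[n - k]
        out.append(acc)
    return out


def instantaneous_frequency(band, fs_hz):
    """Instantaneous frequency in Hz from the phase step between consecutive samples."""
    freqs = []
    for prev, cur in zip(band, band[1:]):
        phase_step = cmath.phase(cur * prev.conjugate())
        freqs.append(phase_step * fs_hz / (2.0 * math.pi))
    return freqs


def band_instantaneous_frequencies(signal, fs_hz, centre_freqs_hz):
    """Map each (hypothetical) band centre frequency to its instantaneous-frequency track."""
    tracks = {}
    for fc in centre_freqs_hz:
        band = filter_band(signal, complex_gammatone_ir(fc, fs_hz))
        tracks[fc] = instantaneous_frequency(band, fs_hz)
    return tracks
```

For a real recording, `centre_freqs_hz` would typically be spaced on an auditory scale, and differences between the per-band tracks could then serve as candidate features. The pure-Python convolution is slow and is meant only to keep the sketch dependency-free.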
The results show that introducing the auditory-band instantaneous frequency features increases the individual accuracy of deceptive speech detection by about 2% to 10%.
3. Fractional Mel cepstral coefficients (FrCC) are proposed to make the speech features more robust. Because the deceptive information contained in speech signals is weak, the parameters should be both sensitive to the deceptive information and robust during extraction. Fractional analysis is used to optimize the MFCC parameters, so that FrCC not only retains the robustness of MFCC in representing speech information but also contains the phase information of the speech signal. More of the speaker's individual characteristics may be retained, and the speaker's lying state is more likely to be discovered. The experimental results show that introducing the FrCC parameters plays a significant role in improving the accuracy of deception detection.
4. A multi-scale conditional random fields (MCRF) model is proposed. The MCRF model covers the chain from acoustic feature extraction, to rhythmic-layer information abstraction, to estimation of the physiological and psychological state. A time series analysis model is established by integrating these steps. The model expands the global context information of the speech signal and plays an important role in compensating for the weak deceptive information. The performance of the recognition system is significantly improved, and the final accuracy reaches more than 75%.
This work can be regarded as preliminary research on parameters and models for deception detection based on speech signals. It also provides a basis for psychophysiological computing research in the field of digital signal processing.
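To make the role of sequence context in item 4 concrete, the following Python sketch decodes the best state sequence of a plain linear-chain conditional random field with the Viterbi algorithm. It is a single-scale simplification and not the MCRF model itself, whose structure the abstract does not specify. The two states (0 for normal, 1 for deceptive) and all scores in the usage note are invented for illustration.

```python
def viterbi_decode(emission_scores, transition_scores):
    """Return (best_path, best_score) for a linear-chain CRF.

    emission_scores[t][s]   : log-potential of state s at frame t
    transition_scores[i][j] : log-potential of moving from state i to state j
    """
    n_states = len(transition_scores)
    best = list(emission_scores[0])  # best score of any path ending in each state
    backpointers = []
    for frame in emission_scores[1:]:
        new_best, pointers = [], []
        for j in range(n_states):
            candidates = [best[i] + transition_scores[i][j] for i in range(n_states)]
            i_star = max(range(n_states), key=candidates.__getitem__)
            new_best.append(candidates[i_star] + frame[j])
            pointers.append(i_star)
        best = new_best
        backpointers.append(pointers)
    # Trace the best final state back to the first frame.
    state = max(range(n_states), key=best.__getitem__)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path, max(best)
```

With hypothetical frame scores `[[1.0, 0.0], [0.4, 0.6], [1.0, 0.0]]` and transition scores `[[0.5, -0.5], [-0.5, 0.5]]` that favour staying in the same state, the middle frame on its own weakly prefers state 1, but the decoded path keeps state 0 throughout. This is the sense in which context from neighbouring frames can compensate for weak frame-level evidence.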
Keywords/Search Tags: Deceptive Speech Detection, Instantaneous Frequency, Fractional Mel Cepstral Coefficient (FrCC), Multi-scale Conditional Random Fields (MCRF)