| As China's legal system has developed and recording devices have become more widely used, forensic speaker recognition has appeared more and more often in courtroom arguments. Chinese forensic speaker recognition technology has made encouraging progress since the end of the 1980s, when China began research on and application of judicial recording inspection technology. However, some fundamental theoretical questions in its application still require in-depth study. One of the most urgent issues is speaker recognition between Mandarin and Chinese dialects. In practice, suspects and defendants, motivated to seek advantage and avoid harm, have become increasingly able to thwart investigations, so the language used when committing a crime or violating the legal rights and interests of others (the recorded material from the unknown speaker) frequently differs from the language used when a sample voice (the recorded material from the suspect) is collected for comparison. The recording of the unknown speaker is in a dialect or Mandarin while the recording of the suspect is in Mandarin or a local dialect, so forensic speaker recognition can only be carried out between dialect and Mandarin speech. China is a large country with a diverse range of dialects, and findings from one dialect region are difficult to apply directly to other dialect regions, so a broadly applicable method of forensic speaker recognition between Chinese dialects and Mandarin is urgently needed. This study analyzes and discusses the feasibility and effectiveness of forensic speaker recognition based on vowel acoustic space, of forensic speaker recognition between Chinese dialects and Mandarin based on spatial distribution, and of forensic speaker recognition between Chinese dialects and Mandarin based on spatial metrics. Using vowels as the comparison object, vowel formant data as the acoustic parameter, and vowel acoustic space as the acoustic model, the 
study’s comprehensive "Auditory Phonetic cum Acoustic Phonetic Analysis" discriminant approach is used to empirically analyze and discuss forensic speaker recognition technology between Chinese dialects and Mandarin. The main findings are as follows. (1) To investigate the viability of applying vowel acoustic space to forensic speaker recognition, the vowel acoustic space characteristics of 11 linguists and the vowel distribution positions of 50 Dutch males were analyzed in the Praat voice analysis software. The spatial distribution relationships were found to be stable and distinguishable, and therefore suitable for qualitative studies of forensic speaker recognition. A review of the literature shows that acoustic spatial metrics also have potential for quantitative investigations of forensic speaker recognition, since they can represent a variety of speech variations across speakers, including gender, age, speech rate, intelligibility, and speaking style. (2) To determine whether vowels are a feasible comparison object for forensic speaker identification between Chinese dialects and Mandarin, the vowel consistency between 104 dialect points of the Shanxi dialect and Mandarin was examined. According to the survey, about 90% of the dialect points have up to 8 vowels in common with Mandarin. The existence of a sufficient number of identical vowels therefore supports the use of vowels as a comparison object for forensic speaker recognition between Chinese dialects and Mandarin. However, about 55% of the dialect points differ from Mandarin by more than five vowels, so the identification value of the differing vowels should be fully explored in order to widen the range of characteristic segments available for comparison. Additionally, vowel nasalization is a widespread phenomenon across all dialect points. When constructing the vowel acoustic space, it is important to consider the challenge of distinguishing the nasal and oral formants of nasalized vowels. (3) This paper uses 
computer speech workstations to edit the formants of speech samples and then uses the generated samples to construct different control groups for listening and discrimination, in order to solve the problem of how to accurately distinguish the oral and nasal formants of nasalized vowels when nasalized vowels are used to construct the vowel acoustic space in forensic speaker recognition. The findings show a pattern in how the speech changes after the oral and nasal formants are attenuated, and the technique can reliably discriminate the oral and nasal formants of nasalized vowels. The combination of "formant editing" and "auditory perception" established in this paper can provide a basis for the acoustic characterization of oral and nasal formants in forensic speaker recognition, voice perception, voice recognition, and speech therapy. (4) To address the lack of identical segments available for comparison when dialectal disguised speech is used in forensic speaker recognition, the vowels of a speaker's dialect and Mandarin are placed in the same vowel acoustic space, and the similarities and differences among their shared and non-shared vowels are tallied. When four indicators (vowel spatial contour, vowel similarity, vowel position relationship, and variability of nasalization phenomena) were analyzed together, speakers were found to show good within-speaker stability and between-speaker variability in vowel acoustic space. The discriminant method based on the relationships in the vowel acoustic spatial distribution is thus one of the current qualitative approaches that can successfully recognize speakers using dialectal disguised speech, and one that can successfully address the shortage of identical phonetic segments available for comparison in forensic speaker recognition. (5) This research first examines the discriminant capability of nine vowel acoustic space quantifiers and determines the optimal 
discriminant index for each vowel acoustic space (a vowel acoustic space composed of a different number of peripheral vowels), in order to address the application of vowel acoustic space quantifiers to forensic speaker identification between Chinese dialects and Mandarin. A multi-metric intersection discriminant approach was then proposed; with a confusion rate of only about 5%, it produced recall rates of about 90%, 89%, 85%, and 89%, respectively, in the five-vowel, seven-vowel, and ten-vowel spaces. Finally, factor analysis was used to build a comprehensive factor score model for each vowel acoustic space, and a comprehensive factor score discrimination approach was proposed. This led to a recall rate of over 90% in the seven-vowel and ten-vowel spaces, 86% and 89%, respectively. This study has innovative value in promoting the intersection of dialectology with other disciplines; its conclusions are of practical significance for further improving the theory of speaker identification between Chinese dialects and Mandarin and for solving problems encountered in handling cases; it offers guidance for combining dialect research with practical applications and for making better use of the social benefits of dialect research results; and it is also significant for the study of speaker recognition between Chinese and other languages. |
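
To make the idea of a vowel acoustic space quantifier concrete, the following is a minimal sketch of one widely used metric, the area of the polygon spanned by a speaker's peripheral vowels in the F1/F2 plane. The abstract does not name the nine quantifiers it evaluates, so treating polygon area as one of them is an assumption; the function name `vowel_space_area` and the formant values in the usage lines are hypothetical and purely illustrative.

```python
from math import atan2


def vowel_space_area(formants):
    """Area (Hz^2) of the polygon whose corners are the given vowels.

    `formants` maps a vowel label to its mean (F1, F2) in Hz.
    Assumption: the vowels are peripheral, so ordering them by angle
    around their centroid traces the outline of the vowel space.
    """
    if len(formants) < 3:
        raise ValueError("at least three vowels are needed to span an area")

    # Plot F2 on the x-axis and F1 on the y-axis, as in a vowel chart.
    points = [(f2, f1) for f1, f2 in formants.values()]

    # Order the corners counter-clockwise around the centroid.
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    ordered = sorted(points, key=lambda p: atan2(p[1] - cy, p[0] - cx))

    # Shoelace formula over consecutive corners (wrapping to the first).
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(ordered, ordered[1:] + ordered[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical mean formants (F1, F2) in Hz for one speaker.
mandarin = {"i": (300, 2300), "a": (800, 1300), "u": (300, 800)}
dialect = {"i": (320, 2250), "a": (780, 1350), "u": (330, 850)}

print(vowel_space_area(mandarin))
print(vowel_space_area(dialect))
```

Comparing such a value between a speaker's dialect and Mandarin recordings is one way a spatial metric could enter a quantitative comparison; how the study combines several metrics in its intersection and factor-score approaches is not specified here and is not reproduced by this sketch.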