A Study Of Speech Syllable Evoked Auditory Brainstem Response On Electrophysiological Characteristics And Speech Recognition Assessments

Posted on: 2016-06-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q Y Fu
Full Text: PDF
GTID: 1224330482956553
Subject: Otolaryngology science
Abstract/Summary:
Background: Recognizing speech and communicating are essential everyday human behaviors. Speech recognition requires not only adequate auditory thresholds but also sufficient attention and orientation to the auditory signal, so that verbal information is selectively listened to and then analyzed, collated, associated and retained. Speech recognition therefore forms a complex functional network that spans awareness, perception, reasoning and judgment and draws on the auditory system, psychology and the brain. Although our understanding of speech recognition has advanced considerably, the overall mechanism and the internal details of the underlying neural network remain unclear. When the auditory system receives a speech sound, the auditory hair cells transduce the sound into neural signals, which are relayed step by step through several auditory centers until a percept forms in the cerebral cortex and elicits a behavioral response. Within this process, the functional what-where pathway of the auditory system processes the parts of speech information separately, an idea that is an important theoretical basis of the speech recognition mechanism. In contrast to the contralateral dominance of the physiological auditory pathway, the functional what-where pathway conducts speech information in parallel. Speech sound begins as a vibration of the vocal cords driven by airflow from the lungs and is then loaded with information through the joint action of the acoustic resonators (such as the laryngeal, pharyngeal, oral and nasal cavities) and the articulatory organs (such as the tongue, lips and teeth).
Therefore, the speech signal can be divided into a semantic part, such as vowels and consonants, which arises from the joint action of the resonators and articulators, and a non-semantic part, such as age and gender, which relates to the vibration characteristics of the vocal cords. Based on these two parts, the auditory what-where pathway theory presumes that they are processed in parallel: the what-pathway handles the non-semantic information, and the where-pathway handles the semantic information. The speech evoked auditory brainstem response (speech-ABR) is regarded as an important electrophysiological technique for detecting the neural activity of the what-where pathway during speech recognition. Its common measures comprise the latencies of seven feature peaks (waves V, A, C, D, E, F and O) and the amplitudes of the fundamental frequency (F0) and the first formant (F1). Among these measures, the latencies of waves D, E and F and the amplitude of F0 belong to the periodic component of the speech-ABR and reflect the neural potentials of the what-pathway as it processes the non-semantic part of speech; the latencies of waves V, A, C and O and the amplitude of F1 belong to the transient component and reflect the neural potentials of the where-pathway as it processes the semantic part.
In recent years, speech-ABR has become a popular auditory electrophysiological technique in cognitive psychology, but it has not yet attracted sufficient attention from researchers in speech recognition.

Objective: This study aimed to investigate the electrophysiological characteristics of the speech-ABR and their significance for speech auditory processing in young adults with normal hearing (NH); to analyze the correlation between these characteristics and the maximum percentage-correct score obtained on phonetically balanced monosyllabic word-recognition measures (PBmax); and to analyze how that correlation is affected by hearing status, so as to provide a basis for using speech-ABR characteristics to assess speech recognition objectively in the clinic. The work is described in five chapters.

Methods: The subjects were young adults with NH and young patients with conductive hearing loss (CHL) or sensorineural hearing loss (SNHL). In chapter one, the speech syllable /da/ was presented monaurally at three intensities (80, 60 and 40 dB HL) to 29 NH subjects, and the corresponding speech-ABRs were recorded. The latencies and amplitudes of the seven feature peaks (waves V, A, C, D, E, F and O) in the time domain, and F0 and the first to fifth formants (F1-F5) in the frequency domain, were obtained on the Matlab platform using the fast Fourier transform and analyzed statistically. The aim was to investigate the effect of stimulus intensity on subcortical auditory processing of speech and to assess subcortical asymmetry in the processing of speech elements at the brainstem level, so as to provide more clues to the mechanism of speech cognitive behavior. In chapter two, the effects of two stimulus rates (11.1/s and 20.1/s) and two stimulus intensities (80 and 60 dB HL) on the speech-ABR were observed in 24 NH subjects (24 ears) using a 2×2 factorial design.
The speech-ABR waveform quantitative score, the latencies of the feature peaks, and the amplitudes of F0 and F1 were analyzed to study how speech-ABR results differ across several common parameter combinations and to identify the factors affecting its electrophysiological characteristics. Chapter three included 33 NH subjects (33 ears). Speech-ABRs were recorded from all subjects at 80 dB HL, and the auditory mismatch negativity (MMN) was recorded with a 1 kHz frequency deviant and a 40 dB intensity deviant. The electrophysiological characteristics of the speech-ABRs and MMNs, and the relationships between MMN latencies and speech-ABR parameters (latencies in the time domain, F0 and F1 in the frequency domain), were analyzed statistically to investigate the electrophysiological relationship between auditory brainstem neurons and auditory cortical neurons, and so to shed light on the information transfer channel of the brainstem functional pathway and on the regulation exerted by the efferent system of the auditory cortex during speech processing. Chapter four included 41 NH subjects (41 ears). Speech discrimination scores were obtained with Mandarin phonemically balanced monosyllable lists delivered by speech audiometry software, and speech-ABRs were recorded at the intensity that yielded PBmax. The electrophysiological characteristics of the speech-ABR, and the relationships between PBmax and the speech-ABR parameters (latencies in the time domain, F0 and F1 in the frequency domain), were analyzed statistically, so as to provide more clues to the mechanism of speech cognitive behavior. In chapter five, the subjects formed three groups: 30 with CHL (30 ears), 27 with SNHL (27 ears), and the 41 with NH (41 ears) investigated in chapter four.
Speech discrimination scores were obtained from all subjects by the same methods as in chapter four, and both speech-ABR values and PBmax were acquired from every subject. The electrophysiological characteristics of the speech-ABR, and the relationships between PBmax and the speech-ABR parameters (latencies in the time domain, F0 and F1 in the frequency domain), were analyzed statistically to investigate how these relationships behave under different hearing states and to evaluate the clinical application of the speech-ABR to speech recognition behavior.

Results: In chapter one, the feature peaks of the speech-ABR showed significantly shorter latencies and larger amplitudes as stimulus intensity increased (p<0.05). For the same intensity increment, the latency changes differed between the periodic and the transient components. The fundamental frequency and formants of the stimulus syllable could be extracted well from the speech-ABRs; they were encoded less strongly in ascending order, and the same held for intensity. Responses to right and left monaural stimulation were similar, with no significant lateralized difference (p>0.05). In addition, the latencies of the feature peaks showed a much smaller coefficient of variation (1%-14%) than their amplitudes (31%-83%). When speech intensity was reduced in 20 dB HL steps, the latency changes of the feature peaks belonging to the periodic component were close to one another, as were those of the peaks belonging to the transient component. However, the mean latency change differed between the two components, which supports the view that the speech-ABR can reflect the different neural action potentials of the what-pathway and the where-pathway.
Therefore, the speech-ABR faithfully encoded the timing and spectral components of the speech sound, in close correspondence with stimulus intensity. The different latency characteristics of the periodic and transient components may carry further information about how speech sound is coded. Lateral asymmetry of speech processing, however, was not appreciable at the brainstem level. Among the electrophysiological properties of the speech-ABR, the latencies of the feature peaks and the amplitudes of F0 and F1 may be the better objective measures for speech recognition research.

The factorial analysis in chapter two showed no significant difference between the two stimulus rates in the speech-ABR waveform quantitative score, the latencies of the feature peaks, or the amplitudes of F0 and F1. Between the two stimulus intensities, however, all of these characteristics differed significantly except the latencies of peaks A, F and O. There was no interaction between stimulus rate and stimulus intensity. Thus, within the tested range, stimulus rate and stimulus intensity do not interact in their effect on the speech-ABR, which offers guidance for selecting stimulus parameters that shorten test time and reduce discomfort.

In chapter three, the MMN latency for the frequency deviant tended to correlate negatively with the transient components of the speech-ABR and positively with its sustained components. The MMN latency for the intensity deviant correlated positively with the speech-ABR latencies of peaks V, A and D, and negatively with the latencies of the other peaks and with the amplitudes of F0 and F1. Only the correlation between the F0 amplitude of the speech-ABR and the MMN latency for the intensity deviant was statistically significant, and it was moderate (r=-0.350, p=0.046).
This suggests that the neurons generating the frequency-deviant MMN are probably mismatched in frequency characteristics with the neurons generating the transient component of the speech-ABR and, conversely, matched with those generating the sustained component. Similarly, the neurons generating the intensity-deviant MMN are probably matched or mismatched in intensity characteristics with the neurons of the different speech-ABR components. Therefore, for the frequency characteristics of speech, the brainstem functional what-pathway may be facilitated by the descending system of the auditory cortex, whereas the where-pathway may be inhibited. Similarly, for magnitude characteristics, the brainstem functional what-where pathway may be facilitated or inhibited by the descending system of the auditory cortex. These results may offer a valuable clue for further investigation of speech perception and temporal processing.

In chapter four, the subjects performed well on the speech perception test (PBmax=94.80±6.01). When the subjects were divided into three groups (PBmax1=100%, 90%<PBmax2<100% and 80%≤PBmax3≤90%), the speech-ABR parameters, including the latencies of all feature peaks and the amplitudes of F0 and F1, differed significantly among groups (p<0.05): as PBmax decreased, the amplitudes of F0 and F1 decreased and the latencies of the feature peaks increased, except for peak C, whose latency did not differ significantly between PBmax2 and PBmax3 (p>0.05). The amplitudes of F0 and F1 showed the strongest significant positive correlations with PBmax (r=0.712 and 0.733, respectively). The latencies of all feature peaks correlated significantly and negatively with PBmax; the coefficients for peaks C and O (-0.413 and -0.324, respectively) were much smaller in magnitude than those of the other peaks.
Therefore, these electrophysiological characteristics of the speech-ABR are closely associated with the ability to discriminate Mandarin monosyllables. They may serve as an objective auditory measure of speech recognition and may be applied in combination or separately to further investigate speech perception and temporal processing.

In chapter five, the subjects with CHL and SNHL also completed the speech perception test satisfactorily, with PBmax of 84.33±16.98 and 80.89±17.57, respectively. PBmax did not differ significantly between the CHL and SNHL subjects (p>0.05) but was significantly lower in both than in the NH subjects (p<0.05). When the subjects were divided into three groups (90%<PBmax1≤100%, 80%<PBmax2≤90% and 70%<PBmax3≤80%), all subjects showed decreased amplitudes of F0 and F1 in the speech-ABR spectrum and increased latencies of the speech-ABR feature peaks as PBmax decreased. Within the same PBmax group, the latencies of the feature peaks were gradually prolonged across the SNHL, CHL and NH subjects. A 3×3 factorial analysis showed that neither hearing type nor PBmax group had a statistically significant effect on the latency of peak C (p>0.05), and that PBmax group had no significant effect on the latency of peak F (F=2.734, p=0.070). Hearing type and PBmax group interacted in their effect on the amplitude of F1 (F=2.767, p=0.032). One-way ANOVA with multiple comparisons of the speech-ABR measures (except the latency of peak C) between hearing types within each PBmax group showed that the pattern of statistically significant differences between hearing types differed between the PBmax1 and PBmax2 groups, and that no speech-ABR measure differed significantly between hearing types in the PBmax3 group (p>0.05).
Correlation analysis showed that the latencies of the speech-ABR peaks were negatively correlated with PBmax in all groups, and that the amplitudes of F0 and F1 were each strongly correlated with PBmax (r ranging from 0.71 to 0.93, p<0.05). Stepwise regression indicated that the latencies of wave A and wave F and the amplitudes of F0 and F1 together accounted for as much as 76% of PBmax. Their influence decreased in the order F0 amplitude, F1 amplitude, wave F latency, wave A latency.

Conclusions: The experiments above support several key conclusions. ① The speech-ABR faithfully encodes the timing and spectral components of the speech sound, in close correspondence with stimulus intensity. Among its electrophysiological characteristics, the latencies of the feature peaks and the amplitudes of F0 and F1 may serve as objective indicators for assessing speech recognition. ② Lateral asymmetry of speech processing was not appreciable at the brainstem level, which points to a mechanism different from the asymmetry seen at the cochlear level and in the cerebral hemispheres. ③ The characteristics of the speech-ABR are stable over the range of stimulus rates and stimulus intensities used in the present experiments, which offers guidance for selecting suitable stimulus parameters for speech-ABR measurement. ④ For the frequency characteristics of speech, the brainstem functional what-pathway may be facilitated by the descending system of the auditory cortex, whereas the where-pathway may be inhibited. Similarly, for magnitude characteristics, the brainstem functional what-where pathway may be facilitated or inhibited by the descending system of the auditory cortex.
⑤ These electrophysiological characteristics of the speech-ABR are closely associated with the ability to discriminate Mandarin monosyllables. They may serve as an objective auditory measure of speech recognition and may be applied in combination or separately to further investigate speech perception and temporal processing. ⑥ A stable correlation was found between the electrophysiological characteristics of the speech-ABR and Mandarin monosyllable discrimination in subjects with different hearing states. The amplitudes of F0 and F1 and the latencies of peaks F and A may contribute most to the clinical evaluation of PBmax and may therefore become a fundamental electrophysiological approach to evaluating speech perception. These findings enrich the electrophysiological knowledge of the speech-ABR and may promote both basic research on it and its clinical application.
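
The abstract reports F0 and F1 amplitudes obtained from a Fourier transform of the recorded response. A minimal sketch of that idea follows, written in Python rather than on the Matlab platform the thesis used, with a single-bin discrete Fourier transform standing in for a full FFT. The sampling rate, the 100 Hz and 700 Hz component frequencies, the component amplitudes, and the function name spectral_amplitude are all illustrative assumptions, not values or code from the thesis.

```python
import cmath
import math


def spectral_amplitude(samples, fs, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT).

    Hypothetical helper for illustration; not the thesis's analysis code.
    """
    n = len(samples)
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * k / fs)
        for k, x in enumerate(samples)
    )
    # Factor 2/n converts the one-sided DFT bin to a sine amplitude.
    return 2.0 * abs(acc) / n


# Synthetic stand-in for an averaged response (all values are assumptions).
fs = 10000            # sampling rate in Hz
n_samples = 400       # 40 ms window, a whole number of cycles for both tones
f0_hz, f1_hz = 100.0, 700.0

response = [
    0.3 * math.sin(2 * math.pi * f0_hz * k / fs)
    + 0.1 * math.sin(2 * math.pi * f1_hz * k / fs)
    for k in range(n_samples)
]

f0_amp = spectral_amplitude(response, fs, f0_hz)
f1_amp = spectral_amplitude(response, fs, f1_hz)
print(round(f0_amp, 3), round(f1_amp, 3))
```

The analysis window is chosen to hold a whole number of cycles of both components so that the single-bin estimate is not smeared by spectral leakage; a real recording would normally need windowing and averaging over a frequency band around each target.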
Keywords/Search Tags: Speech syllable, Electrophysiology, Speech audiometry, Auditory brainstem response, Auditory mismatch negativity, Correlation coefficient