
A Study Of Pathological Voice Feature Extraction And Classification

Posted on: 2021-11-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y Peng
Full Text: PDF
GTID: 2514306041460724
Subject: Master of Engineering
Abstract/Summary:
Factors such as improper and excessive voice use have made voice disorders common: the vocal cords cannot vibrate regularly, sound quality is poor, and daily life is affected. Clinical attention to the early evaluation of pathological voices has therefore become increasingly important. In current diagnosis and treatment, expert auditory perception and invasive laryngoscopy are the mainstream examination methods. The former depends on the expert's subjective experience, while the latter causes the patient pain. Researchers have therefore tried to analyze voice signals objectively and non-invasively with the help of computers, in order to evaluate and classify pathological voices automatically and to provide a unified quantitative standard for evaluating voice quality. Based on research trends at home and abroad, this thesis starts from acoustic feature parameters and pattern recognition and combines signal processing with deep learning: a deep neural network is trained to extract features automatically from the spectrum of the voice signal, and a classifier then recognizes them to classify pathological voice disorders. The main research contents of this thesis are as follows:

1. A variety of acoustic feature parameters of pathological speech signals are analyzed, covering traditional fundamental-frequency, amplitude, glottal-wave-based, spectral, and cepstral parameters. After simple preprocessing, two feature parameter sets are extracted from sustained-vowel voice signals: the Mel Frequency Cepstrum Coefficients (MFCC) with their first- and second-order dynamic coefficients, and a multidimensional cepstrum-type combination that includes the MFCC and the cepstral peak feature (CPPS).

2. Using a joint model of acoustic frequency and modulation frequency, the spectrogram obtained after the modulation transform is expressed numerically and abstractly, and the characteristics of pathological voice signals are measured from the perspective of energy distribution. A classification system for the level of pathological voice disorder is constructed, which confirms the ability of modulation spectrum features to distinguish the degree of pathology. Compared with the MFCC features and the multidimensional cepstrum combination, the recognition rate is improved.

3. Combining signal processing with deep learning and exploiting the strength of convolutional neural networks in extracting local image features, an automatic evaluation system is designed to improve the classification accuracy of the voice disorder level. After several training runs and repeated adjustment of the model, a set of good model parameters is determined. The experiments verify the feasibility and effectiveness of the proposed system combining the log-mel spectrum with a convolutional neural network, which improves the recognition rate of the classification system.

4. PCA is used to reduce the dimensionality of the features extracted by the convolutional neural network, removing redundant information and retaining the features with high expressive power. The experimental results show, on the one hand, that dimensionality reduction not only lowers the computational cost of the model but also improves the classification accuracy of the system to some extent; on the other hand, they show that the features the convolutional neural network extracts from image information contain some redundant components.
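As a rough illustration of the first- and second-order dynamic coefficients mentioned in item 1, the sketch below applies the commonly used regression formula for delta features to a sequence of cepstral frames. It is not taken from the thesis: the function names `delta` and `add_dynamic_features`, the window half-width `n=2`, and the edge handling (repeating the first and last frame) are all assumptions made for the example.

```python
def delta(frames, n=2):
    """Regression-based dynamic (delta) coefficients for a list of feature frames.

    frames: list of equal-length lists, one cepstral vector per frame.
    n: half-width of the regression window (assumed value, not from the thesis).
    """
    if not frames:
        return []
    num_frames = len(frames)
    dim = len(frames[0])
    # Normalizer of the regression formula: 2 * sum_{k=1..n} k^2
    denom = 2 * sum(k * k for k in range(1, n + 1))
    out = []
    for t in range(num_frames):
        row = []
        for d in range(dim):
            acc = 0.0
            for k in range(1, n + 1):
                nxt = frames[min(t + k, num_frames - 1)][d]  # repeat last frame at the edge
                prv = frames[max(t - k, 0)][d]               # repeat first frame at the edge
                acc += k * (nxt - prv)
            row.append(acc / denom)
        out.append(row)
    return out


def add_dynamic_features(frames, n=2):
    """Append first- and second-order deltas to each static feature frame."""
    d1 = delta(frames, n)   # first-order dynamic coefficients
    d2 = delta(d1, n)       # second-order: delta of the delta
    return [s + a + b for s, a, b in zip(frames, d1, d2)]
```

With 13 static coefficients per frame, `add_dynamic_features` would yield 39-dimensional vectors, a common layout for MFCC-based feature sets; whether the thesis uses that dimensionality is not stated in the abstract.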
Keywords/Search Tags:pathological voice, modulation spectrum, convolutional neural network, PCA dimension reduction