| The voice is a medium for exchanging ideas and expressing emotions. Because of excessive voice use and poor living habits, the vocal cords are prone to damage or pathological change, which impairs phonation and lowers people’s quality of life. Current research on pathological voice recognition mainly focuses on automatic recognition that combines acoustic analysis with machine learning. When the training set and the test set come from the same database, a high recognition rate can often be achieved. However, when they come from different databases, objective factors such as the speaker’s language, etiology, and recording environment cause the recognition results to drop significantly relative to the same-database case. How to resolve the mismatch between the source domain and the target domain has therefore become a research hotspot in cross-database pathological voice recognition. Approaching the problem from the perspective of reducing the feature difference between the source domain and the target domain, this paper proposes a cross-database pathological voice recognition method based on manifold joint transfer. Features in the original space are transformed into the Grassmann manifold space, which preserves their high-dimensional structure, gives them better geometric properties, and reduces feature distortion. The ε-dragging method is used to reconstruct the label matrix of the regression model, making the model better suited to classification tasks. At the same time, the maximum mean discrepancy (MMD) is used to measure both the overall difference between the source domain and the target domain and the local difference between different categories, and these measures serve as the regularization terms of the least squares regression model. A graph embedding method ensures the consistency of the label structure when the local difference is measured, so that the mapping matrix minimizes the feature difference between the source domain and the target domain while fitting the data to the given labels. On the basis of this method, a cross-database pathological voice recognition system under manifold joint transfer is constructed: it minimizes the feature distribution difference between the source domain and the target domain, fine-tunes a deep belief network with the source domain data, and then recognizes and classifies the target domain data, completing the cross-database recognition of pathological voices. The proposed manifold joint transfer least squares regression algorithm is evaluated in multi-class experiments under multiple cross-database settings built from different pathological voice databases. Compared with unmapped features, features from different databases mapped by the algorithm improve the recognition rate by 8.55%, and the highest recognition rate improves by 13.87%, in the cross-database classification experiments. In addition, a deep belief network based on model transfer is used as the classifier, yielding the manifold joint transfer least squares regression-deep belief network (MJTLSR-DBN) cross-database pathological voice recognition system. This system improves the average recognition rate by 6.4% in the cross-database recognition experiment on the self-built pathological voice database. |
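Two of the ingredients named in the abstract, the maximum mean discrepancy and ε-dragging, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a linear kernel for the MMD, pseudo-labels for the target domain when computing the per-category term, and the standard ε-dragging form in which the relaxed target is the one-hot label plus a signed, non-negative slack. All function names are hypothetical, and the Grassmann manifold mapping and graph embedding term are omitted.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def linear_mmd2(source, target):
    """Squared MMD under a linear kernel (assumption): ||mean_s - mean_t||^2."""
    mu_s = mean_vector(source)
    mu_t = mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


def class_conditional_mmd2(source, source_labels, target, target_pseudo_labels):
    """Sum of per-class linear MMD^2 over classes present in both domains.

    Target labels are assumed to be pseudo-labels from some earlier classifier.
    """
    total = 0.0
    shared = set(source_labels) & set(target_pseudo_labels)
    for c in shared:
        xs = [x for x, y in zip(source, source_labels) if y == c]
        xt = [x for x, y in zip(target, target_pseudo_labels) if y == c]
        total += linear_mmd2(xs, xt)
    return total


def epsilon_dragging_targets(labels, predictions, num_classes):
    """Relaxed regression targets T = Y + B * M (elementwise).

    Y is the one-hot label matrix, B is +1 for the true class and -1 otherwise,
    and M >= 0 is the slack. For fixed predictions P, the slack that minimises
    the squared error is M = max(B * (P - Y), 0).
    """
    targets = []
    for y, pred in zip(labels, predictions):
        row = []
        for c in range(num_classes):
            y_c = 1.0 if c == y else 0.0
            b_c = 1.0 if c == y else -1.0
            m_c = max(b_c * (pred[c] - y_c), 0.0)
            row.append(y_c + b_c * m_c)
        targets.append(row)
    return targets


if __name__ == "__main__":
    src = [[0.0, 0.0], [2.0, 2.0]]
    tgt = [[1.0, 3.0], [3.0, 5.0]]
    print(linear_mmd2(src, tgt))
    print(class_conditional_mmd2(src, [0, 1], tgt, [0, 1]))
    print(epsilon_dragging_targets([0], [[1.5, -0.2, 0.3]], 3))
```

In a joint-transfer objective of the kind described, the two MMD terms would be evaluated on the mapped features and added to the least squares loss as regularizers, while the relaxed targets replace the fixed one-hot labels so that the true-class score can exceed 1 and the other scores can fall below 0 without penalty.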