Research On Improved Speaker Segmentation And Clustering Algorithm

Posted on: 2020-12-01
Degree: Master
Type: Thesis
Country: China
Candidate: X H Gao
Full Text: PDF
GTID: 2518306047978449
Subject: Control Engineering
Abstract/Summary:
Speaker segmentation and clustering is a technology that splits an audio recording containing several voices into segments and labels each segment with its speaker, answering the question of "who spoke when". It is an important part of the practical application of speech signal processing. Considerable progress has been made in recent years, but several problems remain: speech is contaminated by various kinds of noise; the number of speakers is generally variable and not known in advance; and the accuracy of some algorithms still needs to be improved. Solving these problems effectively is an important research direction and the main subject of this thesis. The thesis addresses the shortcomings of the two-threshold endpoint detection segmentation algorithm, the self-organizing neural network clustering algorithm, and the k-means speech clustering algorithm, improves them, and applies them to meeting recordings with good results. The main work and innovations are as follows.

First, the thesis summarizes the state of development of speaker segmentation and clustering technology, sets out the basic steps, and describes the fundamentals of preprocessing, feature extraction, speech segmentation, and clustering.

Second, the traditional two-threshold endpoint detection segmentation algorithm is improved to address its shortcomings. The short-time average zero-crossing rate feature is replaced with the spectral centroid, which performs better; median filtering is applied to the feature curve; and an algorithm is designed that selects the thresholds from the local maxima of the histogram of the feature sequence. Experimental results show that the improved algorithm is more robust to noise, detects endpoints more accurately, and can detect multiple speech segments, which makes it better suited to the requirements of speaker segmentation.

Third, to address the shortcomings of the traditional self-organizing neural network clustering algorithm and the k-means clustering algorithm, a k-means speaker clustering algorithm improved with a self-organizing neural network is proposed. The pattern of winning neurons in the network's competitive layer after training is used to estimate the number of clusters, and the trained network weights serve as the initial cluster centers for k-means, which then performs the speaker clustering. Experimental results show that this method improves clustering accuracy. It compensates for the slow convergence of the self-organizing neural network and its failure to provide accurate clustering information, and also for the k-means algorithm's need to be given the number of clusters in advance and its strong sensitivity to the choice of initial cluster centers.

Finally, the work of the thesis is summarized and directions for further research are pointed out.
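
The endpoint-detection steps named in the abstract (median filtering of a feature curve, threshold selection from histogram local maxima, and two-threshold detection) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything specific here is an assumption: the function names are hypothetical, the threshold is taken as a weighted mean of the two tallest histogram peaks, and the input is a single ready-made per-frame feature curve (for example spectral centroid or short-time energy) rather than audio.

```python
from statistics import median


def median_filter(values, width=5):
    """Smooth a feature curve with a sliding median (window shrinks at the edges)."""
    half = width // 2
    return [median(values[max(0, i - half): i + half + 1]) for i in range(len(values))]


def histogram_peaks(values, bins=10):
    """Return the bin centres of local maxima of the histogram, tallest first."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins or 1.0          # avoid a zero bin width for flat input
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / step), bins - 1)] += 1
    padded = [0] + counts + [0]             # lets the first and last bins be peaks
    peaks = [(counts[i], lo + (i + 0.5) * step)
             for i in range(bins)
             if padded[i + 1] > padded[i] and padded[i + 1] >= padded[i + 2]]
    return [centre for _, centre in sorted(peaks, reverse=True)]


def pick_threshold(values, weight=5.0, bins=10):
    """Assumed rule: weighted mean of the two tallest histogram peaks.

    A larger weight pulls the threshold toward the lower (background) peak.
    """
    peaks = histogram_peaks(values, bins)
    if len(peaks) < 2:
        raise ValueError("histogram has fewer than two local maxima")
    low_peak, high_peak = sorted(peaks[:2])
    return (weight * low_peak + high_peak) / (weight + 1)


def detect_segments(curve, low, high):
    """Two-threshold detection: keep runs above `low` that also exceed `high`."""
    segments, start, confirmed = [], None, False
    for i, v in enumerate(list(curve) + [float("-inf")]):   # sentinel closes the last run
        if v > low:
            if start is None:
                start, confirmed = i, False
            confirmed = confirmed or v > high
        else:
            if start is not None and confirmed:
                segments.append((start, i - 1))
            start = None
    return segments


# Toy feature curve: two speech-like bursts plus a one-frame noise spike at frame 3.
curve = [0.1] * 10 + [1.0] * 8 + [0.1] * 10 + [0.9] * 8 + [0.1] * 10
curve[3] = 1.0
smooth = median_filter(curve)
low = pick_threshold(smooth, weight=5.0)
high = pick_threshold(smooth, weight=1.0)
print(detect_segments(smooth, low, high))
```

On this toy curve the median filter removes the isolated spike before thresholding, and both bursts are reported as separate segments, which is the multi-segment behavior the abstract describes. Real recordings would need frame-level feature extraction first, which this sketch leaves out.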
Keywords/Search Tags:Speech Segmentation, Speaker Clustering, Endpoint Detection, K-means, Self-organizing Neural Networks
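
The clustering idea in the abstract (estimate the number of clusters from the winning neurons of a trained competitive layer, then seed k-means with the trained weights) can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis's implementation: the competitive layer uses winner-only updates (a self-organizing map with no neighborhood function), neurons start on a small circle around the data mean, a neuron counts as a cluster if it wins at least one sample, and 2-D toy points stand in for speaker features. All function names are hypothetical.

```python
import math


def nearest(point, centres):
    """Index of the centre closest to `point` (Euclidean distance)."""
    return min(range(len(centres)), key=lambda j: math.dist(point, centres[j]))


def train_competitive_layer(data, n_neurons=6, epochs=10, lr0=0.5):
    """Winner-take-all training: only the winning neuron moves toward each sample."""
    mean = [sum(col) / len(data) for col in zip(*data)]
    # Assumed deterministic init (2-D only): neurons on a small circle around the mean.
    weights = [[mean[0] + 0.1 * math.cos(2 * math.pi * j / n_neurons),
                mean[1] + 0.1 * math.sin(2 * math.pi * j / n_neurons)]
               for j in range(n_neurons)]
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)      # linearly decaying learning rate
        for x in data:
            w = weights[nearest(x, weights)]
            for d in range(len(x)):
                w[d] += lr * (x[d] - w[d])
    return weights


def som_seeded_kmeans(data, n_neurons=6, max_iter=100):
    """Estimate k from winning neurons, then run k-means from their weights."""
    weights = train_competitive_layer(data, n_neurons)
    wins = [0] * n_neurons
    for x in data:
        wins[nearest(x, weights)] += 1
    # Assumed rule: each neuron that wins at least one sample stands for one cluster.
    centres = [tuple(w) for w, n in zip(weights, wins) if n > 0]
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(x, centres) for x in data]
        if new_labels == labels:             # assignments stable: converged
            break
        labels = new_labels
        for j in range(len(centres)):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:                      # keep the old centre if a cluster empties
                centres[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return len(centres), labels, centres


# Toy data: three well-separated groups of four 2-D points each.
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 0), (10, 1), (11, 0), (11, 1),
        (5, 10), (5, 11), (6, 10), (6, 11)]
k, labels, centres = som_seeded_kmeans(data)
print(k, labels, centres)
```

With well-separated groups, the neurons that never win stay near the data mean and are discarded, so k-means starts with one seed per group instead of a user-supplied k and random centers. Overlapping clusters would likely need a stricter winning criterion than "at least one sample".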