Research On Improved Speaker Segmentation And Clustering Algorithm

Posted on:2020-12-01

Degree:Master

Type:Thesis

Country:China

Candidate:X H Gao

Full Text:PDF

GTID:2518306047978449

Subject:Control Engineering

Abstract/Summary:

Speaker segmentation and clustering is a technology that splits an audio recording containing multiple voices into segments and labels each segment with its speaker, so as to determine "who speaks when". It is an important part of the practical application of speech signal processing. Considerable progress has been made in recent years, but several problems remain: speech is corrupted by various kinds of noise; the number of speakers is generally variable and no prior information about it is available; and the accuracy of some algorithms needs to be improved. Solving these problems effectively is an important research direction and the main subject of this thesis. To address the defects of the double-threshold endpoint detection segmentation algorithm, the self-organizing neural network clustering algorithm, and the k-means speech clustering algorithm, this thesis improves these algorithms and applies them to meeting recordings, achieving good results. The main work and innovations are as follows.

Firstly, the thesis summarizes the state of development of speaker segmentation and clustering technology, sets out the basic steps, and describes the fundamentals of preprocessing, feature extraction, speech segmentation, and clustering.

Secondly, the traditional double-threshold endpoint detection segmentation algorithm is improved to address its deficiencies. The short-time average zero-crossing rate feature is replaced with the better-performing spectral centroid feature, median filtering is applied to the feature curve, and an algorithm is designed that selects the threshold from the local maxima of the histogram of the feature sequence. Experimental results show that this algorithm improves the noise robustness and accuracy of detection and can detect multiple speech segments, making it better suited to the requirements of speaker segmentation.

Thirdly, to address the shortcomings of the traditional self-organizing neural network clustering algorithm and the k-means clustering algorithm, a k-means speaker clustering algorithm improved by a self-organizing neural network is proposed. The winning pattern of the competitive-layer neurons after training is used to estimate the number of categories, and the trained network weights serve as the initial cluster centers of the k-means algorithm, which then performs the speaker clustering. Experimental results show that this method improves clustering accuracy. It compensates both for the slow convergence of the self-organizing neural network algorithm and its failure to provide accurate clustering information, and for the k-means algorithm's need to be given the number of clusters in advance and its strong sensitivity to the choice of initial cluster centers.

Finally, the work of the thesis is summarized and directions for further research are pointed out.
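The abstract does not give the exact threshold rule used in the improved endpoint detection, so the following is only a minimal sketch of the general idea: smooth a per-frame feature curve with a median filter, build a histogram of the smoothed values, and derive a threshold from its local maxima. The weighted-average rule `(weight * m1 + m2) / (weight + 1)`, the fallback to the mean, all function names, and the toy feature values (stand-ins for a per-frame spectral centroid) are assumptions made for illustration, not the thesis's method.

```python
from statistics import median

def median_filter(values, width=5):
    """Smooth a feature curve with a sliding median (windows shrink at the edges)."""
    half = width // 2
    return [median(values[max(0, i - half): i + half + 1]) for i in range(len(values))]

def histogram(values, bins=10):
    """Return (counts, bin_centers) for equally wide bins spanning the data range."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins or 1.0  # avoid a zero bin width for constant input
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / step), bins - 1)] += 1
    centers = [lo + (k + 0.5) * step for k in range(bins)]
    return counts, centers

def local_maxima(counts):
    """Indices of bins strictly higher than both neighbours (missing neighbours count as 0)."""
    padded = [0] + counts + [0]
    return [k for k in range(len(counts))
            if padded[k + 1] > padded[k] and padded[k + 1] > padded[k + 2]]

def select_threshold(feature, weight=5.0, bins=10):
    """Assumed rule: weighted average of the two lowest-valued histogram peaks."""
    counts, centers = histogram(feature, bins)
    peaks = local_maxima(counts)
    if len(peaks) < 2:
        return sum(feature) / len(feature)  # assumed fallback: mean of the feature
    m1, m2 = centers[peaks[0]], centers[peaks[1]]
    return (weight * m1 + m2) / (weight + 1)

def speech_segments(feature, threshold):
    """Return (start, end) frame index pairs, end exclusive, where feature > threshold."""
    segments, start = [], None
    for i, v in enumerate(feature):
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(feature)))
    return segments

# Toy feature curve: two speech bursts plus a one-frame noise spike at index 17.
raw = [0.1] * 8 + [0.9] * 6 + [0.1] * 3 + [0.9] + [0.1] * 4 + [0.9] * 6 + [0.1] * 8
smoothed = median_filter(raw, width=5)
threshold = select_threshold(smoothed)
segments = speech_segments(smoothed, threshold)
print(threshold, segments)
```

In this toy input the median filter removes the isolated spike before thresholding, which illustrates why smoothing the feature curve can help detection under noise. Because the threshold is derived from the data rather than fixed by hand, the same code can return several speech segments per recording.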
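The combination of a self-organizing network with k-means described in the abstract can be sketched roughly as follows. This is a hedged illustration, not the thesis's implementation: the 1-D neuron chain, the decay schedules, the `min_share` rule for deciding which neurons count as "winning", every function name, and the synthetic 2-D data (stand-ins for speaker feature vectors) are all assumptions.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda j: sq_dist(point, centers[j]))

def train_som(data, n_neurons=6, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D chain of competitive neurons; return their weight vectors."""
    rng = random.Random(seed)
    weights = [list(rng.choice(data)) for _ in range(n_neurons)]
    sigma0 = n_neurons / 2
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                   # learning rate decays linearly
        sigma = max(sigma0 * (1 - frac), 0.3)   # neighbourhood width shrinks
        for x in rng.sample(data, len(data)):
            win = nearest(x, weights)
            for j, w in enumerate(weights):
                h = math.exp(-((j - win) ** 2) / (2 * sigma ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (x[d] - w[d])
    return weights

def som_initial_centers(data, weights, min_share=0.05):
    """Keep the weights of neurons that win at least min_share of the samples."""
    wins = [0] * len(weights)
    for x in data:
        wins[nearest(x, weights)] += 1
    cutoff = min(min_share * len(data), max(wins))  # always keeps at least one neuron
    return [w for w, c in zip(weights, wins) if c >= cutoff]

def kmeans(data, centers, max_iter=100):
    """Standard k-means started from the given centers; returns (centers, labels)."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(x, centers) for x in data]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(len(centers)):
            members = [x for x, l in zip(data, labels) if l == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels

# Synthetic data: three well-separated groups standing in for three speakers.
rng = random.Random(1)
true_means = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
data = [(mx + rng.gauss(0, 0.3), my + rng.gauss(0, 0.3))
        for mx, my in true_means for _ in range(30)]

weights = train_som(data)
init = som_initial_centers(data, weights)   # number of winners = estimated k
centers, labels = kmeans(data, init)
k = len(centers)
print(k, centers)
```

The number of clusters is never passed in: it falls out of how many neurons win a meaningful share of the samples, and k-means starts from trained weights rather than random points. With this simple pruning rule several neurons can share one true cluster, so the estimated `k` may exceed the real number of speakers. A practical system would presumably need a stricter selection or merging step.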

Keywords/Search Tags:

Speech Segmentation, Speaker Clustering, Endpoint Detection, K-means, Self-organizing Neural Networks

Related items

1	Speaker Segmentation For Mixed Speech In Multi-person Conversations
2	Speaker Recognition Based On Continuous Hidden Markov Model
3	Design And Implementation On Text-Dependent Speaker Recognition System For Short Speech
4	Research On Improved K-means And Self-organizing Map Neural Networks And Their Applications
5	Research On Speaker Recognition In Conversational Speech
6	The Study Of Hierarchical Speaker Segmentation And Relative Algorithms
7	Research On Speaker Segmentation And Clustering
8	Research On Speaker-Independent Speech Recognition System Based On HMM
9	Analysis Of Speaker Roles For Multi-speaker Conversational Speech
10	Research Of Speech Endpoint Detection Based On Neural Network