| In multi-speaker speech, the target speaker's speech is interfered with by the speech of other speakers, and the performance of a speaker verification system degrades significantly when the interfered speech is used for verification. Speaker verification for multi-speaker speech consists of two parts: extracting the target speaker's speech and verifying the speaker identity of the extracted speech. This paper studies deep-learning-based speaker extraction and verification methods for multi-speaker speech. The main work is as follows: (1) An enrollment speaker model based on deep learning is proposed for speaker verification. The speaker model can be trained separately for each enrollment speaker. First, an identity feature extraction network is established to extract deep embeddings from the reference utterances of each enrollment speaker. The mean of the deep embeddings of a speaker's utterances is used as the deep feature label of that speaker. The enrollment speaker network consists of several enrollment speaker models. A speaker model is selected from the enrollment speaker network according to the deep feature label, and this model is used to extract the target speaker's speech from the mixture speech. Finally, the extracted speech is fed to the speaker verification network to verify whether it belongs to the target speaker. The experimental results show that the deep-learning-based enrollment speaker model is effective for speaker verification. (2) An enrollment speaker model based on the attention mechanism is proposed for speaker verification. The attention mechanism is used to further process the deep feature label of the target speaker and the mixture speech, so that the speaker model pays more attention to the important speech segments in the mixture and ignores the useless ones. This allows the speaker model to learn the target speaker's information more effectively. The target speaker's speech is extracted by this method, and the extracted speech is then used to verify the speaker. The experimental results show that the attention-based enrollment speaker model is effective for speaker verification. |
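
The three steps named in the abstract can be illustrated with a minimal sketch: averaging embeddings into a deep feature label, selecting an enrollment speaker model by that label, and weighting mixture segments by attention. The abstract does not specify the similarity measure or the attention form, so the cosine similarity and scaled dot-product softmax used here are assumptions, and all function names, speaker keys, and toy vectors are hypothetical. Real embeddings would come from the identity feature extraction network rather than hand-written lists.

```python
import math


def mean_embedding(embeddings):
    """Average equal-length embedding vectors into one deep feature label."""
    if not embeddings:
        raise ValueError("need at least one embedding")
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity (assumed metric; the abstract does not name one)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_speaker(label, enrolled_labels):
    """Return the key of the enrolled label most similar to `label`."""
    return max(enrolled_labels, key=lambda name: cosine(label, enrolled_labels[name]))


def attention_weights(label, segments):
    """Softmax over scaled dot products between the label and each segment.

    This is one common attention form, assumed here for illustration only.
    """
    scale = math.sqrt(len(label))
    scores = [sum(x * y for x, y in zip(label, s)) / scale for s in segments]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 2-dimensional toy embeddings for two enrolled speakers.
enrolled = {
    "spk_a": mean_embedding([[1.0, 0.0], [0.8, 0.2]]),
    "spk_b": mean_embedding([[0.0, 1.0], [0.2, 0.8]]),
}

# Label of the target speaker, built from its reference embeddings.
target_label = mean_embedding([[0.9, 0.1], [0.7, 0.1]])
chosen = select_speaker(target_label, enrolled)

# Hypothetical segment-level features of a mixture utterance.
mixture_segments = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights = attention_weights(enrolled[chosen], mixture_segments)
```

In this toy setup the segment that points in the same direction as the selected label receives the largest weight, which mirrors the stated goal of emphasizing important segments and down-weighting the rest.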