
Study On Application Of Spectral Map In Speaker Gender And Age Recognition

Posted on: 2021-04-17
Degree: Master
Type: Thesis
Country: China
Candidate: S B Dai
Full Text: PDF
GTID: 2518306110459244
Subject: Circuits and Systems
Abstract/Summary:
Speaker gender and age recognition is a sub-field of natural language processing technology and is a challenging task. In human-computer interaction systems, recognizing a speaker's gender and age makes it possible to provide personalized services to specific users. As research on human-computer interaction systems deepens, user experience requirements keep rising. The technology is increasingly used in automatic voice information queries, unmanned supermarkets, health care, entertainment, and other fields, and its use is expected to keep growing.

Existing speech feature extraction and modeling algorithms are easily affected by environmental noise, achieve low gender and age recognition accuracy, are prone to losing information in age recognition, and cannot fully represent speaker attribute information. To address these problems, this thesis proposes combining spectrogram features with the DenseNet network to perform text-independent recognition of speaker gender and age. The main research work is as follows:

(1) A novel spectrogram feature generation algorithm is proposed. Large frames are first divided into small frames, which increases the number of spectrograms generated so that they contain more comprehensive speaker information. The algorithm also extracts the background noise and silent segments of the speech signal to generate spectrograms, which are used as the system's input feature maps. Tests show that the algorithm effectively improves the system's noise robustness and accuracy.

(2) A speech-based gender and age recognition model is built on the DenseNet network structure. By optimizing the pairing of the activation function and the classification function, the model alleviates overfitting and vanishing gradients in deep networks on small data sets and strengthens the effect of cross-layer deep convolution, thereby reducing the number of feature maps. This saves system computing resources while improving the recognition rate.

(3) The influence of spectrogram frame length, learning rate, network structure parameters, number of iterations, and other settings on the system recognition rate is determined, and the optimal parameter configuration is obtained.

Based on these results, an online speaker recognition system was designed. On an experimental platform based on Python 3.10 and TensorFlow 1.14.0, tests with different speech corpora gave a gender recognition rate of 99% and an age recognition rate of 88.6%. The accuracy of simultaneous gender and age recognition reached 90%, and the gender and age of a single speaker can be recognized within two seconds.
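
The frame-splitting idea in item (1) can be illustrated with a minimal sketch. It is not the thesis's algorithm: the frame lengths, hop size, Hann window, and the function names (`split_into_frames`, `magnitude_spectrum`, `spectrograms_from_signal`) are all assumptions chosen for illustration, and a naive DFT from the standard library stands in for the FFT routine a real system would use. The sketch only shows how cutting a recording into large frames, and each large frame into small frames, yields one spectrogram per large frame.

```python
import cmath
import math


def split_into_frames(signal, frame_len, hop):
    """Cut a 1-D signal into frames of frame_len samples, advancing by hop."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def magnitude_spectrum(frame):
    """Apply a Hann window and return |DFT| for the non-negative frequency bins."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    # Naive O(n^2) DFT, kept dependency-free; an FFT would be used in practice.
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed)))
            for k in range(n // 2 + 1)]


def spectrograms_from_signal(signal, large_len, small_len, small_hop):
    """Return one spectrogram (a list of magnitude spectra) per large frame."""
    # Assumption: large frames do not overlap; small frames overlap by small_hop.
    large_frames = split_into_frames(signal, large_len, large_len)
    return [[magnitude_spectrum(small)
             for small in split_into_frames(large, small_len, small_hop)]
            for large in large_frames]


# Toy input (hypothetical values): 0.1 s of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(800)]
specs = spectrograms_from_signal(tone, large_len=400, small_len=64, small_hop=32)
print(len(specs), len(specs[0]), len(specs[0][0]))
```

Each element of `specs` is a time-by-frequency grid that could be rendered as an image and fed to a convolutional network. Shortening `large_len` produces more spectrograms from the same recording, which is the effect the abstract describes.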
Keywords/Search Tags: speaker recognition, gender recognition, age recognition, spectrogram, DenseNet