| As China's population ages, physiological functions decline in old age, and the ability to communicate verbally declines with them, leading to problems such as unclear articulation and disordered word order. Few studies currently address speech recognition for the elderly, and the technology faces significant challenges in practical applications. This thesis focuses on speech recognition for elderly people living alone. The main work is as follows. First, to address the lack of elderly speech data, this thesis analyzes the characteristics of elderly speech and proposes a speech data enhancement method based on a mixed penalty term, which addresses the slow convergence, instability, and vanishing gradients that the original speech enhancement generative adversarial network (SEGAN) exhibits when trained on the elderly data set. The mixed penalty term combines a regularization term with the mean square error. Minimizing it brings the enhanced speech closer to the clean speech and improves its quality. Starting from limited speech data, the method generates additional realistic speech data that resembles elderly speech, which broadens the data distribution of the training set and enriches the diversity of the data set. The trained model adapts better to the test data set, which alleviates the shortage of elderly speech data. Second, to address the low performance and accuracy of speech recognition for the elderly, this thesis studies an improved deep feedforward sequential memory network (DFSMN) for elderly speech recognition. The network adds skip connections between adjacent memory modules so that gradients from the higher layers propagate well to the lower layers; compared with traditional acoustic models, this alleviates the vanishing gradient problem. The acoustic model is trained with the open source Kaldi speech recognition toolkit. The DFSMN is compared with other feedforward sequential memory networks on the elderly speech data, and its performance is also compared across different data sets. The experimental results show that the model achieves higher recognition accuracy on the elderly speech data set than the other models. Acoustic models trained on a large amount of elderly speech data achieve better performance and improve the accuracy of elderly speech recognition. |
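
The abstract does not give the exact form of the mixed penalty term, so the following is only a minimal sketch of one plausible reading: a weighted sum of an L1 term (taken here as the "regularization" part, which is an assumption) and the mean square error between enhanced and clean samples. The function name `mixed_penalty` and the weights `l1_weight` and `mse_weight` are hypothetical and do not come from the thesis.

```python
def mixed_penalty(enhanced, clean, l1_weight=1.0, mse_weight=1.0):
    """Weighted sum of a mean absolute error term and a mean square error term.

    Assumption: the "regularization" part is modeled as an L1 distance
    between enhanced and clean samples; the thesis may define it differently.
    """
    if len(enhanced) != len(clean) or not clean:
        raise ValueError("signals must be non-empty and of equal length")

    n = len(clean)
    diffs = [e - c for e, c in zip(enhanced, clean)]

    l1_term = sum(abs(d) for d in diffs) / n   # mean absolute error
    mse_term = sum(d * d for d in diffs) / n   # mean square error

    return l1_weight * l1_term + mse_weight * mse_term


# Identical signals give zero penalty; the penalty grows with the mismatch.
perfect = mixed_penalty([1.0, 2.0], [1.0, 2.0])
mismatch = mixed_penalty([1.0, 3.0], [0.0, 1.0])
```

In a GAN training loop, a term of this kind would typically be added to the generator's adversarial loss so that the generator is pulled toward the clean reference as well as toward fooling the discriminator.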
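
The skip connection between adjacent memory modules can be illustrated with a simplified memory block. This is a sketch under stated assumptions, not the thesis's implementation: it uses scalar tap weights instead of per-dimension weight vectors, an identity skip connection, and plain Python lists in place of tensors. The names `memory_block`, `look_back`, `look_ahead`, and `stride` are hypothetical.

```python
def memory_block(proj, prev_memory, look_back, look_ahead, stride=1):
    """Simplified memory block with a skip connection from the block below.

    proj        : list of frames (each a list of floats) from this layer's projection
    prev_memory : memory output of the lower block (same shape), or None for the first block
    look_back   : scalar weights for frames t, t - stride, t - 2*stride, ...
    look_ahead  : scalar weights for frames t + stride, t + 2*stride, ...
    """
    num_frames = len(proj)
    output = []
    for t in range(num_frames):
        frame = list(proj[t])  # current projection output

        # Skip connection: add the lower block's memory output unchanged,
        # giving gradients a direct path down to the lower layers.
        if prev_memory is not None:
            frame = [f + m for f, m in zip(frame, prev_memory[t])]

        # Weighted sum over past frames (including the current one at i = 0).
        for i, weight in enumerate(look_back):
            src = t - stride * i
            if src >= 0:
                frame = [f + weight * p for f, p in zip(frame, proj[src])]

        # Weighted sum over future frames.
        for j, weight in enumerate(look_ahead, start=1):
            src = t + stride * j
            if src < num_frames:
                frame = [f + weight * p for f, p in zip(frame, proj[src])]

        output.append(frame)
    return output


# Three one-dimensional frames, first without and then with a lower-block memory.
frames = [[1.0], [2.0], [3.0]]
first = memory_block(frames, None, look_back=[0.5, 0.25], look_ahead=[0.125])
stacked = memory_block(frames, [[10.0], [10.0], [10.0]],
                       look_back=[0.5, 0.25], look_ahead=[0.125])
```

Because the lower block's output is added without any transformation, its contribution passes through the block as is, which is the mechanism the abstract credits with easing vanishing gradients in deep stacks.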