
Research On Optimization Of Deep Learning Model For Acoustic Signal Processing

Posted on: 2019-08-07  Degree: Master  Type: Thesis
Country: China  Candidate: J Lei  Full Text: PDF
GTID: 2428330611493631  Subject: Computer Science and Technology
Abstract/Summary:
As a main information carrier in human activity, acoustic signals have long attracted attention and research. In the Internet of Things era, making machines serve human society better has become a hot topic, and human-computer interaction through acoustic signals has become an active research area. With the rapid development of computing and artificial intelligence, deep learning has become the mainstream approach to acoustic signal processing. The acoustic signals a machine receives come mainly from human voices and the surrounding environment, and current acoustic research focuses on tasks such as automatic speech recognition, phoneme recognition, and acoustic scene classification. This thesis studies acoustic scene classification and phoneme recognition, and addresses several problems in acoustic-signal-based human-computer interaction.

For acoustic scene classification, this thesis proposes a hybrid neural network model with highly aggregated time-frequency acoustic features. We observe that existing models have the following problems when processing time-domain and frequency-domain audio features: 1) single-model structures learn only the time-domain or only the frequency-domain characteristics of the audio; 2) hybrid structures lose or corrupt the original temporal ordering of the audio; 3) hybrid structures do not exploit time-domain and frequency-domain information jointly, so the hybrid model cannot reach its best performance. Based on these observations, this thesis designs an LCNN network structure that avoids losing the original temporal information of the audio, and proposes a time-enhanced multi-channel feature fusion mechanism (MCFF) that lets the hybrid model use time-frequency features more effectively. Combining these two mechanisms yields a new hybrid model, Multi-LCNN, which improves acoustic scene classification accuracy.

For speech phoneme recognition, this thesis proposes a multi-objective sequence-convolution neural network model (SeqCNN). According to how phonemes appear in the audio signal, phoneme recognition models fall into three categories: 1) frame-based models, where each frame contains too little phoneme information and frames near phoneme boundaries are too similar; 2) phoneme-based models, which rely on additional phoneme start and end time annotations; 3) sequence-based models, which learn phonemes with weak semantic information poorly and cannot describe phoneme start and end times. To solve these problems, we design a sequence-convolution network structure that processes sequence data and uses convolutional layers to learn strong phoneme representations. We also propose a weight-sharing multi-objective classifier and its loss function. The resulting SeqCNN model addresses these phoneme recognition problems comprehensively and improves recognition accuracy.
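The abstract does not spell out how MCFF combines the two feature views, only that it fuses time-domain and frequency-domain channels without disturbing the original frame order. A minimal NumPy sketch of that general idea (hypothetical function names, not the thesis code; assumes framing plus an FFT-based frequency view):

```python
import numpy as np

def frame_signal(x, frame_len=256, hop=128):
    """Slice a 1-D waveform into overlapping frames, preserving time order."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]                                  # shape: (n_frames, frame_len)

def time_freq_fusion(x, frame_len=256, hop=128):
    """Toy multi-channel fusion: concatenate a time-domain view and a
    frequency-domain view of each frame along the feature axis, so the
    frame (time) axis - and hence the original ordering - is untouched."""
    frames = frame_signal(x, frame_len, hop)           # time-domain channel
    window = np.hanning(frame_len)
    spec = np.abs(np.fft.rfft(frames * window, axis=1))
    log_spec = np.log1p(spec)                          # frequency-domain channel
    return np.concatenate([frames, log_spec], axis=1)  # (n_frames, frame_len + n_bins)

# Example: one second of a 440 Hz tone at 16 kHz
sr = 16000
t = np.arange(sr) / sr
fused = time_freq_fusion(np.sin(2 * np.pi * 440 * t))
print(fused.shape)   # (124, 385): 124 frames, 256 time + 129 frequency features
```

In a real Multi-LCNN, each channel would feed its own sub-network before fusion; the point here is only that fusing along the feature axis, per frame, keeps the temporal sequence intact for the downstream model.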
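The exact form of SeqCNN's weight-sharing multi-objective loss is not given in the abstract. As a rough NumPy illustration of the idea (hypothetical names and loss weighting; a stand-in for, not the thesis implementation): one weight matrix scores every frame for a frame-level objective and, after pooling over time, the whole sequence for a sequence-level objective, with the two cross-entropies mixed into a single loss.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def multi_objective_loss(features, frame_labels, seq_label, W, alpha=0.5):
    """Weight-sharing multi-objective classifier (toy version): the SAME
    weight matrix W scores every frame (frame-level cross-entropy) and,
    after mean-pooling over time, the whole sequence (sequence-level
    cross-entropy). The two objectives are mixed by alpha."""
    frame_logits = features @ W                        # (T, n_classes), shared W
    frame_probs = softmax(frame_logits)
    frame_loss = -np.mean(np.log(frame_probs[np.arange(len(frame_labels)),
                                             frame_labels] + 1e-12))
    seq_probs = softmax(features.mean(axis=0) @ W)     # same W on pooled input
    seq_loss = -np.log(seq_probs[seq_label] + 1e-12)
    return alpha * frame_loss + (1 - alpha) * seq_loss

T, d, n_classes = 20, 8, 5
features = rng.normal(size=(T, d))      # stand-in for SeqCNN frame features
frame_labels = rng.integers(0, n_classes, size=T)
loss = multi_objective_loss(features, frame_labels, seq_label=2,
                            W=rng.normal(size=(d, n_classes)))
print(loss > 0)   # True: both cross-entropy terms are positive here
```

Sharing W across both objectives forces the same phoneme representation to explain both per-frame timing and sequence-level content, which is one plausible reading of how a single classifier can both describe phoneme start/end behavior and recognize the phoneme sequence.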
Keywords/Search Tags: Deep Learning, Acoustic Signal Processing, Acoustic Scene Classification, Phoneme Recognition