Research On Convolutional Neural Network And Its Usage In Pitch Detection With Noise

Posted on: 2016-10-24; Degree: Master; Type: Thesis
Country: China; Candidate: Z Q Huang; Full Text: PDF
GTID: 2308330461483095; Subject: Computer Science and Technology
Abstract/Summary:
The convolutional neural network (CNN) is a deep learning method that evolved from the neural network (NN). CNNs perform excellently in computer vision and have proven to be state-of-the-art models for image classification. A CNN is essentially a variant of an NN with partial (local) connections and weight sharing. These two properties let a CNN keep a deep hierarchy while greatly reducing its number of parameters, which makes it easier to train to convergence and helps it generalize. We use the back-propagation (BP) algorithm, which is based on gradient descent, to train both NNs and CNNs. In this thesis, we derive rules for error propagation and weight updates from the gradient descent formulas used in NN training. We treat the convolution step as training several parallel sub-NNs that share weights, and the subsampling step as a special case of convolution. Under these assumptions, we carry the NN training rules over to CNN training. We implemented a CNN according to these rules and applied it to handwritten digit recognition, the earliest and most successful application area of CNNs. The results were correct and matched our expectations.

Speech is a basic medium of human communication, and it is increasingly used in human-computer interaction as the mobile internet spreads. Pitch is a very important parameter in speech signal processing. Many methods exist for estimating the pitch of a voice, but most do not work well on noisy speech. In recent years, researchers have proposed Jin and PEFAC as noise-robust pitch estimation models.

In this thesis, we analyze NNs, CNNs, and traditional pitch estimation methods, and then combine a CNN with the autocorrelation function (ACF) to obtain the CNN_ACF_DP model. In our model, the ACF and the CNN each produce probability information about the pitch. Dynamic programming (DP), which captures the short-time stationarity of speech, then derives the pitch contour from the merged ACF and CNN features.

In our experiments, we use voicing decision error (VDE) and detection rate (DR) to evaluate the performance of a pitch detection model.
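
The abstract names VDE and DR without defining them. As a rough sketch only, the two metrics could be computed per frame as below. The definitions here are assumptions based on common usage in pitch-tracking evaluation, not taken from the thesis: a pitch of 0 marks an unvoiced frame, VDE is the share of frames with a wrong voiced/unvoiced decision, and DR is the share of reference-voiced frames whose estimate lies within a relative tolerance (`tol`, a hypothetical parameter set to 5%) of the reference pitch.

```python
def vde(ref_f0, est_f0):
    """Voicing decision error: fraction of frames whose voiced/unvoiced
    decision differs from the reference (assumed: f0 == 0 means unvoiced)."""
    if len(ref_f0) != len(est_f0) or not ref_f0:
        raise ValueError("tracks must be non-empty and of equal length")
    wrong = sum((r > 0) != (e > 0) for r, e in zip(ref_f0, est_f0))
    return wrong / len(ref_f0)


def detection_rate(ref_f0, est_f0, tol=0.05):
    """Detection rate: fraction of reference-voiced frames whose estimated
    pitch is within `tol` (relative) of the reference pitch (assumed rule)."""
    voiced = [(r, e) for r, e in zip(ref_f0, est_f0) if r > 0]
    if not voiced:
        return 0.0
    hits = sum(abs(e - r) <= tol * r for r, e in voiced)
    return hits / len(voiced)


# Toy tracks in Hz, one value per frame (made-up numbers for illustration).
ref = [0.0, 100.0, 102.0, 104.0, 0.0]
est = [0.0, 101.0, 204.0, 0.0, 90.0]
print(vde(ref, est))             # 2 of 5 voicing decisions are wrong
print(detection_rate(ref, est))  # 1 of 3 voiced frames is within 5%
```

In this toy case the octave error in the third frame counts against DR but not against VDE, since the frame is still correctly marked as voiced.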
In our experiments and comparisons, our model achieves results clearly better than Jin and competitive with PEFAC. For random speakers and random noise, our method is slightly inferior to PEFAC, with a 1.34% absolute disadvantage in DR and a 2.3% disadvantage in VDE. For trained speakers, however, our method outperforms PEFAC, with a 0.8% absolute advantage in DR and a 9.2% advantage in VDE.
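
To make the ACF and DP stages of the pipeline more concrete, here is a minimal sketch of autocorrelation scoring followed by a dynamic-programming search for a smooth pitch contour. It is not the CNN_ACF_DP model from the thesis: the CNN scores and the feature merging are left out, and the function names, the `jump_penalty` parameter, and the linear penalty on lag changes are all assumptions made for illustration.

```python
import math


def acf_scores(frame, lags):
    """Normalized autocorrelation of one frame at each candidate lag."""
    energy = sum(x * x for x in frame) or 1.0
    return [
        sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag)) / energy
        for lag in lags
    ]


def track_pitch(score_frames, lags, jump_penalty=0.02):
    """Viterbi-style DP: choose one lag per frame so that the summed scores
    minus jump_penalty * |lag change| between neighbouring frames is maximal."""
    n = len(lags)
    best = list(score_frames[0])
    back = []
    for scores in score_frames[1:]:
        new_best, ptr = [], []
        for j in range(n):
            # Best predecessor for lag j, after paying the transition cost.
            k = max(range(n),
                    key=lambda p: best[p] - jump_penalty * abs(lags[p] - lags[j]))
            new_best.append(best[k] - jump_penalty * abs(lags[k] - lags[j]) + scores[j])
            ptr.append(k)
        best = new_best
        back.append(ptr)
    # Backtrack from the best final state.
    j = max(range(n), key=lambda p: best[p])
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return [lags[j] for j in path]


# Synthetic test signal: a clean 200 Hz sine sampled at 8 kHz, cut into 4 frames.
sample_rate, frame_len = 8000, 400
signal = [math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(4 * frame_len)]
frames = [signal[s:s + frame_len] for s in range(0, len(signal), frame_len)]
lags = list(range(20, 81))  # candidate periods: 100 Hz to 400 Hz
contour = track_pitch([acf_scores(f, lags) for f in frames], lags)
f0 = [sample_rate / lag for lag in contour]
print(f0)
```

The transition penalty is what encodes the short-time stationarity idea: a frame whose own scores slightly favor a doubled period is pulled back to the lag chosen by its neighbours, because jumping there and back costs more than the score it gains.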
Keywords/Search Tags: convolutional neural network, pitch estimation, speech signal processing, back-propagation algorithm