Research And Implementation Of Key Technologies For Automatic Transcription Of Multi-Instrument Music

Posted on:2020-10-31

Degree:Master

Type:Thesis

Country:China

Candidate:C Zeng

Full Text:PDF

GTID:2428330575956531

Subject:Electronic and communication engineering

Abstract/Summary:

Music is a long-standing art form, and research on automatic music transcription has important application value for music information retrieval and related tasks. This research focuses on the automatic transcription of mixed piano and violin audio, which can be regarded as two sub-problems: separation of multi-instrument music and transcription of the separated audio sources.

The study of separating mixed piano and violin sources includes the following work. First, piano and violin recordings were sliced, mixed, and numbered to provide data for the study of music source separation. Second, the performance limitations of frequency-domain music source separation are analyzed, and a time-domain U-Net model is used for separation. Third, dilated convolution is added to the convolutional layers of the time-domain U-Net model, which expands the receptive field without increasing the number of convolution kernel parameters. Finally, building on the U-Net separation model with dilated convolution, different dilation factors are used to obtain a variety of receptive fields, and the resulting multi-scale information is combined to improve the segmentation accuracy.

The single-source transcription experiments include the following work. First, frequency-domain features are used for transcription, including STFT, CQT, and log-Mel frequency features. Second, the transcription model uses two classical neural networks, deep neural networks (DNN) and convolutional neural networks (CNN), and the experiments cover two instruments, the piano and the violin. Third, for the deep neural networks, models with and without Dropout are compared. Although the DNN with Dropout converges more slowly, the gap between its test-set and training-set results is smaller, and its transcription performance is better than that of the DNN without Dropout. Finally, for the convolutional neural networks, a model with fully connected layers is compared with a model in which the fully connected layers are replaced by convolutional layers; transcription performance is better after the replacement.
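
The claim that dilated convolution widens the receptive field without adding kernel parameters can be illustrated with a small sketch. It is not the thesis's model: it assumes single-channel, stride-1, one-dimensional convolutions, and the function names (`receptive_field`, `dilated_conv1d`, `multi_scale`) and the example dilation factors are hypothetical, chosen only for illustration.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in samples) of stacked stride-1 1-D convolutions.

    Each layer with dilation d adds (kernel_size - 1) * d samples.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D dilated convolution (cross-correlation form, no padding)."""
    span = (len(kernel) - 1) * dilation  # extra samples one output looks across
    return [
        sum(w * signal[t + k * dilation] for k, w in enumerate(kernel))
        for t in range(len(signal) - span)
    ]


def multi_scale(signal, kernel, dilations):
    """Apply one kernel at several dilation factors and stack the results.

    Outputs are centre-cropped to the shortest length so that each stacked
    tuple describes the same position in the input (assumes an odd kernel).
    """
    outputs = [dilated_conv1d(signal, kernel, d) for d in dilations]
    n = min(len(o) for o in outputs)
    cropped = []
    for o in outputs:
        start = (len(o) - n) // 2
        cropped.append(o[start:start + n])
    return list(zip(*cropped))


# Three layers of kernel size 3 hold 9 weights in both cases,
# but the dilated stack sees roughly twice as many input samples.
plain_rf = receptive_field(3, [1, 1, 1])     # 7
dilated_rf = receptive_field(3, [1, 2, 4])   # 15

# A difference kernel applied to a ramp at two scales.
ramp = list(range(10))
features = multi_scale(ramp, [1, 0, -1], dilations=[1, 2])
```

In a real separation network each branch would have its own learned multi-channel kernels and the stacked features would feed further layers; the sketch only shows how the dilation factor changes the span of input covered while the number of weights stays fixed.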

Keywords/Search Tags:

Mixed Source, Source Separation, U-Net, Deep Neural Networks, Convolutional Neural Networks

Related items

1	Research On Separation Of Audio Signal Based On Deep Neural Network
2	Convolution Model Based Sound Source Separation Algorithm Research Under High Reverberation
3	Study On The Single Channel Source Separation Of Singing Voice Music
4	Underdetermined Speech Separation Based On Sparse Representation And Deep Learning
5	Algorithm Research Of Voice Signal's Blind Source Separation
6	Research On Blind Source Separation Of Nonlinear Mixed Signals
7	Research On Music Source Separation Algorithm Based On Deep Convolutional Neural Network And Its Application
8	Research On Image Big Data Classification Based On MapReduce And Convolution Neural Network
9	Study On Blind Source Separation With Dynamically Changing Source Number
10	Research On Blind Source Separation Algorithm Of Convolutive Mixture