Research And Implementation Of Key Technologies For Automatic Transcription Of Multi-Instrument Music

Posted on: 2020-10-31
Degree: Master
Type: Thesis
Country: China
Candidate: C Zeng
GTID: 2428330575956531
Subject: Electronic and communication engineering
Abstract/Summary:
Music is a long-standing art form, and research on automatic music transcription has important application value for music information retrieval and related tasks. This research focuses on the automatic transcription of mixed audio containing piano and violin, which can be divided into two sub-problems: separating the multi-instrument music into its sources, and transcribing each separated source.

The separation study on mixed piano and violin sources includes the following work. First, piano and violin recordings were sliced, mixed, and numbered to provide data for the study of music source separation. Second, the performance limitations of frequency-domain music source separation are analyzed, and a time-domain U-Net model is used for separation instead. Third, dilated convolution is added to the convolutional layers of the time-domain U-Net, which expands the receptive field without increasing the number of convolution kernel parameters. Finally, building on the U-Net separation model with dilated convolution, different dilation factors are used to obtain several receptive fields, and the resulting multi-scale information is combined to improve separation accuracy.

The single-source transcription experiments include the following work. First, frequency-domain features are used for transcription, including STFT, CQT, and log-Mel features. Second, the transcription model uses two classical neural networks, a deep neural network (DNN) and a convolutional neural network (CNN), and the experiments cover two instruments, piano and violin. Third, for the DNN, models with and without Dropout are compared. Although the DNN with Dropout converges more slowly, the gap between its test-set and training-set results is smaller, and its transcription results are better than those of the DNN without Dropout. Finally, for the CNN, a model with fully connected layers is compared with a model in which the fully connected layers are replaced by convolutional layers; transcription results are better after the replacement.
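
The dilated-convolution idea in the separation study can be illustrated with a small sketch. It is not the thesis's model: the function names (`dilated_conv1d`, `receptive_field`, `multi_scale_features`), the 3-tap kernel, and the dilation factors 1, 2, 4 are all assumptions chosen for illustration, and an odd kernel size is assumed so that the branches align. The sketch shows that spacing the kernel taps further apart widens the receptive field while the number of kernel weights stays the same, and that parallel branches with different dilation factors can be stacked into one multi-scale feature per time step.

```python
from typing import List, Sequence


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float],
                   dilation: int = 1) -> List[float]:
    """'Valid' 1-D correlation with kernel taps spaced `dilation` samples apart."""
    span = (len(kernel) - 1) * dilation + 1  # samples covered by one output value
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field of stacked stride-1 layers: 1 + sum((k - 1) * d)."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def multi_scale_features(signal: Sequence[float], kernel: Sequence[float],
                         dilations: Sequence[int]) -> List[List[float]]:
    """Run parallel branches with different dilations and stack them per time step.

    Each branch is centre-cropped to the shortest branch so that the values
    stacked together refer to the same position in the input (odd kernel size
    assumed).
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    n = min(len(b) for b in branches)
    cropped = []
    for b in branches:
        offset = (len(b) - n) // 2
        cropped.append(b[offset:offset + n])
    return [list(frame) for frame in zip(*cropped)]


# Hypothetical toy input: a ramp, filtered with a 3-tap difference kernel.
signal = [float(i) for i in range(16)]
kernel = [1.0, 0.0, -1.0]  # three weights, whatever the dilation

plain_rf = receptive_field(3, [1, 1, 1])    # three ordinary layers
dilated_rf = receptive_field(3, [1, 2, 4])  # same weight count, wider view
features = multi_scale_features(signal, kernel, [1, 2, 4])
```

With these toy settings, three stacked layers cover 7 input samples without dilation and 15 with dilation factors 1, 2, 4, using the same three weights per layer. In a real network the branches would be learned convolutional layers and the stacked features would feed later layers.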
Keywords/Search Tags: Mixed Source, Source Separation, U-Net, Deep Neural Networks, Convolutional Neural Network