
Musical Instrument Identification Based on Deep Learning and Timbre Analysis

Posted on: 2019-11-08
Degree: Master
Type: Thesis
Country: China
Candidate: F Wang
Full Text: PDF
GTID: 2405330548476160
Subject: Signal and Information Processing
Abstract/Summary:
Musical instrument identification is a branch of Music Information Retrieval (MIR) that supports automatic annotation, music classification, and emotion recognition. Analysis of time- or frequency-domain features, cepstral features, sparse features, and probabilistic features shows that timbre is a joint time-frequency property. Instrument identification therefore requires an efficient timbre representation, and deep learning methods are proposed to extract high-level time-frequency information of timbre layer by layer. This thesis analyses traditional timbre features and investigates the application of deep learning to instrument identification and to extracting high-level representations of timbre. The main research contributions are as follows:

1. To address the poor performance of time- or frequency-domain, cepstral, sparse, and probabilistic features on kindred (same-family) and percussion instruments, an enhanced model that extracts time-frequency information with lower redundancy is proposed. First, the music signal is filtered by a cochlear model whose output, the Auditory Spectrum (AS), contains harmonic information and is close to human perception. Second, time-frequency features are obtained by Multiscale Time-Frequency Modulation (MTFM). Then, dimensionality is reduced with Multilinear Principal Component Analysis (MPCA), which preserves the tensor structure and intrinsic correlations. Finally, classification is performed with a Support Vector Machine (SVM). Experiments show that MTFM reaches an average accuracy of 92.7% on the IOWA database and a lower error rate on percussion and kindred instruments than the features mentioned above. Because MPCA also yields higher accuracy than Principal Component Analysis (PCA), the proposed model is a viable option for kindred and percussion instrument identification. (A pipeline sketch follows this abstract.)

2. To overcome traditional identification methods' reliance on elementary acoustic features and manual feature selection, a deep neural network that extracts a high-level time-frequency representation of timbre layer by layer is proposed. Its input is the Auditory Spectrum, which contains harmonic information, has low redundancy, and is close to human perception. Combining the merits of Deep Belief Nets (DBN) and Stacked Denoising Autoencoders (SDA), a mixed deep neural network is built from these two modules. Experiments show that the proposed model reaches 97.22% accuracy on the IOWA database, with a lower error rate on percussion and kindred instruments than Mel-Frequency Cepstral Coefficients (MFCC) and the spectrogram under the same network structure. (A sketch of the stacked-autoencoder module appears below.)

3. To address the same dependence on feature selection and elementary acoustic features, a 5-layer Convolutional Neural Network (CNN) that extracts high-level time-frequency information of timbre layer by layer is proposed, again taking the Auditory Spectrum as input. The single ("mono") convolution kernel of the first layer is replaced by multi-scale kernels along the time and frequency axes to extract time-frequency information from the AS more effectively. Experiments show that the proposed model reaches 96.9% accuracy on the IOWA database, with a lower error rate on percussion and kindred instruments than MFCC and the spectrogram; the AS improves accuracy by 9.1% and 3.54% over MFCC and the spectrogram respectively, and the multi-scale kernels achieve higher accuracy than a mono kernel. (A multi-scale front-end sketch follows.)
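The following is a minimal Python sketch of the pipeline in contribution 1, for orientation only. The cochlear front-end is replaced by placeholder data, MTFM is approximated by block-pooling the 2-D Fourier modulation spectrum of the auditory spectrum, and MPCA is approximated by ordinary PCA on flattened features (the thesis uses true multilinear PCA to preserve tensor structure). All function names, shapes, and parameters here are assumptions, not the thesis' implementation.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def mtfm_features(auditory_spectrum, n_scales=8, n_rates=8):
    """Crude stand-in for Multiscale Time-Frequency Modulation:
    2-D FFT magnitude of the (channels x frames) auditory spectrum,
    block-pooled onto a coarse scale-by-rate (spectral-by-temporal
    modulation) grid."""
    mod = np.abs(np.fft.fft2(auditory_spectrum))
    half = mod[: mod.shape[0] // 2, : mod.shape[1] // 2]  # low-modulation quadrant
    s, r = half.shape[0] // n_scales, half.shape[1] // n_rates
    half = half[: s * n_scales, : r * n_rates]            # crop to a divisible size
    pooled = half.reshape(n_scales, s, n_rates, r).mean(axis=(1, 3))
    return pooled.ravel()

# Placeholder data: in the thesis, each sample would be the Auditory
# Spectrum of one IOWA note, produced by the cochlear filterbank.
rng = np.random.default_rng(0)
X = np.stack([mtfm_features(rng.random((128, 256))) for _ in range(40)])
y = rng.integers(0, 4, size=40)  # 4 dummy instrument classes

# PCA here only approximates the thesis' MPCA dimensionality reduction.
clf = make_pipeline(PCA(n_components=16), SVC(kernel="rbf"))
clf.fit(X, y)
print("training accuracy:", clf.score(X, y))
```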
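The abstract describes the hybrid DBN/SDA network of contribution 2 only at a high level. The PyTorch sketch below shows one plausible reading of the SDA half: denoising autoencoder layers pretrained greedily, then stacked under a classifier for fine-tuning. Layer sizes, the noise level, and the training schedule are assumptions, and the DBN module (RBM pretraining) is omitted.

```python
import torch
import torch.nn as nn

class DenoisingAE(nn.Module):
    """One SDA layer: reconstruct the clean input from a corrupted copy."""
    def __init__(self, d_in, d_hid, noise=0.2):
        super().__init__()
        self.noise = noise
        self.enc = nn.Sequential(nn.Linear(d_in, d_hid), nn.Sigmoid())
        self.dec = nn.Linear(d_hid, d_in)

    def forward(self, x):
        corrupted = x * (torch.rand_like(x) > self.noise).float()  # masking noise
        return self.dec(self.enc(corrupted))

def pretrain(ae, data, epochs=20, lr=1e-3):
    """Greedy unsupervised pretraining of a single layer."""
    opt = torch.optim.Adam(ae.parameters(), lr=lr)
    for _ in range(epochs):
        loss = nn.functional.mse_loss(ae(data), data)
        opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        return ae.enc(data)  # clean hidden codes feed the next layer

# Dummy auditory-spectrum vectors; all sizes are illustrative guesses.
x = torch.rand(64, 512)
sizes, encoders, h = [512, 256, 128], [], x
for d_in, d_hid in zip(sizes[:-1], sizes[1:]):
    ae = DenoisingAE(d_in, d_hid)
    h = pretrain(ae, h)
    encoders.append(ae.enc)

# Stack the pretrained encoders with a classifier head and fine-tune
# end to end on instrument labels (fine-tuning loop omitted).
model = nn.Sequential(*encoders, nn.Linear(sizes[-1], 10))
print(model(x).shape)  # (64, 10) class logits
```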
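For the multi-scale first convolution layer of contribution 3, the sketch below replaces a single kernel shape with parallel branches elongated along the frequency and time axes of the Auditory Spectrum and concatenates their feature maps; the remaining layers are filled in generically. The kernel shapes and channel counts are illustrative assumptions, not the thesis' exact 5-layer configuration.

```python
import torch
import torch.nn as nn

class MultiScaleFront(nn.Module):
    """First CNN layer with multi-scale kernels along the frequency
    and time axes, replacing a single ('mono') kernel shape."""
    def __init__(self, out_per_branch=16):
        super().__init__()
        shapes = [(3, 3), (7, 1), (1, 7), (5, 5)]  # (freq, time); guesses
        self.branches = nn.ModuleList(
            nn.Conv2d(1, out_per_branch, k, padding=(k[0] // 2, k[1] // 2))
            for k in shapes
        )

    def forward(self, x):  # x: (batch, 1, freq_channels, frames)
        return torch.cat([b(x) for b in self.branches], dim=1)

# A generic stand-in for the rest of the network, ending in class logits.
net = nn.Sequential(
    MultiScaleFront(16), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10),
)
print(net(torch.rand(2, 1, 128, 256)).shape)  # (2, 10) class logits
```

Concatenating branches rather than summing them lets later layers weigh temporal-modulation and spectral-envelope cues independently, which is the stated motivation for replacing the mono kernel.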
Keywords/Search Tags:musical instrument identification, timbre analysis, Multiscale Time-FrequencyModulation(MTFM), mixed deep neural work, Convolutional Neural Networks(CNNs)