Research On Musical Style Transfer Based On Generative Adversarial Network

Posted on: 2022-09-23
Degree: Master
Type: Thesis
Country: China
Candidate: L R Xie
Full Text: PDF
GTID: 2518306350491814
Subject: Master of Engineering
Abstract/Summary:
In recent years, artificial intelligence has changed from a specialist term into a topic of everyday conversation, and the Generative Adversarial Network (GAN) is one of the most popular artificial intelligence techniques of this period. GANs can produce output that passes for human work in fields such as image generation and style transfer, creating works that look real. Inspired by GAN-based image style transfer, this thesis applies a GAN to music style transfer.

The thesis first explains the concept of music style, which it divides into timbre, performance style, and composition style, and then studies the style transfer of timbre. Timbre is difficult to capture and to quantify, and music data changes over time; together these are the main difficulty of the study. The solution adopted is to turn the intangible into the tangible: compute a spectrogram of the audio, perform image-to-image style transfer on it, and then restore the resulting image to audio. Two representative time-frequency analysis algorithms are used: the short-time Fourier transform (STFT), one of the most common time-frequency analysis methods, and the constant-Q transform (CQT), which is considered particularly well suited to music data. Because the quality of timbre transfer cannot be compared quantitatively, the two methods are compared at the end by subjective listening.

Among the many GAN variants, the thesis uses a modified version of CycleGAN, which is also the variant most commonly used for image style transfer. The modification replaces the residual network (ResNet) structure in the CycleGAN converter with DenseNet. In the experiments, the modified model performs better than the original CycleGAN. The thesis does not examine the underlying mathematics or derivation in depth; it only reports this observation.

For the music data, solo violin and solo flute recordings are chosen because each has a single timbre with distinct features. This choice reduces the difficulty of the experiment and controls the variables, which makes before-and-after comparison easier. WaveNet is the main tool used to restore audio from the spectrogram. The results with CQT are slightly better than those with STFT, but obvious flaws remain: the synthesized music is not smooth or natural, and many dissonant sounds are clearly audible. Judged only on the transfer of instrument timbre, however, the experiment is quite successful. Although it has no practical application scenario yet, further optimization of generative adversarial network models is expected to make real and generated music as hard to tell apart as real and generated images are in image style transfer.
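The difference between the two time-frequency representations named in the abstract can be illustrated with a small sketch. It is not taken from the thesis: the function names, the Hann window, and parameters such as n_fft, hop, f_min and bins_per_octave are illustrative assumptions, and the transform is written as a naive DFT in pure Python for readability rather than speed. The sketch shows that STFT bins are spaced linearly in frequency, while CQT centre frequencies are spaced geometrically, so that every bin has the same ratio Q of centre frequency to bandwidth and each octave gets the same number of bins.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def stft_magnitude(signal, n_fft, hop):
    """Magnitude STFT: one list of n_fft // 2 + 1 bin magnitudes per frame.

    Naive O(n_fft^2) DFT per frame; a real pipeline would use an FFT library.
    """
    window = hann(n_fft)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(n_fft)]
        mags = []
        for k in range(n_fft // 2 + 1):
            acc = sum(
                frame[i] * cmath.exp(-2j * math.pi * k * i / n_fft)
                for i in range(n_fft)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def stft_bin_frequencies(sample_rate, n_fft):
    """Linearly spaced STFT bin frequencies in Hz: f_k = k * sample_rate / n_fft."""
    return [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]


def cqt_bin_frequencies(f_min, n_bins, bins_per_octave=12):
    """Geometrically spaced CQT centre frequencies: f_k = f_min * 2 ** (k / B)."""
    return [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


def cqt_quality_factor(bins_per_octave=12):
    """Q = f_k / (f_{k+1} - f_k), identical for every bin."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
```

With 12 bins per octave, the CQT bins line up with the semitones of the equal-tempered scale (bin 12 above a 440 Hz starting bin sits at 880 Hz), which is one common reason given for preferring CQT on music. The resulting frames-by-bins magnitude array is the kind of image that an image-to-image model could take as input.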
Keywords/Search Tags:Style transfer, GAN, STFT, CQT, DenseNet