
Contextual Coherence Multi-track Music Generation Based On Generative Adversarial Network

Posted on: 2022-01-16    Degree: Master    Type: Thesis
Country: China    Candidate: H B Shi    Full Text: PDF
GTID: 2518306485959459    Subject: Computer technology
Abstract/Summary:
Music is an essential element of entertainment, capable of arousing physical and emotional resonance. With the rapid development of the mobile Internet, demand for music has grown dramatically, and traditional human composition alone can no longer keep pace. On network platforms, digital music spreads quickly as a medium, and its market prospects are broad. As deep learning technology continues to advance, how to use it to create music has become a hot topic among scholars. Researchers have studied computer music generation since the last century, moving from the early splicing of real music fragments, to traditional probabilistic generation methods, to deep-learning-based generation. The generative adversarial network (GAN) is the most popular generative model of recent years; introducing it into music generation has advanced the field, but it has also brought new problems and challenges. This dissertation proposes improvements to address two such problems: the music samples generated by a GAN are mutually independent, lacking contextual correlation, and they contain excessive noise.

1. Building on the multi-track music generation model MuseGAN, this dissertation proposes a new multi-track model, the recurrent feature generative adversarial network (RFGAN). Exploiting the temporal correlation and structural repetition of music, a new time-sequence scheme strengthens the contextual correlation of generated samples along the time axis. Under this scheme, the generator is transformed from its original unidirectional structure into a recurrent one: a feature extractor captures feature information from the previous round of generation, and that information, spliced with random noise, is fed into the next round. This also preserves the repetitive structure of music. To reduce the excessive noise in generated samples, an average pooling layer is added at the end of the generator. Compared with the original model, the improved model shows better contextual relevance in the evaluation of its musical results.

2. After a detailed analysis of the basic parameters of the candidate datasets, this dissertation focuses on the music formats of the mainstream music datasets and, to match the needs of the improved model, selects a multi-track dataset in pianoroll format. To address the poor contextual cohesion of generated samples, a dedicated bar-level preprocessing step is applied to the Lakh Pianoroll dataset: drawing on the similarity between music and speech along the time dimension, adjacent bars are made to partially overlap when the data are segmented into bars, so that the samples generated after training exhibit strong contextual relevance.

3. Given the particular difficulty of evaluating music, this dissertation combines subjective evaluation with objective metrics to compare the contextual relevance, musical authenticity, and melody of samples generated by the RFGAN and MuseGAN models. The results show that RFGAN is effective with respect to the contextual relevance of musical phrases, and its objective metrics also improve over the original model.
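The overlapping bar segmentation described in point 2 can be sketched as follows. The bar length, overlap size, and pianoroll shape here are illustrative assumptions, not values taken from the dissertation:

```python
import numpy as np

def segment_with_overlap(pianoroll, bar_len, overlap):
    """Split a (time, pitch) pianoroll into bars whose edges overlap,
    so consecutive training samples share some musical context."""
    step = bar_len - overlap  # advance by less than a full bar
    starts = range(0, pianoroll.shape[0] - bar_len + 1, step)
    return np.stack([pianoroll[s:s + bar_len] for s in starts])

# hypothetical pianoroll: 96 time steps x 128 MIDI pitches
roll = np.zeros((96, 128))
segs = segment_with_overlap(roll, bar_len=48, overlap=12)
```

With these toy numbers, each segment begins 36 steps after the previous one, so the last 12 steps of one bar reappear as the first 12 steps of the next.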
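The recurrent generation loop of point 1, in which features extracted from the previous round are spliced with fresh noise before the next round and the output is smoothed by average pooling, might look like the following minimal NumPy sketch. All shapes, the tanh maps, and the fixed random "weights" are placeholders standing in for a trained network, not the dissertation's actual RFGAN architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT, FEAT, STEPS, PITCHES = 32, 16, 48, 84  # hypothetical sizes

# fixed random matrices standing in for trained generator/extractor weights
W = rng.standard_normal((LATENT + FEAT, STEPS * PITCHES)) * 0.1
V = rng.standard_normal((STEPS * PITCHES, FEAT)) * 0.1

def generate_bar(z, prev_feat):
    """One round: splice noise with the previous round's features, emit a bar."""
    x = np.concatenate([z, prev_feat])
    return np.tanh(x @ W).reshape(STEPS, PITCHES)

def extract_features(bar):
    """Feature extractor summarising the previous round's output."""
    return np.tanh(bar.reshape(-1) @ V)

def smooth(bar, k=3):
    """Average pooling along the time axis to suppress isolated noisy notes."""
    kernel = np.ones(k) / k
    return np.apply_along_axis(
        lambda col: np.convolve(col, kernel, mode="same"), 0, bar)

feat = np.zeros(FEAT)          # no context before the first round
bars = []
for _ in range(4):             # generate four contextually linked bars
    z = rng.standard_normal(LATENT)
    bar = smooth(generate_bar(z, feat))
    feat = extract_features(bar)  # carry context into the next round
    bars.append(bar)
song = np.concatenate(bars, axis=0)
```

The point of the sketch is the data flow: each round's input depends on the previous round's output, so successive bars are correlated rather than independent draws from noise.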
Keywords/Search Tags:Contextual, RFGAN, Multi-track music generation, Generative Adversarial Network