Music generation means using computers to create music through an algorithm or process with minimal manual intervention. It is an important research interest in the field of artificial intelligence. Music generation based on deep learning uses end-to-end learning, which reduces manual intervention and has become the mainstream approach in the field. Such models use convolution to extract features along the time dimension of note sequences. On the one hand, to obtain long-time-scale features the model needs multiple layers of convolution and pooling, which can easily lose local features during the pooling process. On the other hand, during generation the model produces music cyclically by feeding previously generated music back as input, which allows errors to accumulate and degrades long-term music generation. To address these issues, this paper combines Convolutional Neural Networks with Generative Adversarial Networks to study music generation. The main research contents and innovations of this paper are as follows:

(1) An end-to-end music generation model, MIDI-GAN, is proposed by combining Wasserstein-GAN with a dilated convolutional neural network. The model takes symbolic music as its data source and introduces dilated convolution. MIDI-GAN obtains long-time features through multiple layers of dilated convolution, while dilated convolutions with a sawtooth dilation-rate schedule and residual connections capture the local features of the music, so that MIDI-GAN can model multi-scale melody features. A model framework is given, and music generation experiments verify that the model converges faster than RNN or GAN networks during training and that the music samples it generates are pleasant and smooth.

(2) The LM-VGAN model is proposed to solve the problem that music generation models are prone to error accumulation during cyclic generation, which makes long-term music generation ineffective. LM-VGAN introduces a long-term memory structure that encodes more musical information by taking a longer music sequence as input and assigning different weights to its parts. Although the long-term memory structure enriches the encoded information, it increases the complexity of the input data. Therefore, a gated feature selection network is proposed: a learnable feature selection mechanism, built from convolutions, filters out the more important features of the music data for the model's self-encoding network. In addition, a feature consistency loss is introduced to improve generation quality, and spectral normalization is applied to all convolution kernels of the discriminator to stabilize training.

(3) LM-VGAN is evaluated on Nottingham, a folk piano music dataset, using a set of objective evaluation indexes and subjective evaluation scores. The results of ablation and comparative experiments show that, compared with other models, LM-VGAN reconstructs music more accurately and has clear advantages in music generation tasks of different lengths. In addition, compared with current mainstream music generation models, LM-VGAN performs better on objective evaluation indexes such as Scale Consistency, Uniqueness, Repetition, and Tone Span. Finally, a human evaluation was carried out through the scoring of music professionals, and LM-VGAN obtained a higher subjective score. These experiments verify the effectiveness of the proposed method and its potential for other sequence generation tasks.
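The multi-scale design in contribution (1) can be illustrated with a small numpy sketch (an illustration, not the thesis implementation): a 1-D dilated convolution whose taps are spaced `dilation` steps apart, stacked with a sawtooth dilation schedule. The schedule `[1, 2, 4, 1, 2, 4]` and kernel size 3 are assumed values for illustration; the receptive-field formula `1 + sum((k-1)*d)` shows how the stack reaches long time scales while the repeated low rates revisit local detail.

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """Valid 1-D dilated convolution: taps are spaced `dilation` steps apart."""
    k = len(w)
    span = (k - 1) * dilation            # input span covered by one output step
    out = np.zeros(len(x) - span)
    for t in range(len(out)):
        out[t] = sum(w[i] * x[t + i * dilation] for i in range(k))
    return out

# Sawtooth dilation schedule: rates cycle (1, 2, 4, 1, 2, 4) rather than
# growing monotonically, so each cycle re-attends to local structure.
rates = [1, 2, 4, 1, 2, 4]
kernel_size = 3

# Receptive field of the whole stack: 1 + sum((k - 1) * d) over all layers.
rf = 1 + sum((kernel_size - 1) * d for d in rates)
print(rf)  # 29: one output step sees 29 input steps
```

With a monotonic schedule of the same depth the receptive field would be larger but dominated by coarse scales; the sawtooth trade-off keeps both long-range and local melody features reachable, which is the point of contribution (1).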
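The gated feature selection of contribution (2) can be sketched as an element-wise learned gate (a hypothetical reduction for illustration: the thesis uses a convolutional gate branch, which is collapsed here to a per-channel linear map named `gate_weights`): a sigmoid turns learned scores into soft keep/discard weights in (0, 1) that attenuate less important channels.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_select(features, gate_weights):
    """Soft feature selection: a learned branch scores each channel and a
    sigmoid gate in (0, 1) scales the features, passing important ones
    through nearly unchanged and suppressing the rest."""
    scores = features @ gate_weights     # hypothetical learned gate branch
    gate = sigmoid(scores)
    return gate * features

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 6))          # 4 time steps, 6 feature channels
Wg = rng.standard_normal((6, 6))         # gate parameters (learned in practice)
y = gated_select(x, Wg)
print(y.shape)  # (4, 6): same shape, selectively attenuated
```

Because the gate is differentiable, the selection mechanism is trained end to end with the rest of the self-encoding network instead of relying on a hand-crafted filter.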
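The spectral normalization applied to the discriminator in contribution (2) divides each kernel (flattened to a matrix) by its largest singular value, which keeps the discriminator roughly 1-Lipschitz and stabilizes WGAN-style training. A minimal numpy sketch of the standard power-iteration estimate (illustrative, not the thesis code):

```python
import numpy as np

def spectral_normalize(W, n_iter=30):
    """Scale W by its largest singular value, estimated by power iteration."""
    rng = np.random.default_rng(0)
    u = rng.standard_normal(W.shape[0])
    for _ in range(n_iter):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    sigma = u @ W @ v                    # estimated largest singular value
    return W / sigma

W = np.random.default_rng(1).standard_normal((8, 16))
W_sn = spectral_normalize(W)
print(np.linalg.norm(W_sn, 2))  # spectral norm is ~1.0 after normalization
```

In a real model this normalization is re-applied at every training step (one or two power iterations suffice, since `u` and `v` are warm-started from the previous step), so the constraint tracks the weights as they change.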