Research On Intelligent Multimedia Information Generation Of Musics And Images

Posted on: 2022-03-24
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhang
Full Text: PDF
GTID: 2518306524493174
Subject: Master of Engineering
Abstract/Summary:
Learning feature representations with end-to-end deep convolutional neural networks, and using control variables to steer the attributes of the generated information, is an important direction in information generation research. Cascading control variables with the input information or feature maps can achieve control over specific attributes, but the difference in information content and the dimensional mismatch between the control variables and the input or feature maps make such control difficult. This thesis therefore analyzes how control variables are introduced, and makes the following contributions:

(1) A control information introduction method that achieves attribute control of the generated information by introducing control variables. The method first applies a six-layer fully connected network to transform the control variables nonlinearly, so that their number of channels matches the number of feature channels extracted by the network, which resolves the dimensional mismatch. Adaptive instance normalization then aligns, channel by channel, the mean and variance of the features with those of the transformed control variables, so that control variables and features are combined effectively.

(2) An expression-controllable image generation algorithm. Built on the proposed control information introduction method, the algorithm controls the expression attributes of anime-style face images through a facial muscle action unit vector. Because the dataset lacks images of the same face with different expressions, a reconstruction loss is added to the loss function to avoid losing face identity and other detail when the action unit vector is used to control expression. Experiments demonstrate that the proposed method controls expression attributes effectively through the facial muscle action unit vector.

(3) A music generation algorithm based on active learning. The algorithm uses the proposed control information introduction method to control the timbre of the generated music through an instrument category vector. Because score features and timbre features are difficult to extract directly from time-series data, the short-time Fourier transform is used to obtain the magnitude spectrogram of each audio file, which turns the time series into an image-like representation. The timbre of the generated magnitude spectrogram is then controlled through the instrument category vector. Finally, the timbre-transformed magnitude spectrogram is converted back into an audio file by the inverse short-time Fourier transform, which achieves timbre control during music generation. In addition, the quality of music after timbre translation cannot be evaluated without audio files that share the same score but are played on different instruments. To address this, the thesis uses the Classical Piano MIDI database and the large-scale note database NSynth to build a music audio dataset in which the same scores are played on different instruments. Experiments demonstrate that the proposed method controls timbre attributes effectively through the instrument category vector.
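
As an illustration of contribution (1), the following is a minimal NumPy sketch of injecting a control vector into convolutional features through a fully connected network followed by adaptive instance normalization. It is not the thesis implementation. The function names, the layer widths, the leaky-ReLU activation, and the choice to let the network output a per-channel scale and shift (2 x C values) are all assumptions made for this example; the abstract only states that a six-layer fully connected network matches the channel count and that adaptive instance normalization aligns mean and variance.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(in_dim, hidden_dim, out_dim, n_layers=6):
    """Random, untrained weights for an n-layer fully connected network (illustration only)."""
    dims = [in_dim] + [hidden_dim] * (n_layers - 1) + [out_dim]
    return [(rng.normal(0.0, np.sqrt(2.0 / a), size=(a, b)), np.zeros(b))
            for a, b in zip(dims[:-1], dims[1:])]

def mlp_forward(params, control):
    """Apply the fully connected layers, with a leaky ReLU between layers (assumed activation)."""
    h = control
    for i, (w, b) in enumerate(params):
        h = h @ w + b
        if i < len(params) - 1:
            h = np.where(h > 0, h, 0.2 * h)
    return h

def adain(features, scale, shift, eps=1e-5):
    """Adaptive instance normalization.

    features: (N, C, H, W); scale and shift: (N, C).
    Each channel of each sample is normalized to zero mean and unit variance,
    then rescaled and shifted with the control-derived statistics.
    """
    mean = features.mean(axis=(2, 3), keepdims=True)
    std = features.std(axis=(2, 3), keepdims=True)
    normalized = (features - mean) / (std + eps)
    return scale[:, :, None, None] * normalized + shift[:, :, None, None]

def inject_control(features, control, params):
    """Map the control vector to 2*C values and use them as per-channel scale and shift.

    Assumption: the split into scale and shift follows the common AdaIN usage;
    the thesis may derive the target statistics differently.
    """
    n_channels = features.shape[1]
    style = mlp_forward(params, control)      # shape (N, 2*C)
    scale = 1.0 + style[:, :n_channels]       # centered on 1 so a zero style leaves plain instance norm
    shift = style[:, n_channels:]
    return adain(features, scale, shift)
```

For example, with feature maps of shape (N, 8, H, W) and a hypothetical 17-dimensional action unit vector, `init_mlp(17, 32, 16)` builds the six layers and `inject_control(features, control, params)` returns features of the same shape whose per-channel statistics are set by the control vector.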
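
The audio pipeline in contribution (3) can be sketched in the same spirit: short-time Fourier transform, a transformation of the magnitude spectrogram, then the inverse transform. The sketch below is an assumption-laden illustration, not the thesis code. The window type, `n_fft`, `hop`, and all function names are chosen for the example, `transform_magnitude` is a hypothetical placeholder for the trained generator, and reusing the phase of the input signal for reconstruction is an assumption, since the abstract does not say how phase is recovered.

```python
import numpy as np

def stft(signal, n_fft=512, hop=128):
    """Hann-windowed short-time Fourier transform; returns (frames, n_fft // 2 + 1) complex values."""
    window = np.hanning(n_fft)
    padded = np.pad(signal, n_fft // 2, mode="reflect")   # center frames on the original samples
    n_frames = 1 + (len(padded) - n_fft) // hop
    frames = np.stack([padded[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, n_fft=512, hop=128):
    """Inverse STFT by weighted overlap-add, normalized by the summed squared window."""
    window = np.hanning(n_fft)
    frames = np.fft.irfft(spec, n=n_fft, axis=1) * window
    length = n_fft + hop * (len(frames) - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame
        norm[i * hop:i * hop + n_fft] += window ** 2
    out /= np.maximum(norm, 1e-8)
    return out[n_fft // 2:length - n_fft // 2]            # drop the reflect padding

def convert_timbre(signal, transform_magnitude, n_fft=512, hop=128):
    """Audio -> magnitude spectrogram -> transformed magnitude -> audio.

    transform_magnitude is a hypothetical stand-in for the timbre generator.
    Assumption: the phase of the input is reused for the inverse transform.
    """
    spec = stft(signal, n_fft, hop)
    magnitude, phase = np.abs(spec), np.angle(spec)
    new_magnitude = transform_magnitude(magnitude)
    return istft(new_magnitude * np.exp(1j * phase), n_fft, hop)
```

Calling `convert_timbre(audio, lambda m: m)` with an identity transform should return the input signal up to floating-point error when the signal length is a multiple of `hop`, which is a convenient sanity check before a learned transformation is substituted.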
Keywords/Search Tags: Attribute Control, Expression Generation, Timbre Translation, Generative Adversarial Networks, Active Learning