| In the financial field, analysts must write research reports, such as macro research reports and event research reports, immediately after breaking news or financial data is released. This task is tedious and urgent, so automatically generating research reports from financial news has become an important research topic. The task poses two major difficulties: financial news contains insufficient information, and the output is long text. To address the insufficient information, this thesis designs a neural network model based on a teacher-student structure. At the paragraph level, the student model in knowledge distillation consists of two variational auto-encoders. When the first variational auto-encoder is trained, its encoder takes both the financial news and the research report as input, so that its hidden state distribution can reconstruct the research report. This hidden state distribution then serves as the prior distribution of the second variational auto-encoder, so that the second variational auto-encoder can reconstruct the research report even when only the financial news is given as input. At the word level, the BERT-based teacher model is pre-trained with a conditional mask mechanism. During knowledge distillation, the student model's output probability distribution is trained to approximate the teacher model's output probability distribution at the word level, so that the student model can effectively learn the teacher model's background knowledge. Together, these two strategies compensate for the insufficient information. To address long text generation, this thesis proposes a variational auto-encoder based on pre-trained models, with BERT as the encoder and GPT-2 as the decoder. Using pre-trained models improves the coherence and readability of the generated text. To construct the experimental data set, this thesis crawls mainstream financial portals such as Sina Finance, Radish Investment Research, and Oriental Fortune. During data processing, the financial news set is extracted according to the characteristics of the data set, and each item is manually screened. To verify the model's performance, this thesis conducts extensive experiments, including model ablation experiments, parameter sensitivity experiments, and model performance comparisons, using the widely used DISTINCT and ROUGE metrics for evaluation. The experimental results show that the proposed method significantly outperforms the baseline models and advanced long text generation models, which demonstrates its effectiveness. |
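
For readers unfamiliar with the DISTINCT metric, it is commonly computed as the number of unique n-grams divided by the total number of n-grams in the generated text, so higher values indicate more diverse output. The following is a minimal sketch under stated assumptions, not the thesis's evaluation code: the function name `distinct_n`, the corpus-level aggregation, and the whitespace tokenization of the toy sentences are all illustrative choices (Chinese text would first need a word segmenter).

```python
from collections import Counter


def distinct_n(texts, n):
    """Corpus-level Distinct-n: unique n-grams / total n-grams.

    `texts` is a list of token lists. Corpus-level aggregation is an
    assumption; some works average per-sentence scores instead.
    """
    ngrams = Counter()
    for tokens in texts:
        # Slide a window of width n over each token sequence.
        for i in range(len(tokens) - n + 1):
            ngrams[tuple(tokens[i:i + n])] += 1
    total = sum(ngrams.values())
    # Return 0.0 when no n-grams exist (empty input or sequences shorter than n).
    return len(ngrams) / total if total else 0.0


# Toy generated outputs (hypothetical, whitespace-tokenized).
generated = [
    "the market rose today".split(),
    "the market fell today".split(),
]

d1 = distinct_n(generated, 1)  # 5 unique unigrams out of 8
d2 = distinct_n(generated, 2)  # 5 unique bigrams out of 6
print(f"Distinct-1 = {d1:.3f}, Distinct-2 = {d2:.3f}")
```

In this toy case the two sentences share most of their words, so Distinct-1 is lower than Distinct-2; repetitive long-form output would drive both values down.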