Research On Financial Text Generation Method Based On Knowledge Distillation And Pre-training Model

Posted on: 2022-07-05

Degree: Master

Type: Thesis

Country: China

Candidate: T Z Chen

Full Text: PDF

GTID: 2518306569997479

Subject: Computer technology

Abstract/Summary:

In the financial field, analysts must write research reports, such as macro research reports and event research reports, immediately after breaking news or financial data is released. The task is tedious and urgent, so automatically generating research reports from financial news has become an important research topic. Analysis shows that the task has two major difficulties: financial news contains insufficient information, and the output is long text.

To address the insufficient information, this thesis designs a neural network model based on a teacher-student structure. At the paragraph level, the student model in knowledge distillation uses two variational auto-encoders. When the first variational auto-encoder is trained, its encoder takes both financial news and research reports as input, so that its hidden state distribution can reconstruct the research report. That hidden state distribution is then used as the prior distribution of the second variational auto-encoder, so that the second variational auto-encoder can reconstruct research reports even when given only financial news. At the word level, the BERT-based teacher model is pre-trained with a conditional mask mechanism. During knowledge distillation, the student model's output probability distribution is trained to approximate the teacher model's output probability distribution at the word level, so that the student model can learn the teacher model's background knowledge. Together, these two strategies compensate for the insufficient information.

To address long text generation, this thesis proposes a variational auto-encoder built on pre-trained models: BERT serves as the encoder and GPT-2 as the decoder. Using pre-trained models improves the coherence and readability of the generated text.

For the experimental data set, this thesis crawls mainstream financial portals such as Sina Finance, Radish Investment Research, and Oriental Fortune. In data processing, the financial news set is extracted according to the characteristics of the data set and then manually screened item by item. To verify the model, this thesis carries out extensive experiments, including model ablation experiments, parameter sensitivity experiments, and model performance comparisons. The evaluation metrics are the widely used DISTINCT and ROUGE. The experimental results show that the proposed method performs significantly better than the baseline model and advanced long text generation models, which demonstrates its effectiveness.
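
The abstract gives no formulas, so the following is only a minimal Python sketch of two ideas it names: a word-level distillation loss and the DISTINCT metric. It assumes the loss is the KL divergence from the teacher's to the student's vocabulary distribution, averaged over token positions, and that DISTINCT is the usual ratio of unique n-grams to total n-grams. The function names and the `temperature` parameter are hypothetical and are not taken from the thesis.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Log-probabilities from logits; `temperature` is an assumed knob."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_norm for x in scaled]


def word_level_distillation_loss(
    teacher_logits: Sequence[Sequence[float]],
    student_logits: Sequence[Sequence[float]],
    temperature: float = 1.0,
) -> float:
    """Mean over token positions of KL(teacher || student).

    Assumption: the thesis may use a different divergence or weighting.
    """
    if not teacher_logits or len(teacher_logits) != len(student_logits):
        raise ValueError("teacher and student need the same non-zero length")
    total = 0.0
    for t_row, s_row in zip(teacher_logits, student_logits):
        log_p = log_softmax(t_row, temperature)  # teacher distribution
        log_q = log_softmax(s_row, temperature)  # student distribution
        total += sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return total / len(teacher_logits)


def distinct_n(tokens: Sequence[str], n: int) -> float:
    """Distinct-n: unique n-grams divided by total n-grams (0.0 if none)."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0
```

Under these assumptions the loss is zero when the student reproduces the teacher's distribution at every position and grows as the two diverge, while a higher Distinct-n indicates less repetitive generated text.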

Keywords/Search Tags:

text generation, pre-training model, knowledge distillation, variational auto-encoder

Related items

1	Research And Application Of Answer Generation Model Based On Conditional Variational Autoencoder
2	Research On Text Summarization Generation Technology Based On Topic Model And Variational Self Coding
3	Ontology-Based Dialogue State Tracking And Its Knowledge Distillation Method
4	Research On Method Of Emotional Text Generation In Human-machine Dialogue
5	Scene Text Recognition Based On Attention Mechanism And Knowledge Distillation
6	Research On Key Technologies Of Suggestion Mining And Generation
7	Research On Chinese Text Summarization Based On Deep Learning
8	Research And Application Of Text Quality Assurance Scheme In Video Conference Scene
9	Research On Neural Topic Modeling Method Based On Variational Auto-Encoder
10	Research On Stage-by-Stage Knowledge Distillation And Assistant Model Based Knowledge Distillation