
A Pairwise Analysis Method For Fine-tuning Process Of Pre-training Model

Posted on: 2022-08-11  Degree: Master  Type: Thesis
Country: China  Candidate: J Cai  Full Text: PDF
GTID: 2518306485477214  Subject: Computer technology
Abstract/Summary:
The interpretability of deep learning models has long been an important issue in artificial intelligence, and it has attracted widespread attention since the advent of pre-trained models. Research in this area is still in its infancy. Most existing methods analyze only the pre-training process and ignore the fine-tuning process of the pre-trained model. As the key step in applying a pre-trained model to a specific task, the fine-tuning process has important research value. This thesis studies analytical methods for the fine-tuning process of pre-trained models and seeks to understand the internal mechanisms and principles of the model by analyzing how it differs before and after fine-tuning, with the aim of improving the application value of the model and providing a new research perspective for future model optimization work. The research covers the following aspects:

(1) Different training methods are designed to train and compare the pre-trained model and the fine-tuned model. Comparing the ability of the two models to solve downstream tasks shows that the difference between them is obvious, and that the fine-tuned model performs better overall than the pre-trained model across different tasks. This indicates that the fine-tuned model differs from the pre-trained model and motivates the need to analyze the fine-tuning process.

(2) Features are mined according to different downstream tasks, and feature algorithms are designed to construct corresponding data sets for model analysis. Fine-tuning usually trains the model on the data set of a specific task, so analyzing the fine-tuning process of a pre-trained model inevitably has to take the characteristics of different data sets into account. Drawing on two common cases in downstream task data sets, 9 kinds of linguistic phenomena contained in each data set are analyzed, and a corresponding algorithm is designed for each linguistic phenomenon to construct the corresponding data set. The resulting data sets can be used to analyze and compare the model before and after fine-tuning.

(3) By summarizing the characteristics of each data set, a new pairwise analysis method for analyzing the model is designed. On the one hand, the method draws on the characteristics of the constructed data sets, namely the keyword pairs that exist in each data set: corresponding positive and negative examples are constructed, the similarity between them is calculated, and statistical methods are used to visualize the similarity results. On the other hand, the method also draws on the structural characteristics of the model: it can be applied to the output of each layer, so that the trend in the corresponding capability of each layer can be seen visually.

Analysis of the results shows that the fine-tuning process has essentially no effect on the first five layers of the pre-trained model; that fine-tuning is of little help on general linguistic tasks and mainly improves the task-specific ability, which is contained mainly in the layers after the fifth; that the fine-tuned model has room for improvement in its ability to recognize complex linguistic phenomena; and that different layers of the fine-tuned model have different capabilities. This analysis can help guide the application of the model and make fuller use of its capabilities.
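The layer-wise pairwise comparison described in aspect (3) can be illustrated with a minimal sketch. The abstract does not say which similarity measure or representation the thesis uses, so cosine similarity over one vector per layer is an assumption here, and every name below (`cosine`, `layerwise_similarity`, `mean_by_layer`, the toy vectors) is hypothetical rather than taken from the thesis.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as having no similarity
    return dot / (norm_u * norm_v)


def layerwise_similarity(pos_layers, neg_layers):
    """One similarity per layer for a single positive/negative example pair.

    Each argument is a list with one representation vector per model layer.
    """
    return [cosine(p, n) for p, n in zip(pos_layers, neg_layers)]


def mean_by_layer(per_pair_sims):
    """Average the per-layer similarities over many example pairs."""
    n_pairs = len(per_pair_sims)
    return [sum(column) / n_pairs for column in zip(*per_pair_sims)]


# Toy two-layer "models" with hand-written vectors, not real model outputs.
pretrained_pos = [[1.0, 0.0], [1.0, 0.0]]
pretrained_neg = [[1.0, 0.0], [1.0, 0.0]]
finetuned_pos = [[1.0, 0.0], [1.0, 0.0]]
finetuned_neg = [[1.0, 0.0], [0.0, 1.0]]

before = layerwise_similarity(pretrained_pos, pretrained_neg)
after = layerwise_similarity(finetuned_pos, finetuned_neg)

# Per-layer change in similarity between the positive and negative example.
shift = [a - b for a, b in zip(after, before)]
```

In this toy setup the first layer is unchanged by "fine-tuning" while the second layer separates the positive and negative examples, which is the kind of per-layer trend the method is described as making visible. Plotting `mean_by_layer` over a whole data set, once per model, would give the before-and-after curves.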
Keywords/Search Tags: Pre-training models, Fine-tuning, Linguistic phenomenon, Pairwise analysis, Interpretability