
Research on Salient Feature Visualization Methods for Non-Sequential Text Classification Models

Posted on: 2021-04-22    Degree: Master    Type: Thesis
Country: China    Candidate: C Z Shen    Full Text: PDF
GTID: 2428330629452696    Subject: Computer application technology
Abstract/Summary:
In practical applications, both researchers and users want to understand the basis and process of a model's judgments and to trust its decisions. When a model makes a wrong judgment, knowing the cause of the error is also of great help for further improving the model. While people keep applying and innovating on various models and solving one problem after another, the lack of interpretability of machine learning models is becoming an increasingly acute contradiction. In areas with high requirements on model reliability and accuracy, such as medical expert systems, iris identification, and the Industrial Internet, the lack of interpretability poses a serious threat to the task. Worse still, a model can be attacked by crafting adversarial data that invalidates it or causes it to make wrong decisions. It is therefore important to understand the decision basis of a model.

In fact, interpreting deep learning is a difficult task. These end-to-end black-box models store the learned knowledge in the parameter matrices of the network and apply it directly to the decision-making process, so people cannot accurately distinguish what knowledge is used in a model's decision. At present, there is no unified definition of interpretability, and there is no established method for evaluating the performance of interpretation methods. With the development of deep learning technology, ever more complex models and techniques are proposed to meet the needs of more complex tasks or to further improve performance, which creates even greater difficulties for interpretability research. Compared with computer vision, which focuses on perception, the same word in natural language may have different meanings in different contexts, and different combinations of words may express different meanings, so models are required to understand the data at a cognitive level. At this level, it is more difficult to extract the salient features that determine model decisions in textual data. Research on the interpretability of deep learning models in the text field has long been a difficult problem.

With the widespread application of attention mechanisms, the performance of deep learning models on various tasks has improved significantly. The attention mechanism mimics the way people perceive things: it calculates the parts of the sample that the model should focus on and gives them high weights, while assigning lower weights to irrelevant parts. This not only improves model performance but also brings new approaches to the study of model interpretability, and studying interpretability through saliency feature extraction based on the attention mechanism has become common practice. However, the attention mechanism has inherent shortcomings for salient feature visualization. First, the attention mechanism is generally applied to models that take text as a sequence, such as RNNs and their variants LSTM and GRU, and the Transformer, which relies solely on self-attention for feature extraction; saliency visualization via attention can therefore only be applied to these models. Many text classification models widely used in industry and academia, such as fastText and CNNs, lose text position information in their structure, so the attention mechanism cannot be applied, or can only be applied in a limited way, and such models cannot be effectively explained. Second, this approach of explaining a model by reference to a module still cannot explain the referenced module itself, namely the attention module. Finally, the trained attention parameters may themselves contain errors, and correspondingly, errors may appear in the saliency explanations.

Based on the text classification task in natural language processing, this thesis proposes a new salient feature visualization method for the non-sequential models fastText and CNN. At the same time, attention weight parameters are extracted from LSTM and Transformer models that use the attention mechanism, and visualization methods based on the attention mechanism are studied. Finally, a performance evaluation method for interpretation methods is designed, and the effectiveness of the above four methods is evaluated. While proving the effectiveness of the proposed method, the evaluation also reveals the defects of attention-based extraction methods and the reasons for the models' misjudgments.
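The attention-based saliency visualization summarized above amounts to computing attention weights over the input tokens and rendering them as per-token saliency scores. The following is a minimal illustrative sketch only, assuming a generic scaled dot-product attention over hypothetical encoder outputs; the function names, shapes, and query vector are assumptions for illustration and do not reproduce the thesis's actual models or code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_token_saliency(hidden_states, query):
    """Score each token by scaled dot-product attention against a query vector.

    hidden_states: (seq_len, d) per-token representations from an encoder
                   (e.g. LSTM outputs or a Transformer layer) -- illustrative.
    query:         (d,) task vector (e.g. a classification state) -- assumed.
    Returns normalized attention weights, which can be rendered as a heat map
    over the input tokens for saliency visualization.
    """
    d = hidden_states.shape[-1]
    scores = hidden_states @ query / np.sqrt(d)   # (seq_len,)
    return softmax(scores)

# Toy usage: 5 tokens with random 8-dimensional representations.
rng = np.random.default_rng(0)
tokens = ["the", "film", "was", "surprisingly", "good"]
h = rng.normal(size=(5, 8))
q = rng.normal(size=(8,))
for tok, w in zip(tokens, attention_token_saliency(h, q)):
    print(f"{tok:>12s}  {w:.3f}")
```

For non-sequential models such as fastText or CNNs, which discard position information, such attention weights are unavailable or only partially meaningful, which is the gap the thesis's proposed visualization method targets.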
Keywords/Search Tags: AI Interpretability, Text Classification, Attention Mechanism, Software Engineering Technology, Natural Language Processing