In recent years, the Bidirectional Encoder Representations from Transformers (BERT) pre-training model, which builds representations with a bidirectional Transformer encoder, has shown excellent performance in natural language processing tasks, highlighting the power of masked language modeling. Among the downstream tasks of natural language processing, text classification models are among the most basic and essential automated processing tools. Fully applying the knowledge obtained during BERT pre-training to text classification tasks has long been an active and challenging research topic. In this paper, three improvements are made to the application of the BERT pre-trained language model to text classification. The main research work is as follows:

(1) To address the large amount of redundant and noisy information in long texts, a BERT text classification model based on a convolutional neural network and a long short-term memory network is proposed. The preprocessed text is fed into a pre-trained BERT model to obtain word embeddings, which a bidirectional long short-term memory network then encodes into bidirectional semantic representations, while a convolutional neural network extracts the key local information. The outputs of the two neural networks are weighted by a TF-IDF gating mechanism, and finally the semantic weights are enhanced through an attention mechanism. Experimental results on the THUCNews and SN datasets show that, compared with other classification models, the BERT pre-trained model combined with the two neural networks and the gating mechanism achieves higher text classification performance.

(2) To address the inconsistency between pre-training tasks and downstream tasks in the pre-trained model, a new masked training task is constructed based on prompt learning and the self-attention mechanism. The self-attention mechanism computes the weight relationship between the prompt-learning template and the words in the text, and the masking strategy is changed from random masking to masking the most heavily weighted words. This method not only exploits the knowledge gained from the pre-training task but also directly improves the performance of downstream tasks.

(3) To address the problem that adversarial samples are usually generated by perturbations at the word-embedding level only, which lack a corresponding basis in real text and raise interpretability issues, a self-attention masking mechanism is used to generate adversarial samples by replacing masked words with vocabulary words whose Word2Vec word vectors have cosine similarity scores within a threshold range. Experimental results show that, compared with BERT and other variant pre-trained models, the proposed model achieves better text classification performance on various datasets, verifying the effectiveness of the newly constructed masking task and of the adversarial sample generation.

Figures: 20; Tables: 21; References: 82.
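
To make the fusion described in contribution (1) concrete, the following PyTorch sketch shows one way the BiLSTM and CNN branches over BERT word embeddings could be combined through a TF-IDF gate and attention pooling. The layer sizes, kernel widths, and all module and variable names are illustrative assumptions, not the thesis's actual architecture or hyperparameters.

```python
# Minimal sketch of the gated BiLSTM/CNN fusion in contribution (1).
# Assumes a BERT encoder has already produced per-token embeddings.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedBiLSTMCNN(nn.Module):
    def __init__(self, embed_dim=768, hidden=256, num_classes=10, kernel_sizes=(3, 4, 5)):
        super().__init__()
        # BiLSTM branch captures bidirectional semantic context.
        self.bilstm = nn.LSTM(embed_dim, hidden, batch_first=True, bidirectional=True)
        # CNN branch extracts local key-phrase features at several kernel widths.
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, 2 * hidden, k, padding=k // 2) for k in kernel_sizes]
        )
        self.proj_cnn = nn.Linear(2 * hidden * len(kernel_sizes), 2 * hidden)
        # Simple additive attention over the fused token representations.
        self.attn = nn.Linear(2 * hidden, 1)
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, bert_embeddings, tfidf_weights):
        # bert_embeddings: (batch, seq_len, embed_dim) token vectors from BERT
        # tfidf_weights:   (batch, seq_len) per-token TF-IDF scores in [0, 1]
        lstm_out, _ = self.bilstm(bert_embeddings)                 # (B, L, 2H)
        x = bert_embeddings.transpose(1, 2)                        # (B, E, L)
        cnn_out = torch.cat(
            [F.relu(conv(x))[:, :, :lstm_out.size(1)] for conv in self.convs], dim=1
        )
        cnn_out = self.proj_cnn(cnn_out.transpose(1, 2))           # (B, L, 2H)
        # TF-IDF acts as a per-token gate weighting the two branches.
        gate = tfidf_weights.unsqueeze(-1)                         # (B, L, 1)
        fused = gate * cnn_out + (1.0 - gate) * lstm_out           # (B, L, 2H)
        # Attention pools the token sequence into a single document vector.
        scores = F.softmax(self.attn(fused), dim=1)                # (B, L, 1)
        doc = (scores * fused).sum(dim=1)                          # (B, 2H)
        return self.classifier(doc)
```

One possible reading of the design is that gating with per-token TF-IDF scores lets statistically salient words lean on the CNN's local features, while the remaining tokens rely on the BiLSTM's contextual representation before attention pooling.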
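
Contribution (2) replaces BERT's random masking with masking the words that the prompt-learning template attends to most strongly. The sketch below is a rough illustration under the assumption that head-averaged self-attention weights are already available; the 15% mask budget, the `-100` ignore label, and the function and argument names are conventional assumptions rather than the thesis's exact settings.

```python
# Illustrative attention-guided masking for the new MLM-style task in contribution (2).
import torch

def attention_guided_mask(input_ids, attention, prompt_positions, mask_token_id, mask_ratio=0.15):
    """Mask the text tokens that the prompt-template tokens attend to most strongly.

    input_ids:        (seq_len,) token ids of "template + text"
    attention:        (seq_len, seq_len) self-attention weights averaged over heads/layers
    prompt_positions: indices of the prompt-template tokens inside the sequence
    """
    seq_len = input_ids.size(0)
    # Importance of a token = total attention it receives from the prompt tokens.
    importance = attention[prompt_positions].sum(dim=0)        # (seq_len,)
    importance[prompt_positions] = float("-inf")               # never mask the template itself
    num_to_mask = max(1, int(mask_ratio * (seq_len - len(prompt_positions))))
    top = torch.topk(importance, num_to_mask).indices
    labels = torch.full_like(input_ids, -100)                  # -100 = ignored by the MLM loss
    labels[top] = input_ids[top]
    masked_ids = input_ids.clone()
    masked_ids[top] = mask_token_id
    return masked_ids, labels
```

The returned pair can be fed to a standard masked-language-model head, so the pre-trained weights are reused while the masked positions are now tied to the prompt template rather than chosen at random.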
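
For contribution (3), adversarial samples are built at the word level rather than in embedding space: words selected by the self-attention masking mechanism are swapped for vocabulary words whose Word2Vec vectors are similar but not identical. The sketch below illustrates the idea with a plain in-memory word-vector dictionary and an assumed cosine-similarity band of [0.7, 0.95]; both the `word_vectors` structure and the thresholds are placeholders, not the thesis's actual vocabulary or settings.

```python
# Minimal sketch of word-level adversarial sample construction in contribution (3).
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def adversarial_replace(tokens, masked_positions, word_vectors, low=0.7, high=0.95):
    """Replace each masked token with a semantically close but non-identical word."""
    adversarial = list(tokens)
    for pos in masked_positions:
        word = tokens[pos]
        if word not in word_vectors:
            continue
        # Candidates: vocabulary words whose cosine similarity falls inside the band,
        # so the substitute stays close in meaning without repeating the original word.
        candidates = [
            (w, cosine(word_vectors[word], vec))
            for w, vec in word_vectors.items()
            if w != word
        ]
        candidates = [(w, s) for w, s in candidates if low <= s <= high]
        if candidates:
            adversarial[pos] = max(candidates, key=lambda x: x[1])[0]
    return adversarial
```

Because every substitution is an actual vocabulary word rather than an embedding-space perturbation, the resulting adversarial text remains readable, which is the interpretability advantage the contribution emphasizes.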