| Deep neural networks (DNNs) have been found to be vulnerable to adversarial example attacks, which means that DNN-based text classification systems face potential security threats. Attackers can use adversarial example generation methods to disguise malicious samples as normal ones, evading classifier detection and seriously compromising the security of text classification systems such as sentiment analysis and spam classification. Current research on adversarial example generation and defense for text classification focuses mainly on English, and differences between languages mean that these techniques cannot be transferred directly to Chinese text. Research on adversarial example generation and defense for Chinese text classification helps evaluate and explain the model learning process, reveal model vulnerabilities, and establish security defense mechanisms in advance, thereby promoting the further application and development of Chinese text classification systems. We study adversarial example generation and defense techniques for Chinese text classification to address the vulnerability of DNN-based text classification systems to adversarial example attacks. We propose Word Hit, an adversarial example generation method for Chinese text classification. It uses the morphological and phonological features of Chinese characters to build a candidate pool of morphologically similar and homophonic characters, finds the important words or phrases that affect classification by removing non-contributing clauses and calculating word importance scores, and applies a phonological modification strategy to generate adversarial examples, achieving a black-box attack on Chinese text classification models. The effectiveness and generality of the method across different classification tasks are verified using word-CNN and BiLSTM models, and it is demonstrated that the adversarial examples
generated by this method can be transferred effectively to BERT models and to practically deployed sentiment analysis systems. To address the lack of adversarial example defense methods for Chinese text, we propose Word Revert, an adversarial example defense method for Chinese text classification. The method first obtains "positive text" containing adversarial words by filtering out clauses that do not contribute to the current classification label. A detection network is then combined with a positional importance calculation function to detect adversarial words. Finally, the adversarial words are restored to the original words by calculating candidate and detection scores. Experiments show that the method effectively defends against currently popular adversarial attack algorithms for Chinese text, significantly improves accuracy on adversarial examples at the cost of a small reduction in classification accuracy on clean examples, and obtains better accuracy, recall, and F1 values for adversarial word detection and restoration. Together, these methods form an initial adversarial example attack and defense framework for Chinese text classification, which contributes to establishing security defense mechanisms for Chinese text classification models. |
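
The attack idea summarized in the abstract (score each word by how much it affects the classifier's output, then replace the most important words with homophones) can be illustrated with a minimal sketch. This is not the Word Hit implementation: the leave-one-out scoring rule, the greedy substitution loop, the `HOMOPHONES` table, the `toy_score` classifier, and all function names are assumptions made purely for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical homophone table (illustrative assumption). A real candidate
# pool would be built from pinyin and glyph-similarity resources.
HOMOPHONES: Dict[str, List[str]] = {
    "差": ["叉"],
    "烂": ["滥"],
}


def toy_score(tokens: List[str]) -> float:
    """Stand-in black-box model: returns a 'negative' probability.

    Only the output score is visible to the attacker, which is what makes
    the setting black-box. The keyword rule is an assumption for the demo.
    """
    negative_words = {"差", "烂"}
    hits = sum(1 for t in tokens if t in negative_words)
    return min(1.0, 0.2 + 0.4 * hits)


def word_importance(tokens: List[str],
                    score: Callable[[List[str]], float]) -> List[float]:
    """Leave-one-out importance: score drop when each word is deleted."""
    base = score(tokens)
    return [base - score(tokens[:i] + tokens[i + 1:])
            for i in range(len(tokens))]


def attack(tokens: List[str],
           score: Callable[[List[str]], float],
           threshold: float = 0.5) -> List[str]:
    """Greedily swap the most important words for homophones until the
    model's score falls below the decision threshold."""
    current = list(tokens)
    importance = word_importance(current, score)
    order = sorted(range(len(current)),
                   key=lambda i: importance[i], reverse=True)
    for i in order:
        if score(current) < threshold:
            break  # the classifier's label has already flipped
        best_word, best_score = current[i], score(current)
        for candidate in HOMOPHONES.get(current[i], []):
            trial = current[:i] + [candidate] + current[i + 1:]
            trial_score = score(trial)
            if trial_score < best_score:
                best_word, best_score = candidate, trial_score
        current[i] = best_word
    return current


if __name__ == "__main__":
    sentence = ["服务", "差", "质量", "烂"]
    adversarial = attack(sentence, toy_score)
    print(adversarial, toy_score(adversarial))
```

In this toy setting the substituted sentence still reads the same aloud to a Chinese speaker, while the keyword-based stand-in model no longer recognizes the negative words. A defense along the lines the abstract describes would run the reverse mapping, detecting suspicious characters and restoring the most plausible original.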