| The development of information technology has generated large amounts of data, and big data has increasingly become an asset that companies and governments rely on to make decisions. However, some of this data has not been well utilized. Text classification is a key technique in natural language processing (NLP) and plays an important guiding role in data processing. Current text classification mainly relies on deep learning, but neural networks often overfit during training. Therefore, this paper proposes a method based on adversarial training and BEDA data augmentation to mitigate overfitting and improve the generalization of the model. The main work of this paper is as follows. First, to address overfitting in the convolutional neural network (CNN), we study the principle of word embedding and introduce an adversarial perturbation. After computing the perturbation, we design a perturbed word embedding layer that adds interference to the input data. This embedding layer is combined with a CNN to form a classification model based on adversarial training. The CNN has three channels with filters of different sizes, which improves the model's ability to capture semantic information. Coupled with adversarial training, the model gains a regularization-like effect, and its robustness is also improved. Second, to further reduce overfitting of the classification model, we study the back-translation method and EDA (easy data augmentation), and propose a new back-translation method that uses Chinese as the intermediate language. Combining it with EDA yields the BEDA technique. Building on the resulting data augmentation classification model, we combine it with the proposed adversarial CNN to obtain a CNN classification model based on both data augmentation and adversarial training. This model further mitigates overfitting and is more robust. Finally, we evaluate the text classification models separately on classic text classification datasets. The experimental results show that the CNN with adversarial training improves both the classification accuracy and the robustness of the model. In addition, the data augmentation experiments show that the proposed text classification model based on data augmentation reduces overfitting, and that adding adversarial training further improves its performance. A comparison of the original sentences with the sentences produced by BEDA shows that the semantic information is unchanged, which verifies the reliability of the proposed method. |
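
To make the BEDA idea more concrete, the following is a minimal sketch of how back-translation through Chinese might be combined with EDA-style operations. It is an illustration under stated assumptions, not the implementation used in this paper: the `translate` callable and its `(text, source, target)` signature are hypothetical placeholders for any machine translation system, the function names and default parameters are invented for the example, and the order in which back-translation and EDA are combined is assumed. Only two of the four standard EDA operations (random swap and random deletion) are shown, since synonym replacement and random insertion require a thesaurus.

```python
import random
from typing import Callable, List

# Hypothetical translator signature: (text, source_lang, target_lang) -> text.
# Any machine translation system could be wrapped to fit it.
Translator = Callable[[str, str, str], str]


def back_translate(sentence: str, translate: Translator,
                   pivot: str = "zh", source: str = "en") -> str:
    """Translate into the pivot language (Chinese here) and back again."""
    pivot_text = translate(sentence, source, pivot)
    return translate(pivot_text, pivot, source)


def random_swap(words: List[str], n_swaps: int, rng: random.Random) -> List[str]:
    """EDA random swap: exchange two randomly chosen positions, n_swaps times."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words: List[str], p: float, rng: random.Random) -> List[str]:
    """EDA random deletion: drop each word with probability p, keep at least one."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


def beda_augment(sentence: str, translate: Translator, rng: random.Random,
                 n_swaps: int = 1, p_delete: float = 0.1) -> List[str]:
    """Assumed combination: back-translate first, then apply EDA operations."""
    paraphrase = back_translate(sentence, translate)
    words = paraphrase.split()
    return [
        paraphrase,
        " ".join(random_swap(words, n_swaps, rng)),
        " ".join(random_deletion(words, p_delete, rng)),
    ]


if __name__ == "__main__":
    # Identity stub standing in for a real translator (assumption for the demo).
    def identity(text: str, src: str, tgt: str) -> str:
        return text

    for variant in beda_augment("the movie was surprisingly good",
                                identity, random.Random(0)):
        print(variant)
```

With a real translator in place of the identity stub, the first returned sentence would be a paraphrase of the input, and the remaining sentences would be lightly perturbed versions of that paraphrase that can be added to the training set with the original label.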