With the development of modern information technology and the explosive growth of Internet users, the processing of massive data has become increasingly important, and artificial neural networks (ANNs) play a key role in screening and processing big data. Artificial neural networks have successfully solved many difficult problems in computer vision, machine translation, autonomous driving, and other fields, and they are therefore increasingly applied to text classification in natural language processing (NLP), which is currently both a popular and a challenging research direction. Neural networks can not only process massive data quickly and efficiently but also improve the accuracy of data processing to a certain extent. However, Chinese and English differ considerably at both the character and word levels: Chinese has far more distinct characters and words than English, which creates problems of processing speed, accuracy, and word segmentation in Chinese text classification.

TextCNN is a widely used and studied text classification algorithm based on convolutional neural networks. Aiming at the weak feature extraction of the TextCNN model and its lack of long-distance attention over sequential text, this paper carries out the following work. To address TextCNN's limited feature extraction and its weak attention to global information, this paper designs a text squeeze-and-excitation structure based on TextCNN and the squeeze-and-excitation (SE) mechanism, and proposes an SE-based TextCNN model, called SE-TextCNN. Compared with the original TextCNN, this model can strengthen semantic relationships, enlarge the receptive field, and weight the feature channels to emphasize beneficial features. In this paper, accuracy and the F1 value are used as evaluation metrics for experiments on the THUCNews data set. The experiments explore the influence of batch size and other parameter settings on the classification performance of the model, and the feasibility of the model is verified through comparative experiments against multiple models. According to the experimental results, the accuracy of SE-TextCNN is 0.7 to 1.9 percentage points higher than that of BiLSTM, TextCNN, and other models, and its F1 value is 0.6 to 1.6 percentage points higher.

To overcome the lack of long-range dependency modeling in traditional word-embedding models, this paper then builds on the above model by introducing the Chinese BERT pre-trained model, applying the text squeeze-and-excitation structure, and designing and implementing the BERT-SCNN model, which achieves good feature extraction. Because the output of the BERT model already carries strong semantic correlation, and to make full use of BERT's advantages, this paper uses one-dimensional convolution to extract features from the BERT output and removes the SE operation before the convolution. Finally, in the experimental part, comparative experiments on different parameter-update strategies and several improved models are carried out. The results show that the classification performance of BERT-SCNN on the test set is better than that of the other BERT-based improved models. Finally, based on the SE-TextCNN and BERT-SCNN models proposed in this paper, a news text classification system is designed according to the hardware configurations of different servers.
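The squeeze-and-excitation channel weighting described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the channel count, reduction ratio, and weight values are assumptions made for the example, and in the real model the two excitation layers would be learned jointly with the TextCNN filters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_weight(feature_maps, w1, w2):
    """Squeeze-and-excitation over convolutional feature channels.

    feature_maps: (channels, length) activations from TextCNN's 1-D convolutions.
    w1, w2: weights of the two fully connected excitation layers
            (bottleneck: channels -> channels // r -> channels).
    """
    # Squeeze: global average pooling collapses each channel to one scalar.
    squeezed = feature_maps.mean(axis=1)          # (channels,)
    # Excitation: a small bottleneck MLP maps the channel summary to
    # per-channel importance scores in (0, 1).
    hidden = np.maximum(0.0, w1 @ squeezed)       # ReLU, (channels // r,)
    scale = sigmoid(w2 @ hidden)                  # (channels,)
    # Re-weight: amplify beneficial channels, suppress the rest.
    return feature_maps * scale[:, None]

# Toy run: 8 channels, sequence length 16, reduction ratio r = 4.
rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 16))
w1 = rng.standard_normal((2, 8))   # 8 -> 8 // 4 = 2
w2 = rng.standard_normal((8, 2))   # 2 -> 8
out = se_weight(feats, w1, w2)
assert out.shape == feats.shape
```

Because the scale is computed from a global pooling over the whole sequence, each channel's weight reflects the entire input, which is one way to read the abstract's claim that the SE structure adds global information to TextCNN's otherwise local convolutions.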
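The BERT-SCNN feature extractor, one-dimensional convolution over the encoder output followed by pooling and classification, can be sketched in the same spirit. The sketch below simulates the BERT output with random token vectors (the real model would supply them from the Chinese BERT encoder), and the filter count, window width, hidden size, and class count are illustrative assumptions, not values reported in the paper.

```python
import numpy as np

def conv1d_classify(bert_out, kernels, w_cls):
    """1-D convolution over BERT token embeddings, then max pooling and a
    linear classifier.

    bert_out: (seq_len, hidden) token representations from a BERT encoder.
    kernels:  (n_filters, window, hidden) 1-D convolution filters.
    w_cls:    (n_classes, n_filters) linear classifier weights.
    """
    seq_len, _ = bert_out.shape
    # Slide each filter over the token sequence (valid convolution):
    # one response per filter per window position.
    conv = np.array([
        [np.sum(bert_out[t:t + k.shape[0]] * k)
         for t in range(seq_len - k.shape[0] + 1)]
        for k in kernels
    ])                                   # (n_filters, seq_len - window + 1)
    # Global max pooling keeps the strongest response per filter.
    pooled = conv.max(axis=1)            # (n_filters,)
    logits = w_cls @ pooled              # (n_classes,)
    return int(logits.argmax())

# Toy run: 32 tokens, hidden size 16, 4 filters of width 3, 10 news classes.
rng = np.random.default_rng(1)
bert_out = rng.standard_normal((32, 16))     # stand-in for real BERT output
kernels = rng.standard_normal((4, 3, 16))
w_cls = rng.standard_normal((10, 4))
label = conv1d_classify(bert_out, kernels, w_cls)
assert 0 <= label < 10
```

Note that, per the abstract, no SE re-weighting is applied before this convolution: the BERT output is treated as already semantically rich, so the convolution consumes it directly.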