| With the development of the Internet, a large amount of text data has been generated. To obtain useful information quickly, text classification technology needs to be further improved. With deepening research into deep learning theory and the continuous improvement of computer hardware, classifying text with deep learning is more accurate and efficient than earlier methods. This thesis therefore uses Paddle's encapsulated structures to construct a CNN, and carries out optimization and comparison experiments on that basis. Analysis and verification of the experimental results show that a convolutional neural network can classify Chinese text with good classification quality and high efficiency. The specific research contents are as follows: 1. The preprocessed news text database on the AI Studio platform provided by Baidu is used to generate a data dictionary and preprocess the data, yielding a label data set and a text data set. 2. In the network design, the CNN is constructed using Paddle's encapsulated structures. The CNN consists of a convolution layer, a pooling layer and a fully connected layer. In the convolution layer, convolution kernels of different numbers and sizes are designed to extract text features, and comparative experiments are carried out (in Experimental group 1, the kernel sizes are 3 and 4 and the number of kernels is 2; in Experimental group 2, the kernel sizes are 3, 4, 3 and 4 and the number of kernels is 4; in Experimental group 3, the kernel sizes are 4 and 5 and the number of kernels is 2). In the pooling layer, max pooling is used to reduce dimensionality and compress the text information. In the fully connected layer, the features extracted by all convolution kernels are concatenated and output through a softmax layer. 3. For the optimization of the convolutional neural network: (1) the tanh activation function is used to introduce nonlinearity; (2) the cross-entropy loss is computed in the forward pass, and the error is back-propagated according to this loss; (3) the Adagrad optimization algorithm is used to further optimize the CNN. 4. In testing the network, the results of CNNs with different numbers and sizes of convolution kernels are compared and analyzed in terms of accuracy and loss. The experimental results show that the parameters of Experimental group 1 give better accuracy and loss. Two Chinese text samples, one from the culture class and one from the international class, are then fed to the different networks. Experimental group 1 classifies the culture text well, Experimental group 3 classifies the international text well, and Experimental group 2 classifies these texts poorly. In conclusion, a convolutional neural network can classify Chinese documents with high accuracy and efficiency, but the different network configurations each have their own advantages and disadvantages. |
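The network described in items 2 and 3 can be illustrated with a small pure-Python sketch. It is not the thesis's Paddle implementation: it only mirrors the described steps (convolution over the token sequence with kernel sizes 3 and 4 as in Experimental group 1, tanh activation, max pooling, concatenation, a fully connected layer with softmax, cross-entropy loss, and an Adagrad update). The function names, vocabulary size, embedding width, class count, learning rate, and the use of a single filter per kernel size are all assumptions made for illustration. Back-propagation through the convolution layers is omitted; only one Adagrad step on the output bias is shown.

```python
import math
import random

def conv1d_tanh(embedded, kernel, bias):
    """Slide one kernel (len(kernel) tokens tall) along the sequence, then apply tanh."""
    k = len(kernel)
    feature_map = []
    for start in range(len(embedded) - k + 1):
        total = bias
        for i in range(k):
            for w, x in zip(kernel[i], embedded[start + i]):
                total += w * x
        feature_map.append(math.tanh(total))
    return feature_map

def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    norm = sum(exps)
    return [e / norm for e in exps]

def text_cnn_forward(token_ids, params):
    """Embedding -> conv + tanh -> max pooling -> concatenation -> dense -> softmax."""
    embedded = [params["embedding"][t] for t in token_ids]
    # Max pooling keeps the strongest response of each kernel; the list is the concatenation.
    pooled = [max(conv1d_tanh(embedded, kernel, bias)) for kernel, bias in params["kernels"]]
    logits = [sum(w * p for w, p in zip(row, pooled)) + b
              for row, b in zip(params["fc_w"], params["fc_b"])]
    return softmax(logits)

def cross_entropy(probs, label):
    return -math.log(probs[label])

def adagrad_step(weights, grads, cache, lr=0.01, eps=1e-6):
    """In-place Adagrad: accumulate squared gradients and scale each step by their root."""
    for i, g in enumerate(grads):
        cache[i] += g * g
        weights[i] -= lr * g / (math.sqrt(cache[i]) + eps)

def init_params(vocab_size, emb_dim, kernel_sizes, num_classes, seed=0):
    """Hypothetical initialisation with small random weights (one filter per kernel size)."""
    rng = random.Random(seed)
    rand = lambda: rng.uniform(-0.1, 0.1)
    return {
        "embedding": [[rand() for _ in range(emb_dim)] for _ in range(vocab_size)],
        "kernels": [([[rand() for _ in range(emb_dim)] for _ in range(k)], 0.0)
                    for k in kernel_sizes],
        "fc_w": [[rand() for _ in kernel_sizes] for _ in range(num_classes)],
        "fc_b": [0.0] * num_classes,
    }

# Kernel sizes (3, 4) follow Experimental group 1; all other sizes are assumed.
params = init_params(vocab_size=50, emb_dim=8, kernel_sizes=(3, 4), num_classes=4)
tokens = [4, 17, 23, 8, 31, 2]  # hypothetical dictionary ids of one short text
label = 1                       # hypothetical class index

probs = text_cnn_forward(tokens, params)
loss = cross_entropy(probs, label)

# For softmax + cross entropy, the gradient w.r.t. the output bias is probs - one_hot(label).
grad_b = [p - (1.0 if c == label else 0.0) for c, p in enumerate(probs)]
cache_b = [0.0] * len(grad_b)
adagrad_step(params["fc_b"], grad_b, cache_b, lr=0.01)
loss_after = cross_entropy(text_cnn_forward(tokens, params), label)
```

Changing `kernel_sizes` to `(3, 4, 3, 4)` or `(4, 5)` gives the kernel configurations of Experimental groups 2 and 3. An input must contain at least as many tokens as the largest kernel size, otherwise the corresponding feature map is empty.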