In recent years, with the rapid development of Internet information technology, the volume of Chinese short text on the network has grown exponentially. Such short texts have several distinctive characteristics: they contain few words, their context is unclear, they vary widely, and their content is often non-standard. Extracting valuable information from this large volume of short text has therefore become an urgent problem. Short text classification is the process by which a computer, under a given classification model, assigns a text to one of a set of predefined categories according to its content. It plays an extremely important role in text filtering, retrieval, and index construction, and helps users solve problems more conveniently and quickly. As the demand for short text classification grows, research on short text classification techniques and methods has increasing practical significance.

In this paper, building on a study of the BERT model and the support vector machine (SVM) model, a hybrid TF-BERT-SVM model is proposed to improve the accuracy of short text classification. The main research contents are as follows:

(1) Constructing the TF-BERT model to improve the feature extraction ability of the BERT model. TF-IDF is used to weight the words obtained after short text preprocessing, and the weighted words are input into the BERT model to obtain word vectors that carry weight information, which yields the TF-BERT model. TF-BERT is compared with BERT, random forest, k-nearest neighbor, and recurrent neural network classifiers. Its Accuracy, Recall, and F1 reach 92.4%, 89.0%, and 91.3%, respectively, all higher than those of the other classification models, which demonstrates the validity of the TF-BERT model for short text classification.

(2) Constructing the TF-BERT-SVM model to further improve short text classification. In this paper, a combination of the RBF kernel function and the sigmoid kernel function is proposed to
optimize the parameters of the SVM. The combined kernel function brings together the advantages of the individual kernel functions and can effectively extract both the local and the global features of the samples. The SVM with the combined kernel function is fused with the TF-BERT model to form the TF-BERT-SVM model. To verify the classification performance of the TF-BERT-SVM model, this paper uses a web crawler to collect Douban movie reviews, labels the reviews as expressing positive or negative sentiment, and splits the data into a training set and a test set to train and test the model. Finally, the Accuracy, Recall, and F1 of TF-BERT-SVM reach 93.6%, 91.5%, and 91.8%, respectively, all higher than those of the other classification models, which demonstrates the superiority of TF-BERT-SVM.
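
The combined kernel described in (2) can be illustrated with a minimal sketch. The abstract does not state how the RBF and sigmoid kernels are combined, so the weighted sum below is an assumption, as are the function names and the parameters `weight`, `gamma`, and `coef0`. The RBF kernel is commonly regarded as the local component and the sigmoid kernel as the global one.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def rbf_kernel(x: Vector, y: Vector, gamma: float = 0.5) -> float:
    """Gaussian RBF kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x: Vector, y: Vector, gamma: float = 0.5,
                   coef0: float = 0.0) -> float:
    """Sigmoid kernel: tanh(gamma * <x, y> + coef0)."""
    dot = sum(a * b for a, b in zip(x, y))
    return math.tanh(gamma * dot + coef0)


def combined_kernel(x: Vector, y: Vector, weight: float = 0.5,
                    gamma: float = 0.5, coef0: float = 0.0) -> float:
    """Weighted sum of the RBF and sigmoid kernels.

    Assumption: the combination is convex, with `weight` on the RBF term
    and (1 - weight) on the sigmoid term. The source gives no formula.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (weight * rbf_kernel(x, y, gamma)
            + (1.0 - weight) * sigmoid_kernel(x, y, gamma, coef0))


def gram_matrix(samples: Sequence[Vector], **kernel_params: float) -> List[List[float]]:
    """Pairwise combined-kernel values for a list of feature vectors."""
    return [[combined_kernel(a, b, **kernel_params) for b in samples]
            for a in samples]


if __name__ == "__main__":
    # Toy 2-dimensional vectors standing in for TF-BERT sentence features.
    features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in gram_matrix(features, weight=0.7):
        print([round(value, 4) for value in row])
```

A Gram matrix built this way could be passed to an SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC` with `kernel="precomputed"`. Note that the sigmoid kernel is not positive semi-definite for all parameter settings, so `weight`, `gamma`, and `coef0` would need to be tuned and validated on held-out data.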