
Design And Implementation Of News Classification System Based On Knowledge Distillation

Posted on: 2021-03-19; Degree: Master; Type: Thesis
Country: China; Candidate: G Hu
GTID: 2518306575953619; Subject: Software engineering
Abstract/Summary:
News texts on the Internet are poorly organized, so readers often spend extra time picking out the news that actually matters to them. Some companies therefore build internal news platforms: they define a standard set of news categories, collect news from the Internet, reorganize and classify it, and provide it to their employees. Relying entirely on manual classification of the collected news, however, places an extra burden on the company. This thesis designs and implements a news classification system based on knowledge distillation. The system has three modules: a classification model training module, a news portal module, and a background management module.
The classification model training module trains the deep learning text classification model that predicts news categories. Because the model's inference time must meet the system's response-speed requirements, only a small deep learning model with few parameters and shallow layers can be used. To raise the accuracy of such a model, knowledge distillation is applied during training. To verify the effectiveness and generality of knowledge distillation for news text classification, two models, TextCNN and TextRNN, are selected as student networks, and a BERT text classification model serves as the teacher network. Experimental results show that training with knowledge distillation markedly improves the classification accuracy of both TextCNN and TextRNN, and that TextRNN is more accurate overall than TextCNN. The TextRNN model with the highest accuracy is therefore integrated into the system.
On the news portal page, users can browse the various categories of news, and on the background management page the administrator can efficiently manage news and other information. Users can submit feedback on misclassified news, and the administrator can use this feedback to correct the news category and add the corrected item to the news classification dataset. These mechanisms keep the news categories accurate and keep the dataset up to date and extensible. Regularly retraining the classification model on this dataset keeps the model in step with current news. The system helps companies classify news quickly and accurately and provides high-value news to their employees. It can also serve as a tool for promoting corporate culture and for gathering information about employee interests, which gives it practical application and promotion value.
Keywords/Search Tags: text classification, deep learning, knowledge distillation, classification system
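The abstract does not state the exact training objective used for distillation, so the following is only a minimal sketch of one common formulation: a temperature-softened KL term between the teacher's outputs (for example BERT) and the student's outputs (for example TextRNN), combined with ordinary cross-entropy on the true label. The function names and the `temperature` and `alpha` parameters are illustrative assumptions, not values taken from the thesis.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Log of the temperature-scaled softmax, computed stably."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max before exponentiating
    lse = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - lse for z in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Loss for a single example (assumed form, not the thesis's own).

    soft term: KL(teacher || student) on temperature-softened outputs
    hard term: cross-entropy of the student against the true label
    alpha weights the soft term; temperature**2 rescales it so its
    gradients stay comparable in size to those of the hard term.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    soft = sum(math.exp(lt) * (lt - ls)
               for lt, ls in zip(log_p_teacher, log_p_student))
    hard = -log_softmax(student_logits)[label]
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard


# Hypothetical logits for one news item over three categories.
teacher = [0.2, 1.0, 4.0]   # output of the large teacher model
student = [0.5, 0.8, 2.0]   # output of the small student model
loss = distillation_loss(student, teacher, label=2)
```

In a real training loop the same quantity would be averaged over a batch and minimized with respect to the student's parameters only, with the teacher kept fixed.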