Research On News Content Quality Detection Algorithms Fusing User Comments

Posted on: 2020-01-04   Degree: Master   Type: Thesis
Country: China   Candidate: L Hu   Full Text: PDF
GTID: 2428330590473272   Subject: Software engineering
Abstract/Summary:
Self-media platforms produce a huge volume of news of mixed quality. Manual review is too costly and too slow, so a platform needs an automatic quality detection system. Current work on news content quality detection has several problems: 1) research models only a single quality type, such as false-news detection, and lacks multi-angle, multi-type detection of news content; 2) the platform's prior model is comparatively "tolerant"; 3) content-based news models cannot recognize certain low-quality types, such as plagiarized news; 4) new kinds of low-quality content that media editors create to circumvent platform rules cannot be identified, and it is difficult to define low-quality news types that hold for all platforms; 5) the existing low-quality news categories overlap. In response to these problems, this paper proposes an open version of a low-quality detection algorithm for news content that incorporates user comments. The work helps platforms design rules that block the dissemination of low-quality news, improve recommendation results, and identify low-quality authors. The main research contents are as follows:

(1) To address problem 1, the second chapter treats news content quality detection as a multi-class classification problem. We propose QAM-C, a prior low-quality detection model based on news text content. It combines a recurrent neural network and a convolutional neural network to capture both local and long-distance semantic information in sentences, and uses a hierarchical network structure to model long documents and capture dependencies between sentences. The proposed model is compared with popular machine learning and deep learning models. The experimental results show that our model outperforms all baseline models, and also that news text features alone are not sufficient to identify some categories.

(2) To address problems 2 and 3, the third chapter proposes QAM-CR, a posterior low-quality detection model that incorporates user comments. A sub-module is added to model the comments: it uses the LDA topic model to capture the topics that multiple comments focus on, and uses a memory network together with an attention mechanism to select important comment features and filter out noise. QAM-CR is compared with the model from the second chapter. The experimental results show that it outperforms the baseline methods and our prior model, which further demonstrates the value of the comment features.

(3) To address problems 4 and 5, the fourth chapter recasts content quality detection as an open multi-label classification problem. The model is trained with transfer learning, a reject mechanism is introduced, and the rejection threshold is selected dynamically using statistical methods. Building on the experimental results above, we evaluate the open version of the QAM-CR model, and the results demonstrate its validity.
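As a rough illustration of the reject mechanism mentioned for the open multi-label setting, the sketch below derives a per-class threshold from the scores a classifier assigned to known positive examples, and rejects an article as "unknown" when no class clears its threshold. The abstract does not specify the statistical method, so the rule used here (mean minus k standard deviations, with a lower floor), the function names, and the example labels are all assumptions for illustration, not the thesis's actual procedure.

```python
from statistics import mean, pstdev

def fit_thresholds(scores_by_class, k=2.0, floor=0.5):
    """Assumed rule: threshold = max(floor, mean - k * std) of the
    scores the model gave to known positives of each class."""
    return {
        label: max(floor, mean(scores) - k * pstdev(scores))
        for label, scores in scores_by_class.items()
    }

def predict_open(scores, thresholds):
    """Multi-label prediction with rejection: keep every label whose
    score clears its threshold; if none does, reject as 'unknown'."""
    accepted = [label for label, s in scores.items() if s >= thresholds[label]]
    return accepted if accepted else ["unknown"]

# Hypothetical per-class sigmoid scores on held-out positive examples.
thresholds = fit_thresholds({
    "clickbait": [0.9, 0.8, 1.0],
    "plagiarism": [0.7, 0.9],
})

print(predict_open({"clickbait": 0.95, "plagiarism": 0.30}, thresholds))
print(predict_open({"clickbait": 0.20, "plagiarism": 0.10}, thresholds))
```

Because each label is thresholded independently, an article can receive several labels at once, which suits overlapping categories, while an article that matches no known category is rejected rather than forced into one.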
Keywords/Search Tags: content quality, multi-label classification, machine learning, deep learning