
Optimization Of Content Quality Evaluation Model And Construction Of Evaluation System

Posted on: 2021-03-30
Degree: Master
Type: Thesis
Country: China
Candidate: B C Xu
Full Text: PDF
GTID: 2428330614970771
Subject: Software engineering
Abstract/Summary:
The Internet has become a part of people's lives. People obtain more and more information from it, and news platforms keep growing. "We media" outlets are multiplying and publish countless articles whose content quality is uneven, which makes the work of platform reviewers harder. At present, many companies judge articles by indicators and ignore the content of the articles themselves. Judging quality purely from the text is therefore important for the long-term development of a platform. However, once Chinese text is converted into numerical form, it is difficult to extract informative features from the plain text. How to identify the quality of an article from the text itself is the main problem addressed in this thesis. Drawing on natural language processing, machine learning, and deep learning, the following two modules are designed and implemented.

(1) Identify clickbait. The BERT (Bidirectional Encoder Representations from Transformers) model is used to obtain sentence vectors, and cosine similarity is used to measure the similarity between sentences. The similarity model extracts the topic sentence of the article, and the similarity between the title and the topic sentence is then calculated to identify clickbait titles.

(2) Evaluate the quality of the text. Through supervised learning, grammatical and semantic features of the article content are extracted from multiple dimensions to minimize information loss. The topic distribution obtained from the LDA (Latent Dirichlet Allocation) topic model, the phrase syntactic structure, dependency relations, and keywords are each used as features. Features are selected with the chi-square test, machine learning algorithms are trained on them in a supervised manner, and multiple models are built and evaluated. Finally, the models are fused with a meta-learning framework to improve overall performance. The resulting model can then be used to assess the quality of a text.

Testing shows that BERT is well suited to computing sentence similarity and that the accuracy of the fused text quality assessment model is improved. The system has met the goal of being usable online.
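
The clickbait check in module (1) can be illustrated with a minimal sketch. The abstract does not say how the topic sentence is chosen or where the decision boundary lies, so the following are assumptions: the topic sentence is taken to be the sentence with the highest average cosine similarity to the other sentences, the threshold of 0.5 is arbitrary, and the short vectors are toy stand-ins for BERT sentence vectors. All function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def topic_sentence_index(sentence_vectors):
    """Index of the sentence most similar, on average, to all the others.

    Assumption: this centrality rule stands in for the thesis's
    similarity model, whose details the abstract does not give.
    """
    best_index, best_score = 0, -math.inf
    for i, vec in enumerate(sentence_vectors):
        sims = [
            cosine_similarity(vec, other)
            for j, other in enumerate(sentence_vectors)
            if j != i
        ]
        score = sum(sims) / len(sims) if sims else 0.0
        if score > best_score:
            best_index, best_score = i, score
    return best_index


def is_clickbait(title_vector, sentence_vectors, threshold=0.5):
    """Flag a title whose similarity to the topic sentence is below threshold.

    The threshold value is a hypothetical choice for illustration.
    """
    topic = sentence_vectors[topic_sentence_index(sentence_vectors)]
    return cosine_similarity(title_vector, topic) < threshold


# Toy 3-dimensional vectors standing in for BERT sentence embeddings.
body_vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.0, 0.0, 1.0],
]
on_topic_title = [1.0, 0.1, 0.0]
off_topic_title = [0.0, 1.0, 0.0]

print(is_clickbait(on_topic_title, body_vectors))
print(is_clickbait(off_topic_title, body_vectors))
```

In a real pipeline the vectors would come from a BERT encoder, and the threshold would be tuned on labeled titles rather than fixed in advance.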
Keywords/Search Tags: Clickbait, Text Content Quality, Similarity Algorithm, Text Analysis