Quality Analysis And Assessment For User Collaboratively Generated Content Based On Wikipedia

Posted on: 2019-07-22
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Zhang
GTID: 2348330545958470
Subject: Information and Communication Engineering
Abstract/Summary:
With the development of the Web 2.0 era, the role of Internet users is changing from "consumers" to "producers". Consequently, many User-Generated Services (UGSs), which are based on User-Generated Content (UGC), have become increasingly prevalent. Among the most distinctive are Collaborative UGSs (Co-UGSs), such as Wikipedia, Baidupedia, and GitHub. Despite the success of these applications, their quality has always been in doubt: users collaborate freely and are only weakly organized on these platforms, and there is no strict control over the quality of the collaboration. As a result, "how to evaluate the quality of user-generated content" has become an open problem, and "how to build an automatic quality evaluation service" is an active research topic.

In this thesis, we take English Wikipedia, one of the most successful and typical Co-UGSs, as the research target. Firstly, we propose a history-based hierarchical quality assessment model, which uses a two-layer hierarchical LSTM to model the long edit history of an article (sometimes more than 20,000 revisions). The lower layer takes the revisions within one year as inputs and captures the information within that year, while the higher layer takes the outputs of the lower layer as inputs and models the sequential information across years. With this two-layer structure, the model has access to historical information over a larger span, which improves quality evaluation performance. Secondly, to better represent each revision, we propose a "User2Vec" model, similar to "Word2Vec", which represents each editor by a distributed vector. In this way, user information is incorporated efficiently, and the model also provides a way to analyze the collaborative relationships among users. Experiments demonstrate that our model significantly outperforms the baselines. Lastly, we simplify this model to a one-layer "short-term" version, which considers only recent revisions instead of the whole edit history. This model achieves competitive performance with higher efficiency. We therefore developed an online quality assessment system based on this model, which is available as both a website and a Chrome extension and provides a user-friendly interface.
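The way the edit history is organized for the two model variants can be illustrated with a minimal sketch. It covers only the input preparation, not the LSTM layers themselves, and it is not the thesis implementation. The `Revision` record, the function names `group_by_year` and `recent_window`, and the window size `k` are all assumptions introduced for illustration.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, Sequence, Tuple

# Hypothetical minimal revision record: (ISO 8601 timestamp, editor name).
# A real record would also carry the revision's features.
Revision = Tuple[str, str]


def _sorted_by_time(revisions: Sequence[Revision]) -> List[Revision]:
    """Return the revisions in chronological order."""
    return sorted(revisions, key=lambda rev: datetime.fromisoformat(rev[0]))


def group_by_year(revisions: Sequence[Revision]) -> List[List[Revision]]:
    """Split an edit history into one chronological sequence per year.

    Each inner list is what a lower layer would read; the outer list,
    ordered by year, is what a higher layer would read.
    """
    buckets: Dict[int, List[Revision]] = defaultdict(list)
    for rev in _sorted_by_time(revisions):
        buckets[datetime.fromisoformat(rev[0]).year].append(rev)
    return [buckets[year] for year in sorted(buckets)]


def recent_window(revisions: Sequence[Revision], k: int) -> List[Revision]:
    """Keep only the k most recent revisions (the "short-term" input)."""
    if k <= 0:
        return []
    return _sorted_by_time(revisions)[-k:]
```

Calling `group_by_year(history)` on a list of `(timestamp, editor)` pairs yields the per-year sequences for a hierarchical model, while `recent_window(history, k)` yields the truncated input for a one-layer model. The efficiency gain of the short-term variant comes from the second function: its input length is bounded by `k` no matter how long the article's history is.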
Keywords/Search Tags: Wikipedia, User-Generated Content, Quality Assessment, Deep Learning, Recurrent Neural Networks