Managing Quality, Identity and Adversaries in Public Discourse with Machine Learning

Posted on: 2013-03-28

Degree: Ph.D.

Type: Dissertation

University: Drexel University

Candidate: Brennan, Michael

GTID: 1458390008966941

Subject: Computer Science

Abstract/Summary:

Automation can mitigate the problems that arise when managing quality and identity in public discourse on the web at scale. Discourse needs to be curated and filtered, and anonymous speech has to be supported while adversaries are handled. Relying on human curators or analysts does not scale, and content can be missed. These scaling and management problems include the limits of crowdsourced comment rating systems, flaws in crowdsourced topic identification systems, and the identification or anonymization of large numbers of authors. As rating and topic identification systems scale, relevant, high-quality discourse is missed. Authorship recognition without automation is time-consuming, costly, and does not scale. Using machine learning to automate authorship recognition raises serious privacy and anonymity concerns, and deception in writing style can be difficult to detect. This work replicates the comment ratings and topic identification of crowdsourcing systems that currently rely on human participation to be effective. It also demonstrates the ability of machine learning to identify authors quickly and accurately, as well as methods to circumvent this identification. Finally, this work presents novel data sets as a foundation for future research in comment classification and adversarial authorship recognition.
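
To make the idea of automated authorship recognition concrete, the following is a minimal sketch and not the dissertation's method. It assumes character trigram counts as the writing-style features and cosine similarity against a per-author profile as the classifier. The function names (`char_ngrams`, `build_profiles`, `attribute`) and the sample texts are hypothetical and invented purely for illustration.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Count overlapping character n-grams, a common stylometric feature."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profiles(samples):
    """Merge each author's known texts into a single n-gram profile."""
    profiles = {}
    for author, text in samples:
        profiles.setdefault(author, Counter()).update(char_ngrams(text))
    return profiles


def attribute(profiles, text):
    """Return the candidate author whose profile is most similar to text."""
    features = char_ngrams(text)
    return max(profiles, key=lambda author: cosine(profiles[author], features))


# Toy training texts (invented for this sketch, not from the dissertation's data sets).
samples = [
    ("alice", "I think that, perhaps, we ought to reconsider; indeed, we must."),
    ("alice", "Perhaps, indeed, one ought to think again; we must reconsider."),
    ("bob", "lol no way thats gonna work. gonna try anyway tho."),
    ("bob", "no way lol. thats not gonna fly tho, gonna skip it."),
]
profiles = build_profiles(samples)
guess = attribute(profiles, "thats gonna be a no tho lol")
```

On this toy data, `guess` is `"bob"`, because the unseen text shares far more trigrams with that profile. The same features also hint at the adversarial side the abstract mentions: an author who deliberately imitates another person's habits shifts these counts, which is one reason deception in writing style can be difficult to detect.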

Keywords/Search Tags:

Discourse, Quality, Authorship recognition, Machine learning

Related items

1	Research On Authentication Of Online Authorship Or Article
2	Research And Application Of Chinese Discourse Relation Recognition Method
3	The Effects Of Academic Paper Quality On The Network Positions Of Stars In Co-authorship Network
4	Research On Authorship Identification For Chinese Texts
5	Machine learning method for authorship attribution
6	Research On Micro Discourse Nuclearity And Relation Recognition In Chinese
7	Study On The Authorship Mining For Chinese E-mail Documents Based On SVM
8	Research On The Method Of Chinese Micro-Discourse Analysis
9	Research On Machine Translation Quality Estimation Methods Considering Discourse Relation Information
10	Study On The Authorship Identification For Chinese E-mail Documents Based On The Literary Style