A data collection system for rumor detection

Posted on: 2018-11-20
Degree: M.S
Type: Thesis
University: University of Delaware
Candidate: Wang, Ye
Full Text: PDF
GTID: 2448390002498803
Subject: Computer Engineering
Abstract/Summary:
Large amounts of unsubstantiated and unverified information, known as rumors, are created and propagated through the Internet because posting information online is easy and supervision is lacking. These rumors may confuse users and cause social unrest. To prevent these negative effects, rumor detection based on machine learning has been well studied. Almost all of these machine-learning-based methods rely on a large rumor dataset, which makes a large collection of rumor-related data highly desirable. However, current rumor collection methods are partially manual and usually specific to a single platform.

In this thesis, we propose a rumor collection system that automatically collects rumor-related data from both search engines and social media. It consists of two main parts. First, a query generator produces a set of queries based on the user's input. This avoids using the raw user input directly as the search query, which may cause the search to fail. Second, a novel rumor crawler collects rumor-related data using the generated queries.

To validate our rumor collection system, experiments were conducted on tweets posted from January 2016 to March 2017. Results for 50 different rumors show that, compared with the widely used Twitter Search API, our system crawls more rumor-related data, with an average increase of 3.589 times. Furthermore, for some rumors, our system remains effective when the Twitter Search API returns no results.
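
The two-part design described in the abstract (a query generator followed by a crawler) can be illustrated with a minimal sketch. The abstract does not say how the queries are actually generated, so everything below is an assumption made for illustration: the stopword list, the keyword-subset strategy, and the names `generate_queries`, `collect`, and `search` are all hypothetical and are not taken from the thesis.

```python
from itertools import combinations

# Hypothetical stopword list (an assumption; a real system would use a fuller one).
STOPWORDS = {"a", "an", "the", "is", "are", "was", "were", "of", "to",
             "in", "on", "and", "or", "that", "for", "with", "by"}


def generate_queries(user_input, min_terms=2):
    """Turn free-text user input into several shorter keyword queries.

    Assumed strategy: drop stopwords, then emit every keyword subset of at
    least `min_terms` words, most specific (longest) first.
    """
    keywords = []
    for raw in user_input.split():
        word = raw.strip(".,!?;:\"'()").lower()
        if word and word not in STOPWORDS and word not in keywords:
            keywords.append(word)

    queries = []
    for size in range(len(keywords), min_terms - 1, -1):
        for combo in combinations(keywords, size):
            queries.append(" ".join(combo))

    # Fall back to a single query when there are too few keywords.
    if not queries and keywords:
        queries = [" ".join(keywords)]
    return queries


def collect(queries, search):
    """Run every query through `search` and merge the results.

    `search` is a caller-supplied function (hypothetical here) that takes a
    query string and returns a list of dicts, each with an "id" key.
    Duplicates returned by different queries are kept only once.
    """
    seen = {}
    for query in queries:
        for item in search(query):
            seen.setdefault(item["id"], item)
    return list(seen.values())
```

Passing the search backend in as a function keeps the sketch independent of any one platform, which matches the abstract's goal of collecting from both search engines and social media. For example, `generate_queries("The pope endorsed a candidate")` yields one three-word query and three two-word queries, each of which can then be handed to `collect`.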
Keywords/Search Tags: Rumor, System, Data, Search