TextPioneer: Exploring Topical Lead-Lag Relationship Across Multiple Corpora

Posted on: 2015-05-22
Degree: Master
Type: Thesis
Country: China
Candidate: H Wei
Full Text: PDF
GTID: 2308330470467786
Subject: Computer application technology
Abstract/Summary:
In this paper, we present a visualization tool that helps users explore and analyse topical lead-lag relationships across multiple corpora using data mining and text visualization techniques. Identifying which text corpus leads the others on a given topic is a challenging problem of considerable interest to researchers. Recent work on lead-lag analysis has mainly focused on estimating the overall leads and lags between two corpora. However, real-world applications need to understand lead-lag patterns both globally and locally.

After reviewing prior work in data mining and visualization, we introduce TextPioneer, an interactive visual analytics tool for investigating lead-lag relationships across corpora at multiple levels. In particular, we extend an existing lead-lag analysis approach to derive multi-granular results. To convey multiple perspectives on these results, we design two visualizations, inspired respectively by radial space-filling visualization and the double-helix structure of DNA. Furthermore, we enable smooth communication between the visualization and analysis modules, as well as among the different visualizations that encode different aspects of the results. We have applied our work to several corpora, and our evaluation shows its promise, especially in supporting text comparison at different levels of detail.
Keywords/Search Tags: Text Visualization, Lead-Lag Analysis, Interactive Visualization Tool