
Self-Tracking Instrumentation Agents and Analytics to Enable Knowledge Creation and Collaborative Intelligence from Analytical Workflow

Posted on: 2018-05-19
Degree: Ph.D
Type: Dissertation
University: North Carolina State University
Candidate: Jones, Paul
Full Text: PDF
GTID: 1478390020455916
Subject: Computer Science
Abstract/Summary:
The 'self-tracking' (or 'quantified-self') movement is enabling people to gain increased self-awareness of aspects of their daily lives, such as health and fitness, and is enabling analytics developers to create useful services based on the collected data. This is achieved through a combination of sensors, algorithms and visualizations. More recently, it has become apparent that the self-tracking movement also has the potential to help knowledge workers gain self-awareness of their processes for analysis of complex information - individually and/or collaboratively - and to assist them with knowledge creation, collaborative intelligence, and other tasks.

This research is motivated by a study of how students and intelligence analysts work with their computers to find and manage information, the difficulties that they encounter, and the opportunities that aren't being fully explored as a result. We observe that these groups often work in teams to analyze information and are regularly forced to switch between multiple tasks. In such environments, the principal difficulties we discovered are in keeping track of information artifacts and sources during knowledge creation tasks, and especially in doing so collaboratively. These difficulties can prevent knowledge workers from clearly reasoning about analytic conclusions, and from backing up those conclusions with detailed provenance.

We propose a novel method for mathematically modeling analytical workflows that links heterogeneous information sources with the tasks that they correspond to, and that combines information from collaborating users. Inspired by recent ideas in the deep learning and graph analytic communities, we propose a new abstraction of the idea of embedding entities in low-dimensional spaces, which allows us to co-represent users, documents and tasks in a common vector space, and hence to automatically infer links between these different types of entities. While other applications of this method are possible, we focus on the challenge of inferring associations between documents (including files and URLs) and the individual and shared tasks to which they correspond - we refer to this as 'task-centric document curation'. Our method allows us to easily leverage a combination of content-based features and task-related features, as well as other characteristics of individual and team-based workflows, such as temporal patterns. We also show how our algorithm can operate in close to real-time (in order to facilitate timely user feedback) and can work well with few training examples. To demonstrate our method in practice, we introduce a 'recent-work dashboard' exemplar application that displays the inferred associations and gathers feedback from users.

We evaluate our approach using multiple studies with groups of real-world knowledge workers. Results from empirical evaluation of several algorithm variants against three analytic workflow datasets (collected from students and intelligence analysts) are presented, along with an exploration and evaluation of ideas to further improve the classification accuracy that can be achieved from the vector space embeddings. One such method that proved successful is to incorporate coarse-grained temporal characteristics of tasks. We show how to leverage relationships between different entity types to increase classification accuracy by up to 20% over simpler baselines, using as little as 10% labelled data from users.

To enable the development of these analytics, we first needed to capture data from longitudinal studies of analytic workflow, and to facilitate iterative prototype development. To achieve these aims, we address two further research challenges. The first is the challenge of creating self-tracking 'instrumentation' (or measurement) agents capable of capturing the features needed by task-centric document curation algorithms on a continuous basis. By learning from previous self-tracking prototypes that did not prove suitable for continuous use, we propose novel agents that are entirely passive (and hence non-disruptive to users), and we employ careful feature selection in order to create state-of-the-art desktop instrumentation agents. The second challenge is the creation of active 'journaling' interfaces to facilitate user provision of task labels for individual and collaborative tasks. This is tackled using a hybrid 'task-tree' interface consisting of a mixture of elements based on a fixed taxonomy and an adaptable 'folksonomy', combined with carefully controlled user prompting.

Finally, we present an alternative method for better enabling knowledge creation tasks by helping knowledge workers keep track of finer-grained information artifacts - this time in the context of 'claims' they are making in written reports, and the associated 'evidence' they might be using to justify their claims. We use a large corpus of screenshots collected from one of our instrumentation agents during a controlled analysis study to demonstrate a novel algorithm to automatically associate these two types of information artifacts - we refer to this as 'automatic provenance generation'.

Whilst this dissertation focusses on the challenges involved in gathering instrumentation and journaling data, and on the creation of instrumentation-enabled analytics (such as task-centric document curation), we also discuss the wider implications of such self-tracking, workplace-monitoring and smart digital assistant technologies. In particular, these technologies necessitate a careful balance of human factor considerations (such as privacy intrusion) with the future potential of such technologies to save knowledge workers' time and to improve the quality of their work.
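
As a rough illustration of the idea of co-representing documents and tasks in a common vector space, the sketch below places each task at the centroid of its labelled documents and assigns an unlabelled document to the nearest task by cosine similarity. This is a simple centroid baseline written for illustration only; it is not the dissertation's algorithm, and the function names (`embed`, `task_vectors`, `infer_task`), the bag-of-words features and the toy data are all assumptions.

```python
import math
from collections import Counter, defaultdict


def embed(text, vocab):
    """Map text to a unit-length bag-of-words vector over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def task_vectors(labelled, vocab):
    """Place each task at the centroid of its labelled document vectors."""
    groups = defaultdict(list)
    for text, task in labelled:
        groups[task].append(embed(text, vocab))
    return {
        task: [sum(col) / len(vecs) for col in zip(*vecs)]
        for task, vecs in groups.items()
    }


def infer_task(text, tasks, vocab):
    """Return the task whose vector is closest to the document vector."""
    doc = embed(text, vocab)
    return max(tasks, key=lambda t: cosine(doc, tasks[t]))


# Hypothetical toy data: (document text, task label) pairs.
labelled = [
    ("budget forecast spreadsheet quarterly", "finance-report"),
    ("quarterly revenue budget notes", "finance-report"),
    ("network intrusion log analysis", "security-review"),
    ("firewall log intrusion alerts", "security-review"),
]
vocab = sorted({w for text, _ in labelled for w in text.lower().split()})
tasks = task_vectors(labelled, vocab)

print(infer_task("new firewall alerts log", tasks, vocab))
```

Because documents and tasks live in the same space, the same similarity function could in principle be extended to other entity types such as users; the abstract does not specify how the dissertation does this, so that extension is left out here.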
Keywords/Search Tags: Self-tracking, Knowledge creation, Work, Instrumentation agents, Analytic, Intelligence, Tasks, Collaborative