Efficient recovery from repeated domain shifts in streaming data

Posted on: 2017-04-27
Degree: M.S.
Type: Thesis
University: University of Maryland, Baltimore County
Candidate: Gandhewar, Richa
Full Text: PDF
GTID: 2465390014474176
Subject: Computer Science
Abstract/Summary:
Humans have a remarkable ability to learn how to learn, what to learn, and when to learn. We are able to assess the utility of learned knowledge to achieve an objective and adapt our learning strategies accordingly. Likewise, we want machine learning systems trained in one domain to adapt well to different domains. If a classifier system encounters a distribution that it has seen previously, it should remember the previously learned knowledge and classify accordingly. This thesis addresses the problem of recovering efficiently from repeated domain shifts in streaming data for a classifier system.

This problem can be divided into two sub-problems. The first sub-problem is detecting a domain shift in a data stream representing learned knowledge. Following Dredze, Oates, & Piatko (2010), we use the A-distance (Kifer, Ben-David, & Gehrke 2004) computed over the absolute value of the classification margin of support vector machines for this task. The second sub-problem is deciding what action to take after a domain shift is detected. We propose and evaluate approaches to training new models and deciding when to reuse old models to minimize cost and maximize accuracy in the face of repeated domain shifts. We use the Amazon product reviews dataset to evaluate our algorithm.
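To make the detection step concrete, here is a minimal sketch of an A-distance check on a stream of SVM margins. It is not the thesis's implementation. It assumes the family A is the set of intervals on the real line, for which the empirical A-distance, 2 * sup |P(A) - P'(A)|, reduces to twice the range of the difference between the two empirical CDFs. The function names `a_distance_intervals` and `detect_shift`, the window size, and the threshold are all hypothetical choices for illustration; the abstract does not specify them.

```python
from bisect import bisect_right


def a_distance_intervals(reference, current):
    """Empirical A-distance between two samples of real values.

    Assumption: A is the family of intervals on the real line.
    The result lies in [0, 2]; 0 means the samples look identical.
    """
    if not reference or not current:
        raise ValueError("both windows must be non-empty")
    ref = sorted(reference)
    cur = sorted(current)
    # D(x) = F_ref(x) - F_cur(x) at every observed value.
    # D is 0 to the left of all points, so start the list with 0.
    diffs = [0.0]
    for x in sorted(set(ref) | set(cur)):
        f_ref = bisect_right(ref, x) / len(ref)
        f_cur = bisect_right(cur, x) / len(cur)
        diffs.append(f_ref - f_cur)
    # The largest probability gap over any interval is max(D) - min(D).
    return 2.0 * (max(diffs) - min(diffs))


def detect_shift(margins, window=50, threshold=0.5):
    """Return the index of the first margin at which a shift is flagged.

    The first `window` absolute margins form the reference sample; each
    later, non-overlapping-with-reference sliding window is compared to it.
    Returns None if the distance never exceeds `threshold`.
    Both parameters are illustrative defaults, not values from the thesis.
    """
    reference = [abs(m) for m in margins[:window]]
    for end in range(2 * window, len(margins) + 1):
        current = [abs(m) for m in margins[end - window:end]]
        if a_distance_intervals(reference, current) > threshold:
            return end - 1
    return None


if __name__ == "__main__":
    # Synthetic stream: confident margins, then margins near the boundary.
    stream = [1.0] * 100 + [-0.1] * 100
    print(detect_shift(stream))
```

In this sketch the margins would come from a trained SVM's decision function applied to each incoming example; a drop in absolute margin signals that examples are landing closer to the decision boundary, which is the cue the detector looks for. No labels are needed for the check itself.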
Keywords/Search Tags: Repeated domain shifts, Learn