
Issues and methodologies in designing databases in distributed environments -- Privacy, quality, resources

Posted on: 2008-02-26
Degree: Ph.D
Type: Thesis
University: University of California, Irvine
Candidate: Yu, Xingbo
Full Text: PDF
GTID: 2448390005963113
Subject: Computer Science
Abstract/Summary:
Traditional data management systems store data at a central location, with well-structured indices. However, the emergence and rapid development of network technologies have pushed data management toward a new horizon. Data is no longer stored centrally, but is generated and stored at distributed sources. Moreover, data may not exist at all in advance and may only be produced upon request. The data sources are situated in, and accessed through, network environments such as the Internet, cellular networks, or wireless sensor networks. Given the dynamic nature of most networks, precise data collection and query processing are hard to achieve. How to collect the necessary data from the sources to answer user queries becomes a distinct challenge in data management.

My contribution in this thesis is to address these issues with adaptive techniques. Adaptation is explored from two perspectives. First, data quality is adapted to meet the final query requirements, which are approximate. We look into a broad range of quality definitions, including numerical precision, query response time, and user privacy needs. Second, network resources are adapted as well in order to achieve querying goals such as long network lifetime and timeliness. The resources to be adapted include the power sources of network nodes and the communication infrastructure and media.

In this thesis, I first explore quality adaptation and present a middleware framework for data collection and query processing that allows flexible conversion between user quality requirements and data collection precision. The framework is instantiated and validated with a mobile target tracking application and a data archiving application. It is shown that this conversion is sometimes a complicated task, especially when probabilistic measures are involved. I further investigate resource adaptation in data collection and query processing techniques that fully exploit the interaction and collaboration between active network nodes. One application studied is a fully in-network query processing technique that allows a full array of adaptations to be implemented. Another application takes the operation status of network nodes into consideration and solves the problem of state scheduling for energy-efficient, fast data aggregation. With the ever-increasing importance of privacy preservation, I also study techniques for data privacy, which can be viewed as a different quality measure. The investigated application domains are data collection in location-based services and trajectory data anonymization. It is shown that maintaining the desired user privacy while collecting appropriate data for efficient query processing is a very challenging issue.
Keywords/Search Tags:Data, Privacy, Query processing, Quality, Sources, User, Network