| Data structure is a cornerstone of computer science.In reality,data structures support the operations of various computing devices;on the other hand,data structure is a basic course for computer science.Nowadays,the big data puts forward higher requirements for data structures.We not only want the data structure algorithms to be faster,but also hope the data structures consume less space.This thesis studies the data structure from two aspects.For the easier low-dimensional data structure problems,we consider succinct data structures,that is to say,to solve the problem with the space usage close to the information-theoretical lower bound.For the harder high-dimensional data structure problems,we study how to accelerate the oper-ations with parallel computing;we also try to prove lower bound for high-dimensional data structure problem to show that there does not exist efficient data structure algo-rithm,approaching the proof of“curse of dimensionality”.The thesis consists of two parts.The first part focuses on the succinct data structures for low-dimensional data structure problems.First,we prove lower bound for the range minimum query problem.Our lower bound is the first lower bound for the problem,and the lower bound shows that the state-of-the-art algorithm is optimal in the regime of constant time.We refine the state-of-the-art lower bound technique,so that we not only can prove the average case lower bound,but also can prove the higher worst-case lower bound.Then we consider the dynamic approximate set membership query(Bloom filter)problem.We consider a setting that is important in practice and is interesting but rarely considered by the theory community:the space usage of the dynamic data structure can be dynamically resized,and is required to depend only on the size of the current database,rather than the maximum size estimated in advance.On the one hand,it saves considerable space;on the other hand,it represents the important idea of“scalability”in practice.In this setting,we present the best data structure algorithm in all senses for the dynamic approximate set membership query:In terms of space,it is the only succinct data structure that solves this problem in the natural case of n?logu;in terms of time,all the update and query operations can be done in the worst-case constant time.Our data structure is a positive answer for an open problem left by Pagh,Segev and Wieder(FOCS 2013).Finally,we consider another fundamental data structure problem,the dictionary problem.We come up with a brand-new idea to deal with the“overflow”issue in the classic dictionary data structures which causes waste of space.With this strategy,in the commonplace case u=poly n,our extra space overhead is reduced from≈O(n(log n)2/3)bits to O(n log...log n)bits,while also ensuring that both update and query operations can be done in the worst-case constant time.Our data structure is a positive answer for an open problem left by Arbitman,Naor and Segev(FOCS 2010).The second part focuses on the nearest neighbor search problem,the central prob-lem of high-dimensional data structure problems.We first consider the randomized approximate nearest neighbor search in parallel,and prove the asymptotically optimal trade-off between parallelism and memory ac-cess in the polynomial space for this problem.In particular,we present efficient data structure algorithms and prove lower bound.The data structure upper bounds use the classical linear dimensionality reduction method,and the multi-way search.Our lower bound is proven by constructing a communication protocol with the data structure,then using the round elimination technique to prove the lower bound in the communication model.Then we try to prove lower bound for theλ-near neighbor problem with the state-of-the-art data structure lower bound technique.We first construct a communication protocol that characterize the cell-sampling technique,showing that we can apply cell-sampling technique in a simpler way with this protocol.Then we present a counter-example,a communication protocol,to show that it is not an easy task to prove lower bound for theλ-near neighbor problem with the state-of-the-art technique,the cell-sampling.Finally,we consider a computational model which can be thought of comprising all hashing-based algorithms.By applying the analysis of boolean functions in a simpler and straightforward way,we simplify the proof of the lower bound forλ-near neighbor search problem. |