This thesis presents a new architecture and optimizations for MapD, a database server that uses a hybrid multi-CPU/multi-GPU architecture for query execution and analysis. We tackle the challenge of partitioning data across multiple nodes with many CPUs and GPUs by means of an indexing framework. We implement a QuadTree spatial partitioning scheme and demonstrate how using the index improves the latency of many queries compared with running them without an index. Moreover, we tackle the challenge of processing many queries (perhaps issued concurrently) under tight latency constraints, e.g., for visualization. We implement a software architecture that schedules concurrent client query requests so that many queries share processing in a single pass through the data ("shared scans"). Our experiments show orders-of-magnitude improvements in query throughput for both skewed and non-skewed workloads when using shared scans as opposed to serial execution....
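The "shared scans" idea can be illustrated with a minimal Python sketch. It is not MapD's implementation: the row layout, the function names, and the example queries are all hypothetical and chosen only to contrast one pass per query (serial execution) with one pass for all pending queries (shared scan).

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical row layout for illustration: (x, y, value).
Row = Tuple[float, float, int]
Predicate = Callable[[Row], bool]


def serial_counts(rows: List[Row], queries: Dict[str, Predicate]) -> Dict[str, int]:
    """Baseline: every query makes its own full pass over the data."""
    return {
        name: sum(1 for row in rows if pred(row))
        for name, pred in queries.items()
    }


def shared_scan_counts(rows: List[Row], queries: Dict[str, Predicate]) -> Dict[str, int]:
    """Shared scan: a single pass over the data serves all pending queries."""
    counts = {name: 0 for name in queries}
    for row in rows:  # each row is read once
        for name, pred in queries.items():  # and tested against every query
            if pred(row):
                counts[name] += 1
    return counts


# Assumed sample data and queries, not taken from the thesis.
rows = [(0.1, 0.2, 5), (0.7, 0.9, 12), (0.4, 0.4, 8), (0.9, 0.1, 3)]
queries = {
    "left_half": lambda r: r[0] < 0.5,
    "big_values": lambda r: r[2] >= 8,
    "upper_right": lambda r: r[0] >= 0.5 and r[1] >= 0.5,
}

print(shared_scan_counts(rows, queries))
```

Both functions return the same answers; the difference is that the shared version reads each row once regardless of how many queries are pending, which is where a throughput gain would be expected when data access dominates the cost.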
Papers by Saher Ahwal