An Efficient Decentralized Multidimensional Data Index: A Proposal

2018, Proceedings of the 7th International Conference on Data Science, Technology and Applications

Abstract

The main objective of this work is to propose a decentralized data structure for storing a large amount of data, under the assumption that it is not possible or convenient to host all the data on a single workstation. The index is distributed over a computer network, and the performance of its search, insert, and delete operations is close to that of traditional indices hosted on a single workstation. It is based on k-d trees and is distributed across a network of "peers", each of which hosts a part of the tree and communicates with the other peers by message passing. In particular, we propose a novel version of the k-nearest neighbour algorithm that starts the query in a randomly chosen peer and terminates it as soon as possible. Preliminary experiments have shown that in about 65% of cases a query started in a random peer does not involve the peer containing the root of the tree, and that in 98% of cases the query terminates in a peer that does not contain the root of the tree.

2 RESEARCH IDEAS AND RESULTS

This section introduces the problem description and our proposal to cope with it.
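
As background for the k-d tree on which the index is based, the following is a minimal single-machine sketch of a k-d tree with a standard k-nearest neighbour search that starts at the root. It is not the decentralized algorithm proposed in the paper: there are no peers, no message passing, and no random starting point. The names Node, build, sq_dist and knn are hypothetical and chosen only for illustration.

```python
import heapq
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    """One k-d tree node (hypothetical layout, not taken from the paper)."""
    point: Point
    axis: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Build a k-d tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(
        point=ordered[mid],
        axis=axis,
        left=build(ordered[:mid], depth + 1),
        right=build(ordered[mid + 1:], depth + 1),
    )


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn(root: Optional[Node], query: Point, k: int) -> List[Point]:
    """Return the k points nearest to query, closest first (root-started search)."""
    # Max-heap of the best k candidates so far, stored as (-squared distance, point).
    heap: List[Tuple[float, Point]] = []

    def visit(node: Optional[Node]) -> None:
        if node is None:
            return
        d = sq_dist(node.point, query)
        if len(heap) < k:
            heapq.heappush(heap, (-d, node.point))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, node.point))
        # Descend first into the side of the splitting plane that contains the query.
        diff = query[node.axis] - node.point[node.axis]
        near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
        visit(near)
        # Visit the other side only if the plane is closer than the current k-th best.
        if len(heap) < k or diff * diff < -heap[0][0]:
            visit(far)

    visit(root)
    return [p for _, p in sorted(heap, key=lambda t: -t[0])]
```

For example, knn(build(points), query, 2) returns the two stored points closest to query. In this sketch every query enters at the root, which is the behaviour the proposal aims to avoid by starting queries in a randomly chosen peer.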