Index interpolation

Woong-Kee Loh; Sang-Wook Kim; Kyu-Young Whang

2000, Proceedings of the ninth international conference on Information and knowledge management - CIKM '00

8 pages

In this paper, we propose a subsequence matching algorithm that supports normalization transform in time-series databases. Normalization transform enables finding sequences with similar fluctuation patterns although they are not close to each other before the normalization transform. Application of the existing whole matching algorithm supporting normalization transform to the subsequence matching is feasible, but requires an index for every possible length of the query sequence, causing serious overhead on both storage space and update time. The proposed algorithm generates indexes only for a small number of different lengths of query sequences. For subsequence matching, it selects the most appropriate index among them. We can obtain better search performance by using more indexes. We call our approach index interpolation. We formally prove that the proposed algorithm does not cause false dismissal. For performance evaluation, we have conducted experiments using the indexes for only five different lengths out of the lengths 256 to 512 of the query sequence. The results show that the proposed algorithm outperforms the sequential scan by up to 14.6 times on the average when the selectivity of the query is 10^-5.
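To make the abstract's terms concrete, the following is a minimal sketch of the sequential-scan baseline it mentions, not the proposed index-based algorithm. It assumes that the normalization transform subtracts the mean and divides by the standard deviation, and that similarity is Euclidean distance within a tolerance; the abstract states neither, and the names `normalize`, `euclidean`, `sequential_scan`, and `epsilon` are hypothetical.

```python
import math


def normalize(seq):
    """Assumed normalization transform: zero mean, unit standard deviation."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0:
        # A constant sequence has no fluctuation; map it to all zeros.
        return [0.0] * n
    return [(x - mean) / std for x in seq]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sequential_scan(data, query, epsilon):
    """Return offsets of subsequences of `data` whose normalized form lies
    within `epsilon` of the normalized query (baseline, no index used)."""
    q = normalize(query)
    m = len(query)
    hits = []
    for offset in range(len(data) - m + 1):
        window = normalize(data[offset:offset + m])
        if euclidean(window, q) <= epsilon:
            hits.append(offset)
    return hits


# Illustrative data: the window at offset 2 has the same shape as the
# query but a different scale and level.
query = [1.0, 2.0, 3.0, 2.0]
data = [0.0, 0.0, 10.0, 20.0, 30.0, 20.0, 5.0, 5.0]

raw_distance = euclidean(data[2:6], query)  # large before normalization
matches = sequential_scan(data, query, epsilon=1e-9)
```

Here the subsequence at offset 2 is far from the query in raw values yet coincides with it after normalization, which is the situation the abstract describes. The scan normalizes every window of the query's length, which is the cost that an index is meant to avoid.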

Wook-Shin Han

2007

Existing work on similar sequence matching has focused on either whole matching or range subsequence matching. In this paper, we present novel methods for ranked subsequence matching under time warping, which finds top-k subsequences most similar to a query sequence from data sequences. To the best of our knowledge, this is the first and most sophisticated subsequence matching solution mentioned in the literature. Specifically, we first provide a new notion of the minimum-distance matching-window pair (MDMWP) and formally define the mdmwp-distance, a lower bound between a data subsequence and a query sequence. The mdmwp-distance can be computed prior to accessing the actual subsequence. Based on the mdmwp-distance, we then develop a ranked subsequence matching algorithm to prune unnecessary subsequence accesses. Next, to reduce random disk I/Os and bad buffer utilization, we develop a method of deferred group subsequence retrieval. We then derive another lower bound, the window-group distance, that can be used to effectively prune unnecessary subsequence accesses during deferred group-subsequence retrieval. Through extensive experiments with many data sets, we showcase the superiority of the proposed methods.
