Academia.eduAcademia.edu

AS-index

2009, Proceeding of the 18th ACM conference on Information and knowledge management - CIKM '09

Abstract

AS-Index is a new index structure for exact string search in disk resident databases. It uses hashing, unlike known alternatives, whether baesd on trees or tries. It typically indexes every n-gram in the database, though non-dense indexing is possible. The hash function uses the algebraic signatures of n-grams. Use of hashing provides for constant index access time for arbitrarily long patterns, unlike other structures whose search cost is at best logarithmic. The storage overhead of AS-Index is basically 500 -600%, similar to that of alternatives or smaller.