Small Datum: Avoiding reads in write-optimized index structures

Monday, May 18, 2020

Avoiding reads in write-optimized index structures

This is a summary of features that avoid storage read IO in OSS write-optimized index structures. My scope is limited to MySQL and Tarantool -- I know there are similar features in other DBMS and am happy for comments on that:

RocksDB merge operator logically does read-modify-write but physically does a Put into the memtable and the read is deferred until user queries or compaction. I hope that MyRocks eventually takes advantage of the merge operator.
Read free replication in TokuDB and MyRocks.
Fast Updates in TokuDB do something similar to the merge operator for update and insert
Non-unique secondary index maintenance is read free in MyRocks
Deferred secondary index update in Tarantool is a new approach to avoiding reads for write-heavy workloads. It is worth reading. I wish it had been published by VLDB.
MyRocks also avoids some storage reads when validating unique constraints (for PK and unique secondary indexes) on inserts and some updates because the Get() that it does can use a bloom filter. This benefit is larger when there is a bloom filter on the max level of the LSM tree, but that has other costs is is frequently not done.
MyRocks uses the SingleDelete feature in RocksDB to allow tombstones to be removed earlier which decreases the CPU overhead for reads after a batch of writes.
A b-tree can also reduce index maintenance related storage read IO. See the InnoDB change buffer.

Updates

An incomplete list of relevant research:

DiffIndex - this is more about distributed systems, but still interesting
Techniques for reducing index maintenance overhead in an LSM by Luo & Carey (VLDB 2019)
Deferred Lightweight Indexing (DELI)

5 comments:

Chen LuoMay 18, 2020 at 3:04 PM
The algorithm described by Tarantool is the nearly identical as the previously published EDBT paper at https://tristartom.github.io/docs/edbt14.pdf. We also had a paper at VLDB 2019 that uses a similar idea but exploits a primary key index to cleanup secondary indexes more efficiently (http://www.vldb.org/pvldb/vol12/p531-luo.pdf).
ReplyDelete
Replies
Chen LuoMay 18, 2020 at 3:11 PM
Sorry that I posted the wrong link for the first paper. The correct one should be Deferred Lightweight Indexing for Log-Structured Key-Value Stores (https://ieeexplore.ieee.org/document/7152467)
ReplyDelete
Replies

Add comment

Monday, May 18, 2020

Avoiding reads in write-optimized index structures

5 comments:

Sysbench for MySQL 5.6 through 9.5 on a 2-socket, 24-core server