Support configurable consistency levels #5
Comments
It would probably be better if the consistency level was actually a URL parameter.
If you haven't already seen it, Kyle Kingsbury's article on linearizability and stale reads in Raft systems is a good read on the consequences of relaxing read consistency in Raft-based systems. A single node's local opinion of whether or not it's the leader can and will be faulty during a partition, and check-then-act strategies will suffer from race conditions during partitions. The only way to ensure linearizability is to send queries through the Raft consensus process, where they will be totally ordered along with writes. Both Consul and etcd had issues with stale reads, and now both support strongly consistent queries via URL parameters.
Interesting @codahale -- I did read that article many months ago, but didn't realise the implication that queries through the log might be required to be 100% sure. That said I did just add support for a Now that I think about it, there is probably still room for a race here. Very small, but still possible.
Yup. VerifyLeader is not sufficient. You need to wait for a noop op to be committed by the raft state machine, or block until some other operation commits.
(and since we found this issue experimentally in both etcd and Consul, I'm pretty confident you'll see it in rqlite as well)
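To make the point above concrete, here is a toy Python model of a partitioned leader. It is only a sketch: `Cluster`, `Node`, `read_local_check` and `read_through_log` are hypothetical names invented for illustration, not rqlite or Raft library APIs, and the "commit a no-op" step is reduced to a quorum check.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy replica: a local key-value copy plus a local belief about leadership."""
    name: str
    believes_leader: bool = False
    data: dict = field(default_factory=dict)


class Cluster:
    """Hypothetical stand-in for a Raft cluster (illustration only)."""

    def __init__(self, names):
        self.nodes = {n: Node(n) for n in names}
        # Which peers each node can currently talk to (including itself).
        self.reachable = {n: set(names) for n in names}

    def partition(self, isolated):
        """Cut one node off from all of its peers."""
        for name in self.nodes:
            if name == isolated:
                self.reachable[name] = {isolated}
            else:
                self.reachable[name].discard(isolated)

    def has_quorum(self, name):
        return len(self.reachable[name]) > len(self.nodes) // 2

    def commit(self, leader, key, value):
        """Apply a write on every node the leader can reach, if it has a quorum."""
        if not self.has_quorum(leader):
            raise RuntimeError("no quorum: cannot commit")
        for peer in self.reachable[leader]:
            self.nodes[peer].data[key] = value

    def read_local_check(self, name, key):
        """Check-then-act read: trusts the node's own opinion of leadership."""
        node = self.nodes[name]
        if not node.believes_leader:
            raise RuntimeError("not leader")
        return node.data.get(key)

    def read_through_log(self, name, key):
        """Read that must first 'commit a no-op', modelled here as needing a quorum."""
        if not self.has_quorum(name):
            raise RuntimeError("no quorum: cannot commit no-op")
        return self.nodes[name].data.get(key)


cluster = Cluster(["a", "b", "c"])
cluster.nodes["a"].believes_leader = True
cluster.commit("a", "x", 1)

cluster.partition("a")                      # "a" still believes it is the leader
cluster.nodes["b"].believes_leader = True   # the majority side elects "b"
cluster.commit("b", "x", 2)

stale = cluster.read_local_check("a", "x")  # passes the local check, returns old data
fresh = cluster.read_through_log("b", "x")
try:
    cluster.read_through_log("a", "x")
    blocked = False
except RuntimeError:
    blocked = True                          # the isolated node cannot serve the read
```

In this model the local leader check on the isolated node happily returns the old value, while the read that has to go through the log is refused there and only succeeds on the majority side.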
I'm sure I will. Thanks @aphyr I'll fix up
Sure, but isn't any check-then-act strategy doomed to race conditions anyway, if the whole check-then-act operation is not atomic relative to the raft state machine? For purposes of discussion, it's useful to have a practical example of how to accomplish a given task while avoiding a race. A transaction is a practical means to achieve an atomic operation relative to the raft state machine. For example, it's possible to use a transaction to implement a compare-and-swap operation with rqlite. Simply begin the transaction with an operation that is guaranteed to fail in the event of a race, like inserting a row into a table, such that the insert is guaranteed to fail if a competing transaction is processed first. The transaction will roll back if this first insert fails, guaranteeing that we'll have either a successful atomic operation, or a rollback.
For a read operation, the result is potentially stale as soon as the data is received by the client. So, I don't see any practical utility in having readers block on the raft state machine, unless they get to hold a lock on the state machine until they close their connection. Obviously, write transactions like in the example I've given must block on the raft state machine.
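A minimal sketch of the compare-and-swap idea described above, using Python's built-in `sqlite3` against an in-memory database as a stand-in for the SQL that rqlite would apply through the Raft log. The `kv` and `versions` tables and the `compare_and_swap` helper are assumptions made up for this example; the guard is an insert into a table whose primary key rejects a second claim on the same version.

```python
import sqlite3


def compare_and_swap(conn, key, expected_version, new_value):
    """Set key to new_value only if nobody has moved past expected_version.

    Returns True on success, False if a competing transaction won the race.
    """
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Guard insert: violates the primary key if another writer
            # already claimed version expected_version + 1.
            conn.execute(
                "INSERT INTO versions (key, version) VALUES (?, ?)",
                (key, expected_version + 1),
            )
            conn.execute(
                "UPDATE kv SET value = ? WHERE key = ?", (new_value, key)
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")
conn.execute(
    "CREATE TABLE versions (key TEXT, version INTEGER, PRIMARY KEY (key, version))"
)
conn.execute("INSERT INTO kv VALUES ('config', 'v0')")
conn.execute("INSERT INTO versions VALUES ('config', 0)")
conn.commit()

# Two writers both read version 0 and then try to swap.
first = compare_and_swap(conn, "config", 0, "from-writer-A")
second = compare_and_swap(conn, "config", 0, "from-writer-B")
final_value = conn.execute(
    "SELECT value FROM kv WHERE key = 'config'"
).fetchone()[0]
```

Only the first writer's update is applied; the second writer's guard insert fails, so its transaction rolls back and the stored value is left alone.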
Agreed. Once it hits the client, it could always be out-of-date regardless. What I thought my
Maybe so, but there's an extra level of consistency available if the reader gets to hold a lock on the raft state until it closes its connection. Does that sound interesting @aphyr?
You don't have to hold a lock to provide linearizability. Please see http://www.ics.forth.gr/tech-reports/2013/2013.TR439_Survey_on_Consistency_Conditions.pdf.
That's true. I was just thinking of practical uses that go beyond linearizability. I'm not so sure that read locks are really desirable anyway.
rqlite now supports 3 different levels of read consistency -- none, soft, and hard. The first just goes to the local SQLite file, soft does a local leader check before reading the local SQLite file, and hard sends the query request through the raft consensus mechanism. I think this addresses this issue, let me know if I am mistaken.
soft is the default.
Actually, moved to "weak" and "strong".
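For illustration, the three levels could be dispatched roughly as below. This is a sketch, not rqlite's code: the `level` query-parameter name, the example URLs, and the `is_leader`, `read_local` and `read_via_consensus` callables are all assumptions introduced here.

```python
from urllib.parse import parse_qs, urlparse

LEVELS = ("none", "weak", "strong")
DEFAULT_LEVEL = "weak"  # "soft" was the default before the rename to "weak"


def consistency_level(url):
    """Extract the requested read consistency from a request URL.

    Assumes a hypothetical ?level=... query parameter.
    """
    params = parse_qs(urlparse(url).query)
    level = params.get("level", [DEFAULT_LEVEL])[0].lower()
    if level not in LEVELS:
        raise ValueError(f"unknown consistency level: {level!r}")
    return level


def execute_read(level, is_leader, read_local, read_via_consensus):
    """Run a read at the given level using caller-supplied callables."""
    if level == "none":
        # Read the local SQLite file with no checks at all.
        return read_local()
    if level == "weak":
        # Local leader check, then a local read; a small race window remains.
        if not is_leader():
            raise RuntimeError("not leader")
        return read_local()
    # "strong": send the query through the consensus mechanism.
    return read_via_consensus()


requested = consistency_level("http://localhost:4001/db/query?level=strong")
defaulted = consistency_level("http://localhost:4001/db/query?q=SELECT%201")
result = execute_read(requested, lambda: True, lambda: "local", lambda: "via-log")
```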
rqlite should support the following levels of consistency for reads:
These levels of consistency should be chosen by a switch at the command-line.