Conversation
| pub durable_lsn: Option<Lsn>, | ||
| #[bilrost(11)] | ||
| pub last_archived_log_lsn: Option<Lsn>, | ||
| pub archived_lsn: Option<Lsn>, |
There was a problem hiding this comment.
I suspect those renames might cause deserialization errors for V1 protocol. @pcholakov did you validate backward compatibility?
There was a problem hiding this comment.
I did test a mixed cluster running main and my own changes (only after your fix for gossip went in), and yes, it does work. I believe we never serialize this struct using FlexBuffers or JSON, where this could be a problem. I'd like to do some more though, I've just confirmed that nodes form a cluster, and that all nodes start up partitions - and that all nodes can see the applied + durable LSNs from their peers.
There was a problem hiding this comment.
Let's Remove Serialize/Deseralize and see what breaks then.
There was a problem hiding this comment.
Silly me! This is used in NodeStateResponse, which is sent in response to Cluster Controller's GetNodeState. My testing was only using restatectl to talk to servers but a V1 admin node does not see these renamed attributes properly. Reverted!
| /// Last persisted log LSN | ||
| persisted_log_lsn: DataType::UInt64, | ||
| /// Durable log LSN | ||
| durable_log_lsn: DataType::UInt64, |
There was a problem hiding this comment.
This is a breaking change. This will change the sql schema
There was a problem hiding this comment.
Maybe check with @igalshilman what is the best approach to update this. Do we keep both names and mark the old name as deprecated, and keep it in code for couple of versions before removing ?
There was a problem hiding this comment.
@muhamadazmy we try not to do that, but we also have a versioning mechanism.
Both the CLI and the restate admin api, decelerate versions. CLI via the agent / server strings.
It happens here in the CLI https://github.com/restatedev/restate/blob/main/cli/src/clients/admin_client.rs#L150
AFIK @jackkleeman introduced this mechanism in both ends.
There was a problem hiding this comment.
I should have been more careful with this; I knew I am renaming the column in the view but switched it anyway. You're right that we should keep this as-is - reverted!
PS. I believe the versioning mechanism is purely about query data encoding, rather than the schema itself. At some point we might want to think about how we version the schema too, but I don't want to open that right now.
|
btw, i noticed there are still a few config flags related to the old persisted lsn watchdog? like the interval |
c0af645 to
f25d18d
Compare
@jackkleeman I've deprecated these in #3278 - one could argue we could just as well drop them, but I figured let's give folks a heads-up :-) |
f25d18d to
217d8bd
Compare
I'm not sure if this is necessary. I'd remove them from new config dumps and only keep them in shadow to give a notice. Additionally, can you send a PR to update the relevant documentation to align with these changes please? |
| /// Durable log LSN (previously, persisted log LSN) | ||
| persisted_log_lsn: DataType::UInt64, |
There was a problem hiding this comment.
Should we double-write to the new name so we can remove the old one eventually?
There was a problem hiding this comment.
Done, great suggestion!
muhamadazmy
left a comment
There was a problem hiding this comment.
Thank you @pcholakov for this PR. I agree with Ahmed that we probably should double-write the new name in the schema and update the docs of the older one as deprecated so we can drop it in the future. Other than that, it's good to go
99aecea to
f1b5d9d
Compare
Done! Thanks for the suggestion, I'll update the docs too 😃 |
Already done in #3278! |
This change renames "persisted LSN" to "durable LSN" to reflect partition data which has been durably committed to local disk by a node.
This is a breaking change for the following metrics but I feel this is justified; happy to undo if there are any concerns:
restate.partition.last_applied_lsn->restate.partition.applied_lsnrestate.partition.applied_lsn_lag->restate.partition.applied_lsn_lagrestate.partition.last_persisted_lsn->restate.partition.durable_lsnAdditionally:
restatectlcolumn display to "Durable LSN"NodeStateResponseRPC message (response toGetNodeState, used by Custer Controller) using the old naming as we still have a FlexBuffers-basedWireEncodefallback implementation for these; this will be safe to rename when we drop V1 networking supportdurable_log_lsncolumn to thepartition_stateview, mirroringpersisted_log_lsn(which will be separately documented as deprecated)Fixes: #3252