-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Use crc32 for sqlite page checksums #1582
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The MacOS CI failure is unrelated to this change. |
|
It's worth discussing that this will break any potential for rolling back FDB6.2 to FDB6.1. Work was done on the TLogs to lay out the plan for how forward compatibility of on-disk state would work there. I'm not actually sure what our larger plans here are, as I don't think we've filed or discussed officially declaring rollback support and writing the tests for it. Rollback support for TLogs was largely done as something important to Snowflake, IIRC. |
|
@alexmiller-apple Regarding downgrades, at the very least going from FDB 6.2 to the only the latest FDB 6.1 could be made possible via a 6.1 patch release that includes the new checksum logic. |
|
I agree with @alexmiller-apple that not being to roll back is problematic... I think what @satherton suggests is the minimum. But even that is not great as it means that a rollback from 6.1.x to 6.1.(x-1) won't be possible for some value of However, 6.1 is not released yet? Could we patch FDB 6.1 so that it can read Crc32 but it only writes hashlittle2? If that is not possible anymore we imho should not write Crc32 in 6.2 and wait with that until 6.3. People who don't care about downgrades can then set a knob to enable Crc32 on 6.2 |
This resolves the sqlite component of issue #1358.
All new pages are written with a CRC32 checksum as the last 4 bytes. The prior 4 bytes are set to 0 to indicate that CRC32 was used. When verifying the checksum of a page, these 4 bytes are read. If the value is non-zero, the page must have been written with
hashlittle2, so this is the only checksum checked. If these bytes are 0, CRC32 was probably used, so this checksum is checked first. (1/2^32) of pages written withhashlittle2will also have 0 in these 4 bytes, sohashlittle2is computed if the CRC32 checksum does not match. This is the implementation proposed here.This change has been benchmarked on an insert-only workload where throughput increased by 11.7%.