Snapshot storage on S3 via SnapshotManager abstraction#3430
Snapshot storage on S3 via SnapshotManager abstraction#3430mogery wants to merge 30 commits intoqdrant:devfrom
Conversation
|
alright, everything is working on local! i have confidence i can build this now. onto S3 |
|
I'm opening this as ready for review, but take it more as "request for comments". Everything seems to work after messing around a bit, and all the tests run, but I haven't tested yet for any accidental breaking API or backwards compatibility issues. Concerns/notes to self:
|
|
Hey @mogery, thanks for the PR! |
Hey, it's sort of hard to split, since it's an architectural change, and I can't really break it down without breaking build. I could break it down into two: one that just switches everything to use SnapshotManager, and another one which adds S3 support to it, but I'm not sure how much that would help. |
|
Gonna take a look at the failing tests soon and fix 'em up. |
|
sorry, sent that message right before realizing that i was running the tests in the wrong directory... anyways, i've fixed everything up now. all tests run for me on local both with the local backend and with S3. hopefully this time it all passes on your side too |
|
i've also verified that nothing is present in the local snapshot folder (at idle, temp files are still created now and then) when using S3 |
|
I'm very confused. All the integration tests run successfully on my local machine. Here's a log. Exit code is reliably 0 for me. EDIT: seems like i'm running the wrong test actually... |
|
Another round of fixes done. Ran even more tests on local, everything is coming back green. |
|
Heh, I didn't see that that one failed, because I couldn't see the exit code. Anyways, should be fixed. Thanks for bearing with me. |
|
I'll also be running the whole test suite while mounted to S3 tomorrow. |
|
hi there @generall, any updates on reviewing this? |
|
hey @mogery, review of that big pr requires about 4 hours uninterrupted timeframe. If possible, could you please split it into smaller pieces? |
|
@generall Closing this PR and splitting it into two. |
|
Superseded by #3520 and [TBD] |
Summary: This PR removes unneeded ARM NEON SIMD instructions for ScalarQuantizer. The removed instructions are completely redundant, and I believe that it is a funky way of converting two `float32x4_t` variables (which hold 4 float values in a single SIMD register) into a single `float32x4x2_t` variable (two SIMD registers packed together). Clang compiler is capable of eliminating these instructions. The only GCC that can eliminate these unneeded instructions is GCC 14, which was released very recently (Apr-May 2024). mdouze Pull Request resolved: facebookresearch/faiss#3430 Reviewed By: mlomeli1 Differential Revision: D57369849 Pulled By: mdouze fbshipit-source-id: 09d7cf16e113df3eb9ddbfa54d074b58b452ba7f

Fixes #3324
/claim #3324
This PR adds a new package,
snapshot_manager, which houses theSnapshotManagerstruct. TheSnapshotManagerabstracts (SnapshotFile) the filesystem-dependent parts of snapshots away from the rest of the codebase, which allows us to write two "backends" for snapshot storage: local and S3.The local backend is 99% ported over from the existing code. Most of the trouble is with adjusting the existing code to use this new interface.
The S3 backend is built with the
rust-s3crate, which I've found to be the most reliable. It's battle-tested by having a million all-time downloads, it supports non-AWS S3-compatible file storage services, etc.This is a biiiig PR which changes a considerable amount of code structure. Let me know if you like it or not. I'll be doing a larger sweep of all my changes to catch structure mistakes once the todo's finished.
To-do
Checklist
All Submissions:
devbranch. Did you create your branch fromdev?New Feature Submissions:
cargo +nightly fmt --allcommand prior to submission?cargo clippy --all --all-featurescommand?Changes to Core Features: