Return error instead of panicking for corrupted aliases file on startup#8293
Conversation
f16626d to
be296b2
Compare
There was a problem hiding this comment.
Pull request overview
This PR changes Qdrant startup behavior to return a regular error (instead of panicking) when the persisted aliases JSON is corrupted/invalid, addressing #5004.
Changes:
- Make
TableOfContent::newfallible (Result) and propagate alias persistence load errors with a user-facing message. - Update Qdrant startup (
main) and affected tests to handle the newResultreturn type. - Minor import reordering in snapshot API.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
src/main.rs |
Propagates TableOfContent::new failures via ? during startup. |
src/consensus.rs |
Updates test code to unwrap the now-fallible ToC constructor. |
src/actix/api/snapshot_api.rs |
Consolidates a couple of imports (no functional change). |
lib/storage/tests/integration/alias_tests.rs |
Updates integration test to unwrap the new Result from ToC construction. |
lib/storage/src/content_manager/toc/mod.rs |
Makes TableOfContent::new return Result and converts alias open failure from panic to Err. |
Comments suppressed due to low confidence (1)
lib/storage/src/content_manager/toc/mod.rs:115
TableOfContent::newnow returnsResult, but it still usesexpect(...)for filesystem operations (e.g. creating the collections/temp dirs). Those will still panic on startup (permissions, readonly FS, etc.), which is inconsistent with the new fallible constructor. Consider replacing theseexpects with error propagation (?) and mapping intoStorageErrorso startup failures are reported as regular errors instead of panics.
let collections_path = storage_config.storage_path.join(COLLECTIONS_DIR);
fs::create_dir_all(&collections_path).expect("Can't create Collections directory");
if let Some(path) = storage_config.temp_path.as_deref() {
fs::create_dir_all(path).expect("Can't create temporary files directory");
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
| let alias_persistence = AliasPersistence::open(&alias_path).map_err(|err| { | ||
| StorageError::service_error(format!( | ||
| "Failed to open aliases file at {}: {err}. \ | ||
| Please verify the file contains valid JSON, \ | ||
| then restart Qdrant.", | ||
| alias_path.display(), |
There was a problem hiding this comment.
The error message says "aliases file at {alias_path}", but alias_path here is the aliases directory (.../aliases), while the JSON that fails to parse is .../aliases/data.json (see AliasPersistence::get_config_path). Pointing to the actual file path would make the startup error much more actionable.
| let alias_persistence = AliasPersistence::open(&alias_path).map_err(|err| { | |
| StorageError::service_error(format!( | |
| "Failed to open aliases file at {}: {err}. \ | |
| Please verify the file contains valid JSON, \ | |
| then restart Qdrant.", | |
| alias_path.display(), | |
| let alias_config_path = AliasPersistence::get_config_path(&alias_path); | |
| let alias_persistence = AliasPersistence::open(&alias_path).map_err(|err| { | |
| StorageError::service_error(format!( | |
| "Failed to open aliases file at {}: {err}. \ | |
| Please verify the file contains valid JSON, \ | |
| then restart Qdrant.", | |
| alias_config_path.display(), |
| let alias_path = storage_config.storage_path.join(ALIASES_PATH); | ||
| let alias_persistence = AliasPersistence::open(&alias_path) | ||
| .expect("Can't open database by the provided config"); | ||
| let alias_persistence = AliasPersistence::open(&alias_path).map_err(|err| { | ||
| StorageError::service_error(format!( | ||
| "Failed to open aliases file at {}: {err}. \ | ||
| Please verify the file contains valid JSON, \ | ||
| then restart Qdrant.", | ||
| alias_path.display(), | ||
| )) | ||
| })?; |
There was a problem hiding this comment.
This change introduces new behavior for corrupted aliases JSON (startup should error instead of panicking), but there is no regression test covering it. Adding an integration test that writes invalid JSON to aliases/data.json and asserts TableOfContent::new(...) returns Err (and does not panic) would help prevent this from reappearing.
be296b2 to
9c0bcfe
Compare
9c0bcfe to
0b6fa4c
Compare
And drop FileStorageError in favor of std::io::Error, since we always convert all kinds of errors into ServiceError anyway.
Also, drop context strings. We use fs_err anyway, that should be enough.
0b6fa4c to
2ec7b75
Compare
|
Hi. Thanks for digging it. I believe this should be solved on a different level. I.e., I've pushed an update to your branch implementing this. |
|
Good call, much better to handle it at that level. Thanks for pushing the changes! |
…up (#8293) * Return error instead of panicking for corrupted aliases file on startup * common::fs::ops: provide file name in error messages And drop FileStorageError in favor of std::io::Error, since we always convert all kinds of errors into ServiceError anyway. * TableOfContent::new: return errors instead of panics Also, drop context strings. We use fs_err anyway, that should be enough. --------- Co-authored-by: leohenon <[email protected]> Co-authored-by: xzfc <[email protected]>
Replace the startup panic/backtrace with a clean error message when the aliases file contains invalid JSON, by having
TableOfContent::newreturnResultlikePersistent::load_or_init.Closes #5004