Skip to content

Return error instead of panicking for corrupted aliases file on startup#8293

Merged
timvisee merged 3 commits into
qdrant:devfrom
leohenon:fix/aliases-json-panic-5004
Apr 10, 2026
Merged

Return error instead of panicking for corrupted aliases file on startup#8293
timvisee merged 3 commits into
qdrant:devfrom
leohenon:fix/aliases-json-panic-5004

Conversation

@leohenon

@leohenon leohenon commented Mar 5, 2026

Copy link
Copy Markdown
Contributor

Replace the startup panic/backtrace with a clean error message when the aliases file contains invalid JSON, by having TableOfContent::new return Result like Persistent::load_or_init.

Closes #5004

Copilot AI review requested due to automatic review settings March 5, 2026 00:47
@leohenon leohenon force-pushed the fix/aliases-json-panic-5004 branch from f16626d to be296b2 Compare March 5, 2026 00:50

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR changes Qdrant startup behavior to return a regular error (instead of panicking) when the persisted aliases JSON is corrupted/invalid, addressing #5004.

Changes:

  • Make TableOfContent::new fallible (Result) and propagate alias persistence load errors with a user-facing message.
  • Update Qdrant startup (main) and affected tests to handle the new Result return type.
  • Minor import reordering in snapshot API.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/main.rs Propagates TableOfContent::new failures via ? during startup.
src/consensus.rs Updates test code to unwrap the now-fallible ToC constructor.
src/actix/api/snapshot_api.rs Consolidates a couple of imports (no functional change).
lib/storage/tests/integration/alias_tests.rs Updates integration test to unwrap the new Result from ToC construction.
lib/storage/src/content_manager/toc/mod.rs Makes TableOfContent::new return Result and converts alias open failure from panic to Err.
Comments suppressed due to low confidence (1)

lib/storage/src/content_manager/toc/mod.rs:115

  • TableOfContent::new now returns Result, but it still uses expect(...) for filesystem operations (e.g. creating the collections/temp dirs). Those will still panic on startup (permissions, readonly FS, etc.), which is inconsistent with the new fallible constructor. Consider replacing these expects with error propagation (?) and mapping into StorageError so startup failures are reported as regular errors instead of panics.
        let collections_path = storage_config.storage_path.join(COLLECTIONS_DIR);
        fs::create_dir_all(&collections_path).expect("Can't create Collections directory");
        if let Some(path) = storage_config.temp_path.as_deref() {
            fs::create_dir_all(path).expect("Can't create temporary files directory");
        }

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment on lines +209 to +214
let alias_persistence = AliasPersistence::open(&alias_path).map_err(|err| {
StorageError::service_error(format!(
"Failed to open aliases file at {}: {err}. \
Please verify the file contains valid JSON, \
then restart Qdrant.",
alias_path.display(),

Copilot AI Mar 5, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message says "aliases file at {alias_path}", but alias_path here is the aliases directory (.../aliases), while the JSON that fails to parse is .../aliases/data.json (see AliasPersistence::get_config_path). Pointing to the actual file path would make the startup error much more actionable.

Suggested change
let alias_persistence = AliasPersistence::open(&alias_path).map_err(|err| {
StorageError::service_error(format!(
"Failed to open aliases file at {}: {err}. \
Please verify the file contains valid JSON, \
then restart Qdrant.",
alias_path.display(),
let alias_config_path = AliasPersistence::get_config_path(&alias_path);
let alias_persistence = AliasPersistence::open(&alias_path).map_err(|err| {
StorageError::service_error(format!(
"Failed to open aliases file at {}: {err}. \
Please verify the file contains valid JSON, \
then restart Qdrant.",
alias_config_path.display(),

Copilot uses AI. Check for mistakes.
Comment on lines +208 to +216
let alias_path = storage_config.storage_path.join(ALIASES_PATH);
let alias_persistence = AliasPersistence::open(&alias_path)
.expect("Can't open database by the provided config");
let alias_persistence = AliasPersistence::open(&alias_path).map_err(|err| {
StorageError::service_error(format!(
"Failed to open aliases file at {}: {err}. \
Please verify the file contains valid JSON, \
then restart Qdrant.",
alias_path.display(),
))
})?;

Copilot AI Mar 5, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change introduces new behavior for corrupted aliases JSON (startup should error instead of panicking), but there is no regression test covering it. Adding an integration test that writes invalid JSON to aliases/data.json and asserts TableOfContent::new(...) returns Err (and does not panic) would help prevent this from reappearing.

Copilot uses AI. Check for mistakes.
coderabbitai[bot]

This comment was marked as resolved.

@leohenon leohenon force-pushed the fix/aliases-json-panic-5004 branch from be296b2 to 9c0bcfe Compare March 5, 2026 00:56
@leohenon leohenon force-pushed the fix/aliases-json-panic-5004 branch from 9c0bcfe to 0b6fa4c Compare April 10, 2026 03:36
coderabbitai[bot]

This comment was marked as resolved.

leohenon and others added 3 commits April 10, 2026 07:08
And drop FileStorageError in favor of std::io::Error, since we always
convert all kinds of errors into ServiceError anyway.
Also, drop context strings. We use fs_err anyway, that should be enough.
@xzfc xzfc force-pushed the fix/aliases-json-panic-5004 branch from 0b6fa4c to 2ec7b75 Compare April 10, 2026 07:17
@xzfc

xzfc commented Apr 10, 2026

Copy link
Copy Markdown
Member

Hi. Thanks for digging it.

I believe this should be solved on a different level. I.e., common::fs::ops::read_json (and other similar functions) should return affected file path in their errors; so all callers will be covered, not just this particular usage of aliases.json.

I've pushed an update to your branch implementing this.

@xzfc xzfc requested a review from timvisee April 10, 2026 07:25
@leohenon

Copy link
Copy Markdown
Contributor Author

Good call, much better to handle it at that level. Thanks for pushing the changes!

@qdrant qdrant deleted a comment from coderabbitai Bot Apr 10, 2026
@timvisee timvisee merged commit f5625fd into qdrant:dev Apr 10, 2026
15 checks passed
@leohenon leohenon deleted the fix/aliases-json-panic-5004 branch April 10, 2026 11:29
timvisee pushed a commit that referenced this pull request May 8, 2026
…up (#8293)

* Return error instead of panicking for corrupted aliases file on startup

* common::fs::ops: provide file name in error messages

And drop FileStorageError in favor of std::io::Error, since we always
convert all kinds of errors into ServiceError anyway.

* TableOfContent::new: return errors instead of panics

Also, drop context strings. We use fs_err anyway, that should be enough.

---------

Co-authored-by: leohenon <[email protected]>
Co-authored-by: xzfc <[email protected]>
@timvisee timvisee mentioned this pull request May 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants