Skip to content

nix nario list improvements#232

Merged
edolstra merged 8 commits intomainfrom
export-import
Oct 20, 2025
Merged

nix nario list improvements#232
edolstra merged 8 commits intomainfrom
export-import

Conversation

@edolstra
Copy link
Collaborator

@edolstra edolstra commented Oct 17, 2025

Motivation

  • Makes it much faster by seeking over NARs (from 7.42s to 0.42s on a cold page cache for a 15 GB system closure).
  • Adds a --json flag.
  • Shows NAR contents. With --json, it shows information in the same format as nix store ls --json (i.e. the NAR listing format).

Context

Summary by CodeRabbit

  • New Features

    • nario list: full JSON output with detailed per-path metadata and optional per-NAR contents.
  • Documentation

    • nario list docs updated with JSON usage and examples.
  • Bug Fixes

    • Safer input skipping with seek-when-possible and robust read-based fallback for non-seekable streams.
    • Dump command now errors when writing archive output to a terminal.
    • Error messages truncated to avoid excessively long tags.
  • Tests

    • Added JSON-based tests validating deriver, contents structure, and no-contents variant.
  • Other

    • Added ability to skip parsing/writing file contents when not needed.

@coderabbitai
Copy link

coderabbitai bot commented Oct 17, 2025

Walkthrough

Adds a public Source::skip(size_t) API with an FdSource override and isSeekable flag; enables JSON accumulation/output for nario list; prevents dumping NAR to a TTY via getNarSink(); adds skip-contents handling in archive and filesystem sinks; updates docs and tests to validate JSON/deriver behavior.

Changes

Cohort / File(s) Summary
Serialization interface & implementations
src/libutil/include/nix/util/serialise.hh, src/libutil/serialise.cc
Added public virtual Source::skip(size_t) with default read-discard implementation; added bool FdSource::isSeekable = true and implemented FdSource::skip(size_t) to drop buffered bytes, attempt lseek forward (mark ESPIPE as non-seekable), and fall back to read-based skipping. Minor local refactors (use of auto, std::min).
NARIO list command (JSON output)
src/nix/nario.cc
CmdNarioList now mixes in MixJSON; ListingStore optionally accumulates std::optional<nlohmann::json> json and listContents; added listNar(Source) and renderNarListing; per-path JSON built via info.toJSON(...) when JSON mode active; final JSON {version, paths} emitted.
Documentation
src/nix/nario-list.md
Example updated to dump.nario. Added --json usage with sample JSON structure showing per-path fields and top-level version.
Dump NAR to stdout guard
src/nix/dump-path.cc
Added internal getNarSink() that errors if stdout is a TTY and returns an FdSink otherwise; replaced direct getStandardOutput() uses in CmdDumpPath::run and CmdDumpPath2::run with getNarSink().
Archive parsing: skip contents & error truncation
src/libutil/archive.cc
When sink.skipContents is true, parseContents skips remaining bytes (including padding) and returns early. Error messages truncate overly long tags to 1024 chars.
Filesystem sink: skip-contents support
src/libutil/fs-sink.cc, src/libutil/include/nix/util/fs-sink.hh
Added bool skipContents = false; to CreateRegularFileSink; NullFileSystemObjectSink::createRegularFile sets crf.skipContents = true so callbacks run but actual content handling is skipped.
Tests
tests/functional/export.sh
Captures drvPath via nix path-info --deriver; expands assertions to validate JSON output: deriver matches, contents structure/type/size checks, and a no-contents JSON variant check.

Sequence Diagram(s)

sequenceDiagram
    participant User as CLI/User
    participant CmdNarioList
    participant ListingStore
    participant JSON as nlohmann::json

    Note over User,JSON #E8F5E9: nario list --json flow
    User->>CmdNarioList: invoke `nario list --json`
    CmdNarioList->>ListingStore: init optional json accumulator
    loop per path
        ListingStore->>ListingStore: build per-path JSON (info.toJSON)
        ListingStore->>ListingStore: attach contents if collected
        ListingStore->>ListingStore: store under paths[key]
    end
    ListingStore->>CmdNarioList: return accumulated json
    CmdNarioList->>JSON: wrap with { "version", "paths": ... }
    CmdNarioList->>User: print JSON output
Loading
sequenceDiagram
    participant Caller as Reader
    participant Source
    participant FdSource
    participant OS

    Note over Caller,OS #FFF3E0: Source::skip and FdSource override
    Caller->>Source: request skip(len)
    alt Source default
        Source->>Source: loop read into temp buffer until len consumed
    end
    opt FdSource override
        Caller->>FdSource: FdSource::skip(len)
        FdSource->>FdSource: drop from internal buffer (bufPosIn/bufPosOut)
        alt remaining > 0 and isSeekable
            FdSource->>OS: lseek forward by remaining len
            OS-->>FdSource: success / error (ESPIPE -> mark non-seekable)
        else
            FdSource->>Source: fall back to read-discard (Source::skip)
        end
    end
    FdSource->>Caller: return after skipping requested bytes
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 I nibble bytes and gently skip,
I hop past buffers, take a little skip,
JSON petals on nario's tree,
Deriver found and shown to me.
Hooray for tidy hops and tidy code! 🥕✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "nix nario list improvements" is related to the changeset and conveys that the PR improves the nix nario list command. The changes do indeed introduce improvements to this command, including seek-based optimization (reducing runtime from 7.42s to 0.42s for large closures) and new JSON output support via a --json flag. However, the title is overly broad and generic—the word "improvements" does not specify the nature of the changes, so a teammate scanning history would not immediately understand whether the PR addresses performance, adds features, fixes bugs, or a combination thereof. Despite this generality, the title falls within the criteria of being "partially related" to the changeset in that it refers to real and meaningful changes being made.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch export-import

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a582f17 and 107a93e.

📒 Files selected for processing (1)
  • src/libutil/serialise.cc (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/libutil/serialise.cc (1)
src/libutil/include/nix/util/serialise.hh (3)
  • buf (330-339)
  • len (101-101)
  • len (206-206)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build_aarch64-darwin / build
  • GitHub Check: build_x86_64-linux / build
  • GitHub Check: eval
🔇 Additional comments (1)
src/libutil/serialise.cc (1)

113-121: LGTM: Base skip implementation is correct.

The fallback implementation correctly reads and discards data in chunks, and the read() call ensures that byte counters are properly maintained in derived classes.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link

github-actions bot commented Oct 17, 2025

@github-actions github-actions bot temporarily deployed to pull request October 17, 2025 12:26 Inactive
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (3)
src/nix/nario.cc (1)

160-164: Remove stray semicolon.

Superfluous ; after toJSON is harmless but noisy.

-                    auto obj = info.toJSON(*this, true, HashFormat::SRI);
-                    ;
+                    auto obj = info.toJSON(*this, true, HashFormat::SRI);
tests/functional/export.sh (1)

52-54: Add a pipe-based list test (non‑seekable stdin).

FdSource::skip should fall back when stdin is a pipe. Add a regression to cover cat file | nix nario list.

Would you like me to submit a patch that asserts cat "$TEST_ROOT/exp_all" | nix nario list --json succeeds?

src/libutil/include/nix/util/serialise.hh (1)

100-102: Public skip API looks good; ensure semantics documented.

Consider a brief comment: “Skips len bytes, using seek when available; otherwise reads and discards.”

Also applies to: 205-206

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7dc9416 and b3b9726.

📒 Files selected for processing (5)
  • src/libutil/include/nix/util/serialise.hh (2 hunks)
  • src/libutil/serialise.cc (4 hunks)
  • src/nix/nario-list.md (1 hunks)
  • src/nix/nario.cc (5 hunks)
  • tests/functional/export.sh (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/nix/nario.cc (1)
src/libstore/export-import.cc (2)
  • importPaths (72-156)
  • importPaths (72-72)
src/libutil/serialise.cc (1)
src/libutil/include/nix/util/serialise.hh (3)
  • buf (329-338)
  • len (101-101)
  • len (205-205)
🪛 Shellcheck (0.11.0)
tests/functional/export.sh

[warning] 65-65: Quote the right-hand side of = in [[ ]] to prevent glob matching.

(SC2053)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build_aarch64-darwin / build
  • GitHub Check: build_x86_64-linux / build
🔇 Additional comments (2)
src/libutil/serialise.cc (1)

113-121: Base Source::skip implementation is fine.

Buffered discard loop is straightforward and safe.

src/nix/nario-list.md (1)

14-37: JSON example: verify fields and self-reference in references.

The sample shows the path referencing itself and fields like ultimate. Confirm this matches ValidPathInfo::toJSON output; otherwise adjust the example (e.g., drop self-reference, include only actual references).

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (3)
tests/functional/export.sh (2)

10-10: Fix jq selector for array output.

nix path-info --json returns an array, not an object keyed by path. The current selector will fail at runtime.

Apply this diff:

-drvPath=$(nix path-info --json "$outPath" | jq -r .\""$outPath"\".deriver)
+drvPath=$(nix path-info --json "$outPath" | jq -r '.[0].deriver // empty')

65-65: Quote command substitution and RHS variable.

Unquoted values in the comparison can cause glob matching and word-splitting issues.

Apply this diff:

-[[ $(nix nario list --json < "$TEST_ROOT/exp_all" | jq -r ".paths.\"$outPath\".deriver") = $drvPath ]]
+[[ "$(nix nario list --json < "$TEST_ROOT/exp_all" | jq -r ".paths.\"$outPath\".deriver")" = "$drvPath" ]]
src/nix/nario.cc (1)

165-165: Use 64-bit format specifier for narSize.

%d will truncate large NAR sizes. Use PRIu64 or %zu depending on the type of narSize.

Apply this diff and add the include:

+#include <inttypes.h>
-                    logger->cout(fmt("%s: %d bytes", printStorePath(info.path), info.narSize));
+                    logger->cout(fmt("%s: %" PRIu64 " bytes", printStorePath(info.path), (uint64_t)info.narSize));
🧹 Nitpick comments (1)
src/nix/nario.cc (1)

162-162: Remove extraneous semicolon.

There's an unnecessary semicolon after the toJSON() call that should be removed for cleaner code.

Apply this diff:

-                    auto obj = info.toJSON(*this, true, HashFormat::SRI);
-                    ;
+                    auto obj = info.toJSON(*this, true, HashFormat::SRI);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b3b9726 and a70174c.

📒 Files selected for processing (5)
  • src/libutil/include/nix/util/serialise.hh (3 hunks)
  • src/libutil/serialise.cc (4 hunks)
  • src/nix/nario-list.md (1 hunks)
  • src/nix/nario.cc (5 hunks)
  • tests/functional/export.sh (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/nix/nario-list.md
  • src/libutil/include/nix/util/serialise.hh
  • src/libutil/serialise.cc
🧰 Additional context used
🧬 Code graph analysis (1)
src/nix/nario.cc (2)
src/libstore/export-import.cc (2)
  • importPaths (72-156)
  • importPaths (72-72)
src/libstore/include/nix/store/export-import.hh (1)
  • importPaths (17-17)
🪛 Shellcheck (0.11.0)
tests/functional/export.sh

[warning] 65-65: Quote the right-hand side of = in [[ ]] to prevent glob matching.

(SC2053)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build_aarch64-darwin / build
  • GitHub Check: build_x86_64-linux / build
🔇 Additional comments (4)
src/nix/nario.cc (4)

9-9: LGTM!

JSON include is properly placed and necessary for the new JSON output functionality.


103-103: LGTM!

Adding MixJSON enables the --json flag as described in the PR objectives.


134-134: LGTM!

Using std::optional for the JSON accumulator is appropriate since it's only needed when --json is specified.


201-209: LGTM!

The JSON accumulation and output logic is well-structured. Initializing the accumulator when --json is set and emitting a versioned JSON object after processing is the correct approach.

@github-actions github-actions bot temporarily deployed to pull request October 17, 2025 14:00 Inactive
@github-actions github-actions bot temporarily deployed to pull request October 17, 2025 14:26 Inactive
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (3)
tests/functional/export.sh (2)

10-10: Fix jq selector for deriver (past review comment remains unresolved).

This issue was already flagged in the prior review: nix path-info --json returns an array, not an object keyed by path. The jq selector will fail to extract the deriver value, leaving drvPath empty and causing the subsequent validation at line 65 to fail unpredictably.

Apply the fix from the prior review:

-drvPath=$(nix path-info --json "$outPath" | jq -r .\""$outPath"\".deriver)
+drvPath=$(nix path-info --json "$outPath" | jq -r '.[0].deriver // empty')

64-65: Quote RHS and command substitution (past review comment remains unresolved).

This issue was already flagged in the prior review. Shellcheck correctly warns that the RHS ($drvPath) must be quoted to prevent glob matching (SC2053), and the command substitution should also be quoted to prevent word-splitting.

Apply the fix from the prior review:

-[[ $(nix nario list --json < "$TEST_ROOT/exp_all" | jq -r ".paths.\"$outPath\".deriver") = $drvPath ]]
+[[ "$(nix nario list --json < "$TEST_ROOT/exp_all" | jq -r ".paths.\"$outPath\".deriver")" = "$drvPath" ]]
src/nix/nario.cc (1)

165-165: Use 64-bit format specifier for narSize.

The %d format will truncate large NAR sizes (> 2 GB). NAR size is uint64_t, so use PRIu64.

Apply this diff (as suggested in the past review):

+#include <inttypes.h>

And:

-                    logger->cout(fmt("%s: %d bytes", printStorePath(info.path), info.narSize));
+                    logger->cout(fmt("%s: %" PRIu64 " bytes",
+                        printStorePath(info.path), (uint64_t) info.narSize));
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a70174c and 5d6fcd9.

📒 Files selected for processing (6)
  • src/libutil/include/nix/util/serialise.hh (3 hunks)
  • src/libutil/serialise.cc (4 hunks)
  • src/nix/dump-path.cc (3 hunks)
  • src/nix/nario-list.md (1 hunks)
  • src/nix/nario.cc (5 hunks)
  • tests/functional/export.sh (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/nix/nario-list.md
🧰 Additional context used
🧬 Code graph analysis (3)
src/libutil/serialise.cc (1)
src/libutil/include/nix/util/serialise.hh (3)
  • buf (330-339)
  • len (101-101)
  • len (206-206)
src/nix/dump-path.cc (3)
src/libutil/include/nix/util/file-descriptor.hh (1)
  • getStandardOutput (127-127)
src/libutil/include/nix/util/serialise.hh (10)
  • FdSink (143-146)
  • FdSink (148-151)
  • FdSink (153-153)
  • FdSink (164-164)
  • sink (97-97)
  • sink (447-460)
  • sink (447-447)
  • sink (463-463)
  • sink (464-464)
  • sink (465-465)
src/libutil/serialise.cc (9)
  • FdSink (54-61)
  • sink (372-376)
  • sink (372-372)
  • sink (386-390)
  • sink (386-386)
  • sink (392-396)
  • sink (392-392)
  • sink (398-409)
  • sink (398-398)
src/nix/nario.cc (2)
src/libstore/export-import.cc (2)
  • importPaths (72-156)
  • importPaths (72-72)
src/libstore/include/nix/store/export-import.hh (1)
  • importPaths (17-17)
🪛 Shellcheck (0.11.0)
tests/functional/export.sh

[warning] 65-65: Quote the right-hand side of = in [[ ]] to prevent glob matching.

(SC2053)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: build_x86_64-linux / manual
  • GitHub Check: build_x86_64-linux / test
  • GitHub Check: build_x86_64-linux / vm_tests_smoke
  • GitHub Check: build_aarch64-darwin / build
🔇 Additional comments (8)
src/libutil/include/nix/util/serialise.hh (2)

100-101: LGTM! Clean API addition.

The virtual skip method on the Source interface enables efficient data skipping for seekable sources, which directly supports the PR's performance goal.


182-182: LGTM! Appropriate additions for seekable-FD optimization.

The isSeekable flag and skip override enable FdSource to fast-path seek operations when possible and gracefully fall back to read-based skipping for pipes or non-seekable descriptors.

Also applies to: 206-206

src/nix/dump-path.cc (2)

7-13: LGTM! Prevents NAR dump to TTY.

The helper correctly guards against writing binary NAR data to a terminal, improving user experience by catching misuse early with a clear error message.


31-31: LGTM! Consistent TTY guarding.

Both commands now correctly use getNarSink() to prevent binary output to terminals.

Also applies to: 62-62

src/libutil/serialise.cc (1)

113-121: LGTM! Solid default skip implementation.

The base Source::skip correctly reads and discards data in 64 KB chunks. The assertion guards against implementation bugs in derived classes.

src/nix/nario.cc (3)

9-9: LGTM! Clean JSON support integration.

The include and MixJSON inheritance provide the necessary plumbing for the --json flag.

Also applies to: 103-103


134-134: LGTM! Appropriate JSON accumulator.

The optional member cleanly tracks JSON-mode state and accumulates path data when active.


201-209: LGTM! Correct JSON output flow.

The initialization, accumulation, and final output of the JSON structure (with version and paths) is correctly implemented.

Copy link
Member

@cole-h cole-h left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks!

@github-actions github-actions bot temporarily deployed to pull request October 17, 2025 18:25 Inactive
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (4)
tests/functional/export.sh (3)

10-10: Fix jq selector for deriver extraction.

nix path-info --json returns an array, not an object keyed by path. The current jq selector will fail.

Apply this diff:

-drvPath=$(nix path-info --json "$outPath" | jq -r .\""$outPath"\".deriver)
+drvPath=$(nix path-info --json "$outPath" | jq -r '.[0].deriver // empty')

68-68: Quote the RHS in test comparison to prevent glob matching.

The unquoted $drvPath on the right-hand side allows glob matching, which can cause unexpected test failures.

Apply this diff:

-[[ $(printf "%s" "$json" | jq -r ".paths.\"$outPath\".deriver") = $drvPath ]]
+[[ $(printf "%s" "$json" | jq -r ".paths.\"$outPath\".deriver") = "$drvPath" ]]

74-74: Quote the RHS in test comparison to prevent glob matching.

The unquoted $drvPath on the right-hand side allows glob matching, which can cause unexpected test failures.

Apply this diff:

-[[ $(printf "%s" "$json" | jq -r ".paths.\"$outPath\".deriver") = $drvPath ]]
+[[ $(printf "%s" "$json" | jq -r ".paths.\"$outPath\".deriver") = "$drvPath" ]]
src/nix/nario.cc (1)

269-269: Use a 64-bit format specifier for narSize.

The %d format can truncate large sizes on 32-bit platforms or when narSize exceeds INT_MAX.

Apply this diff:

-                    logger->cout(fmt(ANSI_BOLD "%s:" ANSI_NORMAL " %d bytes", printStorePath(info.path), info.narSize));
+                    logger->cout(fmt(ANSI_BOLD "%s:" ANSI_NORMAL " %zu bytes", printStorePath(info.path), (size_t) info.narSize));
🧹 Nitpick comments (1)
src/nix/dump-path.cc (1)

7-13: Consider clarifying the error message.

The error message states "refusing to write NAR to standard output," but the actual condition being checked is whether stdout is a TTY. This could be misleading since writing to stdout via a pipe or redirection is perfectly valid.

Consider this adjustment:

-        throw UsageError("refusing to write NAR to standard output");
+        throw UsageError("refusing to write NAR to a terminal");
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5d6fcd9 and 95ef2dd.

📒 Files selected for processing (9)
  • src/libutil/archive.cc (2 hunks)
  • src/libutil/fs-sink.cc (1 hunks)
  • src/libutil/include/nix/util/fs-sink.hh (1 hunks)
  • src/libutil/include/nix/util/serialise.hh (3 hunks)
  • src/libutil/serialise.cc (4 hunks)
  • src/nix/dump-path.cc (3 hunks)
  • src/nix/nario-list.md (1 hunks)
  • src/nix/nario.cc (5 hunks)
  • tests/functional/export.sh (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/libutil/serialise.cc
🧰 Additional context used
🧬 Code graph analysis (3)
src/libutil/archive.cc (2)
src/libutil/fs-sink.cc (1)
  • size (97-97)
src/libutil/include/nix/util/fs-sink.hh (1)
  • size (24-24)
src/nix/dump-path.cc (3)
src/libutil/include/nix/util/file-descriptor.hh (1)
  • getStandardOutput (127-127)
src/libutil/include/nix/util/serialise.hh (4)
  • FdSink (143-146)
  • FdSink (148-151)
  • FdSink (153-153)
  • FdSink (164-164)
src/libutil/serialise.cc (1)
  • FdSink (54-61)
src/nix/nario.cc (2)
src/libstore/export-import.cc (2)
  • importPaths (72-156)
  • importPaths (72-72)
src/libstore/include/nix/store/export-import.hh (1)
  • importPaths (17-17)
🪛 Shellcheck (0.11.0)
tests/functional/export.sh

[warning] 68-68: Quote the right-hand side of = in [[ ]] to prevent glob matching.

(SC2053)


[warning] 74-74: Quote the right-hand side of = in [[ ]] to prevent glob matching.

(SC2053)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build_aarch64-darwin / build
  • GitHub Check: build_x86_64-linux / build
🔇 Additional comments (19)
src/nix/dump-path.cc (2)

31-31: Good refactoring that adds TTY protection.

The change centralizes sink creation and prevents accidental binary NAR output to terminals, improving user experience.


62-62: Consistent refactoring with line 31.

The change maintains consistency between both dump-path command implementations and adds the same TTY protection.

src/libutil/include/nix/util/fs-sink.hh (1)

17-17: LGTM! Safe default for the skip flag.

The skipContents flag with a false default maintains backward compatibility while enabling the skip optimization when needed.

src/libutil/fs-sink.cc (1)

199-199: LGTM! Appropriate optimization for NullFileSystemObjectSink.

Setting skipContents = true in the null sink is correct since it discards data anyway. This allows the parser to skip content efficiently via the skip API.

src/libutil/archive.cc (2)

135-138: LGTM! Correct skip logic with proper padding calculation.

The skip logic correctly accounts for both the content size and NAR padding (aligned to 8 bytes). This enables the performance improvement mentioned in the PR.


174-174: Good defensive improvement.

Truncating the tag to 1024 characters prevents excessively long error messages from malformed archives.

src/nix/nario-list.md (2)

8-8: LGTM! Clearer filename convention.

Using dump.nario makes the file extension explicit.


14-37: LGTM! Comprehensive JSON documentation.

The JSON example clearly shows the output structure including path info and contents.

tests/functional/export.sh (2)

53-54: LGTM! Basic listing tests are appropriate.

The tests verify both the basic listing output and the grep check for content.


67-75: LGTM! Comprehensive JSON validation.

The tests verify deriver, contents structure, file types, and sizes in both full and no-contents modes.

src/nix/nario.cc (8)

103-169: LGTM! Well-structured NAR parsing implementation.

The listNar function correctly builds a JSON representation of the NAR structure. The use of skipContents = true aligns with the skip optimization.


148-148: Correct use of skipContents flag.

Setting skipContents = true enables the skip optimization while still capturing metadata via preallocateContents.


171-185: LGTM! Clean recursive rendering.

The renderNarListing function provides a clear hierarchical display of NAR contents.


187-198: LGTM! Clean flag structure.

The --no-contents flag provides a clear way to disable content listing for performance.


229-235: LGTM! Clean JSON accumulator design.

The optional json member and listContents flag provide clean control over JSON mode and content listing.


257-272: LGTM! Efficient content handling.

The code efficiently skips NAR content when not needed (source.skip) and conditionally parses when listing contents.


306-315: LGTM! Clean JSON output.

The final JSON output includes version metadata and accumulated path information.


153-153: The size field is always populated before line 153 is reached.

In parseContents (src/libutil/archive.cc, lines 129-138), the parser calls sink.preallocateContents(size) unconditionally at line 133, before checking skipContents at line 135. The nario.cc code's override of preallocateContents (lines 137-140) sets the size field, so by the time line 153 executes, the optional is guaranteed to be initialized. There is no unsafe access.

Likely an incorrect or invalid review comment.

src/libutil/include/nix/util/serialise.hh (1)

182-182: The FdSource::skip implementation properly handles non-seekable sources—no action needed.

The implementation at src/libutil/serialise.cc:203-230 correctly addresses all concerns:

  • Catches ESPIPE on failed lseek() (line 220)
  • Sets isSeekable = false to avoid retry (line 221)
  • Falls back to BufferedSource::skip() for read-based skipping (lines 229-230)

The default isSeekable = true is appropriate: it's optimistic but becomes adaptive once an ESPIPE is encountered, ensuring pipes, sockets, and other non-seekable sources work correctly on subsequent operations.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (4)
tests/functional/export.sh (2)

10-10: Fix jq selector to use array indexing.

nix path-info --json returns an array, not an object keyed by path. The current selector will fail.

Apply this diff:

-drvPath=$(nix path-info --json "$outPath" | jq -r .\""$outPath"\".deriver)
+drvPath=$(nix path-info --json "$outPath" | jq -r '.[0].deriver // empty')

68-68: Quote variables in test comparison to prevent globbing.

The unquoted $drvPath on the right-hand side of = in [[ ]] allows glob matching, which can cause unexpected test failures.

Apply this diff:

-[[ $(printf "%s" "$json" | jq -r ".paths.\"$outPath\".deriver") = $drvPath ]]
+[[ $(printf "%s" "$json" | jq -r ".paths.\"$outPath\".deriver") = "$drvPath" ]]
src/libutil/serialise.cc (1)

203-234: Critical: this->read counter not updated when buffered bytes are skipped.

The implementation fails to update this->read when bytes are discarded from the buffer (lines 207-215). When len bytes are removed from the buffer, the counter should be incremented by the number of bytes actually skipped. This breaks any code that relies on FdSource::read to track total bytes consumed.

Apply this diff to fix:

 void FdSource::skip(size_t len)
 {
 #ifndef _WIN32
     /* Discard data in the buffer. */
     if (len && buffer && bufPosIn - bufPosOut) {
+        size_t skipped = 0;
         if (len >= bufPosIn - bufPosOut) {
-            len -= bufPosIn - bufPosOut;
+            skipped = bufPosIn - bufPosOut;
+            len -= skipped;
             bufPosIn = bufPosOut = 0;
         } else {
+            skipped = len;
             bufPosOut += len;
             len = 0;
         }
+        this->read += skipped;
     }
 
     /* If we can, seek forward in the file to skip the rest. */
     if (isSeekable && len) {
         if (lseek(fd, len, SEEK_CUR) == -1) {
             if (errno == ESPIPE)
                 isSeekable = false;
             else
                 throw SysError("seeking forward in file");
         } else {
             read += len;
             return;
         }
     }
 #endif
 
     /* Otherwise, skip by reading. */
     if (len)
         BufferedSource::skip(len);
 }
src/nix/nario.cc (1)

263-272: Use a 64-bit format specifier for narSize.

Line 269 uses %d for info.narSize, which will truncate sizes larger than INT_MAX (2 GB). This can cause incorrect output for large store paths.

Apply this diff:

-                    logger->cout(fmt(ANSI_BOLD "%s:" ANSI_NORMAL " %d bytes", printStorePath(info.path), info.narSize));
+                    logger->cout(fmt(ANSI_BOLD "%s:" ANSI_NORMAL " %zu bytes", printStorePath(info.path), info.narSize));

Note: If narSize is not size_t, use "%" PRIu64 " bytes" with a cast to (uint64_t) instead, and add #include <inttypes.h> at the top.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 95ef2dd and 6eb99e8.

📒 Files selected for processing (9)
  • src/libutil/archive.cc (2 hunks)
  • src/libutil/fs-sink.cc (1 hunks)
  • src/libutil/include/nix/util/fs-sink.hh (1 hunks)
  • src/libutil/include/nix/util/serialise.hh (3 hunks)
  • src/libutil/serialise.cc (4 hunks)
  • src/nix/dump-path.cc (3 hunks)
  • src/nix/nario-list.md (1 hunks)
  • src/nix/nario.cc (7 hunks)
  • tests/functional/export.sh (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • src/libutil/include/nix/util/fs-sink.hh
  • src/nix/dump-path.cc
  • src/libutil/fs-sink.cc
  • src/libutil/archive.cc
🧰 Additional context used
🧬 Code graph analysis (2)
src/nix/nario.cc (1)
src/libstore/export-import.cc (2)
  • importPaths (72-156)
  • importPaths (72-72)
src/libutil/serialise.cc (1)
src/libutil/include/nix/util/serialise.hh (3)
  • buf (330-339)
  • len (101-101)
  • len (206-206)
🪛 Shellcheck (0.11.0)
tests/functional/export.sh

[warning] 68-68: Quote the right-hand side of = in [[ ]] to prevent glob matching.

(SC2053)


[warning] 74-74: Quote the right-hand side of = in [[ ]] to prevent glob matching.

(SC2053)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: build_x86_64-linux / manual
  • GitHub Check: build_x86_64-linux / test
  • GitHub Check: build_aarch64-darwin / build
  • GitHub Check: eval
🔇 Additional comments (15)
src/nix/nario-list.md (1)

8-37: LGTM! Documentation clearly describes the new JSON feature.

The documentation update appropriately describes the new --json flag and provides a clear example of the output format. The structure matches the implementation in src/nix/nario.cc.

src/libutil/include/nix/util/serialise.hh (2)

100-101: LGTM! Clean API addition for efficient skipping.

The skip() method provides a clean abstraction for discarding bytes without reading them into memory. This is essential for the performance improvements mentioned in the PR objectives (7.42s → 0.42s).


182-207: LGTM! Seekability flag enables optimized skipping.

The isSeekable flag allows FdSource to dynamically detect whether the underlying file descriptor supports seeking, enabling the use of lseek() for efficient skipping when possible.

tests/functional/export.sh (1)

69-71: LGTM! JSON structure validation tests.

These tests appropriately validate the JSON structure: checking for directory type, foobar file existence, and correct file size.

src/libutil/serialise.cc (3)

93-104: LGTM! Clean use of auto deduction.

The change to use auto for the variable n improves code clarity without altering functionality.


113-121: LGTM! Correct fallback skip implementation.

The default Source::skip() implementation correctly reads and discards data in chunks. The 64 KB buffer size is appropriate for this use case.


123-138: LGTM! Improved readability with std::min.

Using std::min makes the intent clearer than the ternary operator.

src/nix/nario.cc (8)

9-10: LGTM! JSON library included for new feature.

The nlohmann/json.hpp include is necessary for the JSON output functionality.


63-65: LGTM! Improved error message clarity.

The error message now specifically mentions "nario" which makes it clearer to users what operation is being refused.


103-169: LGTM! Efficient NAR-to-JSON conversion.

The listNar() function elegantly uses a FileSystemObjectSink to parse the NAR and build a JSON structure. Setting skipContents = true (line 148) ensures file contents are skipped rather than read, which is key to the performance improvement.


171-185: LGTM! Clean text rendering from JSON.

The renderNarListing() function provides a clear text representation of the NAR listing by recursively traversing the JSON structure.


187-198: LGTM! Clean MixJSON integration.

The CmdNarioList class properly integrates with MixJSON to provide the --json flag, and adds the --no-contents flag to control whether NAR contents are listed.


228-236: LGTM! Store extended for JSON accumulation.

The ListingStore constructor and members are appropriately updated to support JSON accumulation and content listing control.


254-262: LGTM! Efficient content skipping when not needed.

The logic correctly calls listNar() when listContents is true, and efficiently skips the NAR data using source.skip() when contents are not needed.


304-316: LGTM! Correct JSON output structure.

The run() method correctly initializes the JSON accumulator when --json is used, and outputs the final JSON with the documented structure (version field and paths object).

This allows FdSource to efficiently skip data we don't care about.
This makes it way faster: on a 15 GB system closure, from 7.42s to
0.42s on a cold page cache, and from 5.64s to 0.13s on a hot cache.
With `--json`, it shows information in the same format as `nix store
ls --json` (i.e. the NAR listing format).
@edolstra edolstra enabled auto-merge October 17, 2025 18:48
@github-actions github-actions bot temporarily deployed to pull request October 17, 2025 18:53 Inactive
@edolstra edolstra added this pull request to the merge queue Oct 17, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 17, 2025
@edolstra edolstra added this pull request to the merge queue Oct 18, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 18, 2025
@edolstra edolstra added this pull request to the merge queue Oct 20, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 20, 2025
@edolstra edolstra enabled auto-merge October 20, 2025 11:41
@github-actions github-actions bot temporarily deployed to pull request October 20, 2025 11:45 Inactive
@edolstra edolstra added this pull request to the merge queue Oct 20, 2025
Merged via the queue into main with commit dd7140a Oct 20, 2025
34 checks passed
@edolstra edolstra deleted the export-import branch October 20, 2025 12:12
@coderabbitai coderabbitai bot mentioned this pull request Nov 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants