Non-deterministic archives on Linux due to timestamps #1696

@weihanglo

Problem

cc-rs seems to produce non-deterministic archives on Linux because GNU ar embeds a timestamp (the member file's modification time) in each archive member header. This leads to cache misses in content-addressable caching tools.
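To make the timestamp problem concrete, here is a minimal sketch that reads the modification-time field out of each member header of a common-format `ar` archive. It assumes the usual layout: an 8-byte `!<arch>\n` magic, then 60-byte headers with the name in bytes 0..16, the decimal mtime in bytes 16..28, and the decimal size in bytes 48..58. The helper names `field` and `member_mtimes` are made up for illustration and are not part of cc-rs.

```rust
use std::ops::Range;

/// Hypothetical helper: one fixed-width, space-padded ASCII header field.
fn field(header: &[u8], range: Range<usize>) -> Option<&str> {
    std::str::from_utf8(header.get(range)?).ok().map(str::trim_end)
}

/// Hypothetical helper: returns `(raw member name, mtime)` for every member
/// header in the archive, or `None` if the bytes do not look like an archive.
fn member_mtimes(archive: &[u8]) -> Option<Vec<(String, u64)>> {
    const MAGIC: &[u8] = b"!<arch>\n";
    const HEADER_LEN: usize = 60;

    let mut rest = archive.strip_prefix(MAGIC)?;
    let mut out = Vec::new();
    while !rest.is_empty() {
        if rest.len() < HEADER_LEN {
            return None;
        }
        let (header, tail) = rest.split_at(HEADER_LEN);
        // Every member header ends with the two bytes "`\n".
        if !header.ends_with(b"`\n") {
            return None;
        }
        let name = field(header, 0..16)?.to_string();
        let mtime_text = field(header, 16..28)?;
        // Treat a blank mtime field as zero.
        let mtime = if mtime_text.is_empty() { 0 } else { mtime_text.parse().ok()? };
        let size: usize = field(header, 48..58)?.parse().ok()?;
        if tail.len() < size {
            return None;
        }
        // Member data is padded to an even length.
        let padded = (size + size % 2).min(tail.len());
        rest = &tail[padded..];
        out.push((name, mtime));
    }
    Some(out)
}
```

Running `member_mtimes` over two archives built at different times from identical objects should show the same names with different mtime values, whereas archives created in `D` mode should report zero for every member.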

Possible solutions

GNU ar has a D modifier for deterministic archives (https://sourceware.org/binutils/docs/binutils/ar-cmdline.html#index-deterministic-archives):

‘D’

Operate in deterministic mode. When adding files and the archive index use zero for UIDs, GIDs, timestamps, and use consistent file modes for all files. When this option is used, if ar is used with identical options and identical input files, multiple runs will create identical output files regardless of the input files’ owners, groups, file modes, or modification times.

If binutils was configured with --enable-deterministic-archives, then this mode is on by default. It can be disabled with the ‘U’ modifier, below.

I found that we already have a workaround for macOS (the ZERO_AR_DATE=1 hack). I wonder whether we should have a similar hack for GNU ar, or for any ar that supports D.

Repro script

This reproduces the non-deterministic archive file:

cat > Cargo.toml << EOF
[package]
name = "repro"
edition = "2024"

[build-dependencies]
cc = "=1.2.58"
EOF

mkdir -p src
echo "fn main() {}" > src/main.rs

cat > build.rs << 'EOF'
fn main() {
    cc::Build::new().file("foo.c").compile("foo");
}
EOF

cat > foo.c << 'EOF'
int add(int a, int b) { return a + b; }
int sub(int a, int b) { return a - b; }
EOF

# Build twice, compare the .a files
cargo build --release 2>/dev/null
cp target/release/build/repro-*/out/libfoo.a /tmp/build1.a

cargo clean
sleep 2
cargo build --release 2>/dev/null
cp target/release/build/repro-*/out/libfoo.a /tmp/build2.a

# .o files are identical, but .a differs due to timestamps
rm -rf /tmp/ar1 /tmp/ar2
mkdir -p /tmp/ar1 /tmp/ar2
cd /tmp/ar1 && ar x /tmp/build1.a
cd /tmp/ar2 && ar x /tmp/build2.a
diff <(cd /tmp/ar1 && md5sum *.o | sort) <(cd /tmp/ar2 && md5sum *.o | sort)  # identical
cmp /tmp/build1.a /tmp/build2.a  # DIFFER

If you run the script above with the patch below, the final cmp reports no difference:

diff --git a/src/lib.rs b/src/lib.rs
index 3bf3d81..422947c 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -2730,7 +2730,7 @@ impl Build {
             // NOTE: We add `s` even if flags were passed using $ARFLAGS/ar_flag, because `s`
             // here represents a _mode_, not an arbitrary flag. Further discussion of this choice
             // can be seen in https://github.com/rust-lang/cc-rs/pull/763.
-            run(ar.arg("s").arg(dst), &self.cargo_output)?;
+            run(ar.arg("sD").arg(dst), &self.cargo_output)?;
         }

         Ok(())
@@ -2786,7 +2786,7 @@ impl Build {
             // NOTE: We add cq here regardless of whether $ARFLAGS/ar_flag have been used because
             // it dictates the _mode_ ar runs in, which the setter of $ARFLAGS/ar_flag can't
             // dictate. See https://github.com/rust-lang/cc-rs/pull/763 for further discussion.
-            run(cmd.arg("cq").arg(dst).args(objs), &self.cargo_output)?;
+            run(cmd.arg("cqD").arg(dst).args(objs), &self.cargo_output)?;
         }

         Ok(())

Though I believe this is not the patch we want, as ar from some toolchains may not support D.
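One possible direction, sketched here only as an assumption and not as what cc-rs does, is to probe the archiver once and append `D` only when the probe succeeds. The names `ar_supports_deterministic` and `ar_mode` are hypothetical. The probe also assumes the archiver accepts `cqD` with an empty member list; a tool that rejects that invocation for any reason is simply treated as not supporting `D`.

```rust
use std::path::Path;
use std::process::{Command, Stdio};

/// Hypothetical probe: try to create an empty archive with the `D` modifier.
/// Any failure (missing binary, unknown modifier, empty member list rejected)
/// is reported as "not supported".
fn ar_supports_deterministic(ar: &str, scratch_dir: &Path) -> bool {
    let probe = scratch_dir.join(format!("ar-probe-{}.a", std::process::id()));
    let _ = std::fs::remove_file(&probe);
    let supported = Command::new(ar)
        .arg("cqD")
        .arg(&probe)
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        .unwrap_or(false);
    let _ = std::fs::remove_file(&probe);
    supported
}

/// Hypothetical helper: append `D` to a mode string such as "cq" or "s"
/// only when the archiver is known to support it.
fn ar_mode(base: &str, deterministic: bool) -> String {
    if deterministic {
        format!("{base}D")
    } else {
        base.to_string()
    }
}
```

With something like this, the two call sites in the patch would pass `ar_mode("s", supported)` and `ar_mode("cq", supported)` instead of hard-coding `sD` and `cqD`. Spawning an extra process per build is a cost, so the probe result would probably want caching.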
