Commit 8b46c58

Merge branch 'main' into test/tee-write-error-broken-pipe
2 parents 5aeb7fd + 03f160c commit 8b46c58

18 files changed: +676 -33 lines changed

.vscode/cspell.dictionaries/workspace.wordlist.txt

Lines changed: 1 addition & 0 deletions
@@ -128,6 +128,7 @@ ENOSYS
 ENOTEMPTY
 EOPNOTSUPP
 EPERM
+EPIPE
 EROFS

 # * vars/fcntl

Cargo.lock

Lines changed: 14 additions & 6 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 2 additions & 0 deletions
@@ -600,12 +600,14 @@ lto = true
 [profile.release-fast]
 inherits = "release"
 panic = "abort"
+codegen-units = 1

 # A release-like profile that is as small as possible.
 [profile.release-small]
 inherits = "release"
 opt-level = "z"
 panic = "abort"
+codegen-units = 1
 strip = true

 # A release-like profile with debug info, useful for profiling.

src/uu/cp/BENCHMARKING.md

Lines changed: 143 additions & 0 deletions
@@ -0,0 +1,143 @@
<!-- spell-checker:ignore hyperfine tmpfs reflink fsxattr xattrs clonefile vmtouch APFS pathlib Btrfs fallocate journaling -->

# Benchmarking cp

`cp` copies file contents and directory structures together with metadata such
as permissions, ownership, timestamps, and extended attributes. Although copying
looks simple, `cp` exercises many filesystem features. Its performance depends
heavily on the workload shape (large sequential files, many tiny files, special
files, sparse images) and the storage stack underneath.

## Understanding cp

Most of the time spent inside `cp` falls into two broad categories:

- **Data transfer path**: When copying large contiguous files, throughput is
  dominated by read/write bandwidth. The overhead from `cp` itself comes from
  buffered reads and writes, memory copies between buffers, and the number of
  system calls issued per block.
- **Metadata handling**: When recursively copying trees with thousands of small
  files, performance is limited by metadata work such as `open`, `stat`,
  `lstat`, attribute preservation, directory creation, and link handling.

`cp` supports many switches that alter these paths, including attribute
preservation, hard-link and reflink creation, sparse detection, and
`--remove-destination` semantics. Benchmarks should state which paths are
being exercised so results can be interpreted correctly.

## Benchmarking guidelines

- Build a release binary first: `cargo build --release -p uu_cp`.
- Use `hyperfine` for timing and rely on the `--prepare` hook to reset state
  between runs.
- Prefer running on a fast device (RAM disk, tmpfs, NVMe) to minimize raw
  storage latency when isolating the cost of the tool.
- On Linux, control the page cache where appropriate using tools like
  `vmtouch` or `sync; echo 3 > /proc/sys/vm/drop_caches` (root required).
  Prioritize repeatability and stay within the policies of the host system.
- Keep the workload definition explicit. When comparing against GNU `cp` or
  other implementations, ensure identical datasets and mount options.

## Large-file throughput

1. Create a clean working directory and reduce cache interference.
2. Generate an input file of known size, for example with `truncate` or `dd`.
   Note that `truncate` produces a fully sparse file with no data blocks, so
   use `dd` when real data should be read from storage.
3. Run repeated copies with `hyperfine`, deleting the destination beforehand.

```shell
# Assumes you start in the workspace root, so the release binary is at
# ../../target/release/cp once you are inside benchmark/cp.
mkdir -p benchmark/cp && cd benchmark/cp
truncate -s 2G input.bin
hyperfine \
    --warmup 2 \
    --prepare 'rm -f output.bin' \
    '../../target/release/cp input.bin output.bin'
```

What to record:

- Achieved throughput (MB/s) for large sequential copies.
- Behavior with `--reflink=auto` or `--sparse=auto` on filesystems that
  support copy-on-write or sparse regions.
- CPU overhead when enabling attribute preservation such as
  `--preserve=mode,timestamps,xattr`.

If the underlying filesystem performs transparent copy-on-write (for example,
APFS via `clonefile`), consider running the same benchmark with `--reflink=never`
or on a filesystem without reflink support to measure raw data transfer.

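The throughput figure mentioned under "What to record" can be derived from the input size and the mean run time that `hyperfine` reports. The helper below is a minimal sketch, not part of the repository: `throughput_mbs` is a hypothetical name, and it treats 1 MB as 1,000,000 bytes.

```shell
# Hypothetical helper: convert a file size in bytes and a mean run time in
# seconds into throughput in MB/s (1 MB = 1,000,000 bytes).
throughput_mbs() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

# Example: a 2 GiB input (2 * 1024^3 bytes) copied in a mean of 1.6 s.
throughput_mbs 2147483648 1.6   # prints 1342.2
```

The mean time can be read off the `hyperfine` summary, or exported with `--export-json` if you prefer to script the calculation.
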
## Many small files

Large directory trees stress metadata throughput. Pre-create a synthetic tree
and copy it recursively.

```shell
# Assumes the working directory is still benchmark/cp under the workspace root.
mkdir -p dataset/src
python3 - <<'PY'
from pathlib import Path
root = Path('dataset/src')
# 10 directories with 1000 small files each (10,000 files in total).
for i in range(2000):
    sub = root / f'dir_{i//200}'
    sub.mkdir(parents=True, exist_ok=True)
    for j in range(5):
        path = sub / f'file_{i}_{j}.txt'
        path.write_text('payload' * 16)
PY
hyperfine \
    --warmup 1 \
    --prepare 'rm -rf dataset/dst && mkdir -p dataset/dst' \
    '../../target/release/cp -r dataset/src dataset/dst'
```

What to record:

- Time spent in directory traversal and metadata replication.
- Impact of toggling options such as `--preserve`, `--no-preserve`, `--link`
  (hard links), `--symbolic-link`, and `--archive`.
- Behavior when symbolic links or hard links are present, especially with
  `--dereference` versus `--no-dereference`.

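Before trusting a timing for a recursive copy, it can help to confirm that the destination really mirrors the source. The following is a rough sketch under that assumption; `verify_tree_copy` is a hypothetical helper name, and it only compares file counts and contents, not metadata.

```shell
# Hypothetical helper: succeed only if dst holds the same regular files,
# with the same contents, as src.
verify_tree_copy() {
    src=$1
    dst=$2
    # Strip padding that some wc implementations add around the count.
    src_count=$(find "$src" -type f | wc -l | tr -d ' ')
    dst_count=$(find "$dst" -type f | wc -l | tr -d ' ')
    if [ "$src_count" -ne "$dst_count" ]; then
        echo "file count mismatch: $src_count vs $dst_count" >&2
        return 1
    fi
    # diff -r walks both trees; -q only reports whether files differ.
    diff -rq "$src" "$dst"
}
```

Because the destination directory already exists in the benchmark above, `cp -r dataset/src dataset/dst` places the copy at `dataset/dst/src`, so the check would be `verify_tree_copy dataset/src dataset/dst/src`.
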
## Copy-on-write and sparse files

`--reflink=always` can dramatically reduce work on Btrfs, XFS, APFS, and other
reflink-aware filesystems. Compare results with `--reflink=never` to understand
how much time is spent in copy-on-write system calls versus fallback copying.
Sparse workloads benefit from dedicated benchmarks as well.

```shell
# Assumes the working directory is still benchmark/cp under the workspace root.
truncate -s 4G sparse.img
fallocate -d sparse.img  # On filesystems that support punching holes
hyperfine \
    --prepare 'rm -f sparse-copy.img' \
    '../../target/release/cp --sparse=always sparse.img sparse-copy.img'
```

Check both the elapsed time and the on-disk size of the destination (for
example using `du -h sparse-copy.img`) to confirm sparse regions are preserved.

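The on-disk check can also be scripted by comparing the apparent size with the allocated size. This is a minimal sketch that assumes GNU `stat` (its `%s`, `%b`, and `%B` format sequences); `is_sparse` is a hypothetical helper name. On filesystems with transparent compression the allocated size can be smaller for other reasons, so treat the result as a hint.

```shell
# Hypothetical helper: succeed if the file occupies fewer bytes on disk
# than its apparent size. Assumes GNU stat.
is_sparse() {
    apparent=$(stat -c '%s' "$1")                 # apparent size in bytes
    blocks=$(stat -c '%b' "$1")                   # allocated blocks
    block_size=$(stat -c '%B' "$1")               # bytes per reported block
    allocated=$((blocks * block_size))
    [ "$allocated" -lt "$apparent" ]
}
```

For example, `is_sparse sparse-copy.img && echo "holes preserved"` after the benchmark run.
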
## Evaluating attribute preservation and extras

Measure the incremental cost of individual options by enabling them one at a
time:

- Test `--preserve=context` or `--preserve=xattr` on files that actually carry
  SELinux contexts or extended attributes.
- Evaluate ACL and SELinux handling with `--archive` on systems where those
  features are active.
- Compare modes that remove or back up the destination (`--remove-destination`,
  `--backup=numbered`) to see the impact of extra file operations.

Supplementary analysis with `strace -c` or `perf record` can show which system
calls or code paths dominate and guide optimization work.

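The one-option-at-a-time approach can be sketched as a small generator that prints a baseline command plus one command per option. `cp_variants` is a hypothetical helper, and the option list is only an illustration drawn from the options named above; adjust it to the features active on your system.

```shell
# Hypothetical helper: print a baseline cp command line followed by one
# command line per option, so each can be timed separately.
cp_variants() {
    cp_bin=$1
    src=$2
    dst=$3
    for opt in '' '--preserve=xattr' '--preserve=context' '--archive' \
               '--remove-destination' '--backup=numbered'; do
        # ${opt:+ $opt} expands to nothing for the empty baseline entry.
        printf '%s%s %s %s\n' "$cp_bin" "${opt:+ $opt}" "$src" "$dst"
    done
}

cp_variants ../../target/release/cp input.bin output.bin
```

Each printed line can be passed to `hyperfine` as its own command argument, with a `--prepare` hook that resets the destination, so the baseline and each option appear side by side in one report.
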
## Interpreting results

- If a benchmark completes in well under a second, increase the dataset size to
  reduce process start-up noise.
- Document filesystem features such as journaling, compression, or encryption
  that may skew results.
- When changes are made to `cp`, track how system call counts, I/O patterns,
  and CPU time shift between runs to catch regressions early.

Use these guidelines to isolate the workloads you care about (large sequential
transfers, directory-heavy copies, attribute preservation, reflink paths) and
collect reproducible measurements.

src/uu/cp/Cargo.toml

Lines changed: 9 additions & 0 deletions
@@ -47,6 +47,15 @@ exacl = { workspace = true, optional = true }
 name = "cp"
 path = "src/main.rs"

+[dev-dependencies]
+divan = { workspace = true }
+tempfile = { workspace = true }
+uucore = { workspace = true, features = ["benchmark"] }
+
+[[bench]]
+name = "cp_bench"
+harness = false
+
 [features]
 feat_selinux = ["selinux", "uucore/selinux"]
 feat_acl = ["exacl"]
