|
| 1 | +--- |
| 2 | +name: rspack-perf-profiling |
| 3 | +description: Run Rspack performance profiling on Linux using perf (with DWARF call stacks), generate perf.data, and analyze hotspots. Use when you need CPU-level bottlenecks, kernel symbol resolution, or repeatable profiling for rspack build/bench cases. Includes optional samply import with per-CPU threads for visualization, but primary analysis is perf-based. |
| 4 | +--- |
| 5 | + |
| 6 | +# Rspack Perf Profiling |
| 7 | + |
| 8 | +## Overview |
| 9 | + |
| 10 | +Profile Rspack builds on Linux using perf with DWARF call graphs, capture kernel + user stacks, and analyze hotspots directly from perf.data. Optionally import the result into samply for a per-CPU-thread visualization. |
| 11 | + |
| 12 | +## Workflow |
| 13 | + |
| 14 | +### 1) Build profiling-enabled binding (once per code change) |
| 15 | + |
| 16 | +```sh |
| 17 | +pnpm run build:binding:profiling |
| 18 | +``` |
| 19 | + |
| 20 | +### 2) Enable kernel symbols (recommended) |
| 21 | + |
| 22 | +```sh |
| 23 | +echo 0 | sudo tee /proc/sys/kernel/kptr_restrict |
| 24 | +echo 1 | sudo tee /proc/sys/kernel/perf_event_paranoid |
| 25 | +``` |
| 26 | + |
| 27 | +Optional: install vmlinux debug symbols or pass a vmlinux path to perf report. |
| 28 | + |
| 29 | +### 3) Record perf profile (example: react-10k case) |
| 30 | + |
| 31 | +```sh |
| 32 | +# if benchmark repo isn't present yet (clone alongside rspack, not inside it) |
| 33 | +git -c http.lowSpeedLimit=1 -c http.lowSpeedTime=600 \ |
| 34 | + clone https://github.com/rstackjs/build-tools-performance.git \ |
| 35 | + ../build-tools-performance |
| 36 | + |
| 37 | +# install benchmark deps (required for react-10k) |
| 38 | +pnpm -C ../build-tools-performance install |
| 39 | + |
| 40 | +# link local rspack core so cases can resolve @rspack/core |
| 41 | +pnpm -C ../build-tools-performance add -w @rspack/core@link:../rspack/packages/rspack |
| 42 | + |
| 43 | +( |
| 44 | + cd ../build-tools-performance/cases/react-10k || exit 1 |
| 45 | + perf record -o ./perf.data \ |
| 46 | + -e cycles:uk -F 4000 --call-graph dwarf -- \ |
| 47 | + node --perf-prof --perf-basic-prof --interpreted-frames-native-stack \ |
| 48 | + ../../../rspack/packages/rspack-cli/bin/rspack.js \ |
| 49 | + -c ./rspack.config.mjs |
| 50 | +) |
| 51 | +``` |
| 52 | + |
| 53 | +Notes: |
| 54 | + |
| 55 | +- `cycles:uk` captures user + kernel cycles. |
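| 56 | +- Increase `-F` for higher sample density; expect a large perf.data, because `--call-graph dwarf` copies a chunk of user stack (8192 bytes by default) with every sample. |
| 57 | +- Keep `--call-graph dwarf`: it unwinds through DWARF debug info, so Rust stacks stay readable even where frame pointers are missing. |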
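
To put a rough number on "expect large perf.data", the sketch below multiplies sample rate, run time, busy CPUs, and stack-dump size. It assumes perf's documented default of 8192 bytes of user stack per sample with `--call-graph dwarf`, and that every counted CPU is busy for the whole run, so treat the result as a ballpark upper bound rather than a prediction. `estimate_perf_mib` is a hypothetical helper, not part of perf or rspack.

```sh
# Hypothetical helper: rough upper bound on perf.data size, in MiB.
# Assumes every sample carries a full user-stack dump (default 8192 bytes
# for --call-graph dwarf) and that all counted CPUs are busy throughout.
estimate_perf_mib() {
  freq_hz=$1             # value passed to -F
  seconds=$2             # expected build duration in seconds
  busy_cpus=$3           # CPUs the build keeps busy
  dump_bytes=${4:-8192}  # stack dump size per sample
  echo $(( freq_hz * seconds * busy_cpus * dump_bytes / 1048576 ))
}

# Example: -F 4000, a 10 second build, 8 busy CPUs
estimate_perf_mib 4000 10 8
```

With these inputs the estimate comes to 2500 MiB, which is why lowering `-F` or profiling a shorter case is often the easier way to keep the file manageable.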
| 58 | + |
| 59 | +### 4) Analyze perf.data (perf-based) |
| 60 | + |
| 61 | +Top hotspots (flat view): |
| 62 | + |
| 63 | +```sh |
| 64 | +perf report -i ../build-tools-performance/cases/react-10k/perf.data \ |
| 65 | + --stdio --no-children -g none --percent-limit 0.5 | head -n 100 |
| 66 | +``` |
| 67 | + |
| 68 | +Callgraph (if needed): |
| 69 | + |
| 70 | +```sh |
| 71 | +perf report -i ../build-tools-performance/cases/react-10k/perf.data \ |
| 72 | + --stdio --no-children -g graph,0.5,caller,function,percent | head -n 120 |
| 73 | +``` |
| 74 | + |
| 75 | +### 5) Optional: import into samply with per-CPU threads |
| 76 | + |
| 77 | +```sh |
| 78 | +samply import ../build-tools-performance/cases/react-10k/perf.data \ |
| 79 | + --per-cpu-threads -o ../build-tools-performance/cases/react-10k/perf.profile.json.gz \ |
| 80 | + --no-open |
| 81 | +``` |
| 82 | + |
| 83 | +Use this only for visualization; keep analysis perf-first. |
| 84 | + |
| 85 | +## Variants |
| 86 | + |
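| 87 | +- For other cases, `cd` into that case's directory and point `-c` at its config file (for example `./rspack.config.mjs` or `./rspack.config.js`). |
| 88 | +- For heavier workloads, wrap the rspack command in a loop so the run lasts longer and perf collects more samples. |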
| 89 | +- If kernel symbols are still missing, pass `-k /path/to/vmlinux` to perf report. |
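
The loop variant can be sketched as follows. `repeat_cmd` is a hypothetical helper, not something shipped with rspack or perf; it runs a command a fixed number of times and stops at the first failure. Because `perf record` launches a program rather than a shell function, the commented usage inlines the same loop through `sh -c`; perf follows child processes by default, so every iteration lands in one perf.data.

```sh
# Hypothetical helper: run a command N times, stopping at the first failure.
repeat_cmd() {
  count=$1
  shift
  i=0
  while [ "$i" -lt "$count" ]; do
    "$@" || return 1
    i=$((i + 1))
  done
}

# Usage under perf, run from the case directory (same flags as step 3):
#   perf record -o ./perf.data -e cycles:uk -F 4000 --call-graph dwarf -- \
#     sh -c 'for i in 1 2 3; do
#              node ../../../rspack/packages/rspack-cli/bin/rspack.js \
#                -c ./rspack.config.mjs || exit 1
#            done'
```

Repeated builds multiply the size of perf.data as well, so it may be worth lowering `-F` when looping.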