The E2E action intermittently fails building the aiperf-bench image:
#5 [2/3] RUN pip install --no-cache-dir "aiperf==0.7.0" ...
#5 0.183 runc run failed: unable to start container process: error during container init:
exec: "/bin/sh": stat /bin/sh: no such file or directory
Root cause
The docker build in .github/actions/e2e/action.yml:104 lacks an explicit --platform flag.
Without it, Docker BuildKit may resolve the wrong architecture manifest for the python:3.12-slim multi-arch image, producing a layer tree where /bin/sh doesn't exist for the host architecture.
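One way to check this hypothesis on a runner is to compare the host platform with the platform of the base image Docker actually cached. The sketch below is illustrative only: the helper names `host_platform` and `image_platform` are hypothetical, the `uname -m` mapping covers just the two common architectures, and `image_platform` assumes a running Docker daemon with the image already pulled.

```shell
# Map `uname -m` output to a Docker platform string.
# Accepts an optional argument so the mapping can be exercised without uname.
host_platform() {
  case "${1:-$(uname -m)}" in
    x86_64|amd64)  echo "linux/amd64" ;;
    aarch64|arm64) echo "linux/arm64" ;;
    *)             echo "unknown" ;;
  esac
}

# Print os/arch of a locally cached image (requires a running Docker daemon).
image_platform() {
  docker image inspect --format '{{.Os}}/{{.Architecture}}' "$1"
}
```

If `host_platform` and `image_platform python:3.12-slim` disagree on the runner, that would be consistent with the wrong manifest having been resolved.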
Why only aiperf-bench is affected:
- Go validator images (deployment, performance, conformance) are COPY-only into distroless, with no RUN steps, so /bin/sh is never invoked
- Release workflows (on-push.yaml, on-tag.yaml) use docker/build-push-action with explicit platforms: matrix values
Failed run
https://github.com/NVIDIA/aicr/actions/runs/24904478145
Fix
Add --platform linux/amd64 to the E2E docker build for aiperf-bench.
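A minimal sketch of what the pinned build could look like. The image tag, Dockerfile path, and build context below are placeholders, not the values used in action.yml, and the `DOCKER` override is a hypothetical hook added here only so the command can be dry-run.

```shell
# Placeholder values; the real ones live in .github/actions/e2e/action.yml.
PLATFORM="linux/amd64"
IMAGE_TAG="aiperf-bench:e2e"   # hypothetical tag
DOCKERFILE="Dockerfile"        # hypothetical path
CONTEXT="."                    # hypothetical build context

# Build the image with the platform pinned explicitly.
# Set DOCKER="echo docker" to print the command instead of running it.
build_aiperf_bench() {
  ${DOCKER:-docker} build --platform "$PLATFORM" \
    -t "$IMAGE_TAG" -f "$DOCKERFILE" "$CONTEXT"
}
```

Calling `build_aiperf_bench` runs the build. With `DOCKER="echo docker"` set first, it only prints the command that would be executed.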