fix: integration / Archery test With other arrows container ran out of space #9043

Merged: Jefffrey merged 2 commits into apache:main from lyang24:debug-integration-memory on Dec 27, 2025

lyang24 (Contributor) commented Dec 25, 2025

Which issue does this PR close?

Rationale for this change

The CI container starts with 63 GB of its 72 GB disk already used. The remaining 9.5 GB is barely enough for a cross-build in 7 languages, which leaves CI stuck.

This is what a debug step shows right after the container is initialized:
=== CONTAINER DISK USAGE ===
Filesystem Size Used Avail Use% Mounted on
overlay 72G 63G 9.5G 87% /

What changes are included in this PR?

  • Add resource monitoring to the build process.
  • Add a cleanup step that removes unnecessary software (frees about 6 GB):
    === Cleaning up host disk space ===
    Disk space before cleanup:
    Filesystem Size Used Avail Use% Mounted on
    overlay 72G 63G 9.5G 87% /

    Disk space after cleanup:
    Filesystem Size Used Avail Use% Mounted on
    overlay 72G 57G 16G 79% /

  • Add a small optimization: shallow-clone GitHub repos (fetch only the most recent commit, not the full history).
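For readers unfamiliar with shallow clones, the last bullet corresponds to git's `--depth 1` option. The following is only a minimal sketch of the idea, not the change made in this PR (which edits CI steps); the helper names `shallow_clone_cmd` and `shallow_clone` are hypothetical.

```python
import subprocess
from typing import List, Optional


def shallow_clone_cmd(repo_url: str, dest: str, branch: Optional[str] = None) -> List[str]:
    """Build a `git clone` command that fetches only the most recent commit.

    Hypothetical helper for illustration. `--depth 1` truncates history to a
    single commit, which is what saves disk space compared to a full clone.
    """
    cmd = ["git", "clone", "--depth", "1"]
    if branch is not None:
        # Optionally pin the clone to one branch or tag.
        cmd += ["--branch", branch]
    cmd += [repo_url, dest]
    return cmd


def shallow_clone(repo_url: str, dest: str, branch: Optional[str] = None) -> None:
    """Run the shallow clone; raises CalledProcessError if git fails."""
    subprocess.run(shallow_clone_cmd(repo_url, dest, branch), check=True)
```

Calling `shallow_clone("https://github.com/apache/arrow-rs", "arrow-rs")` would clone only the tip of the default branch. The trade-off is that history-dependent commands such as `git log` or `git bisect` see only that one commit.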

Optimization results: 6.1 GB remain free after the build.

=== After Build ===
Filesystem Size Used Avail Use% Mounted on
overlay 72G 66G 6.1G 92% /
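The `df` snapshots above come from debug steps in the CI job. As a rough illustration of the same kind of resource monitoring, here is a minimal sketch using Python's standard `shutil.disk_usage`; the helper name `disk_report` and the output layout are assumptions, not what the PR actually runs.

```python
import shutil

GIB = 1024 ** 3  # bytes per GiB


def disk_report(label: str, path: str = "/") -> str:
    """Return a labelled one-line summary of disk usage for `path`.

    Hypothetical helper. The percentage here is used/total, which can differ
    slightly from the Use% column of `df` (df excludes reserved blocks).
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    pct = 100.0 * usage.used / usage.total if usage.total else 0.0
    return (
        f"=== {label} ===\n"
        f"{path}: size {usage.total / GIB:.1f}G, used {usage.used / GIB:.1f}G, "
        f"avail {usage.free / GIB:.1f}G, use {pct:.0f}%"
    )


if __name__ == "__main__":
    print(disk_report("After Build"))
```

Printing such a report before and after each build stage makes it easy to see which stage consumes the space.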

Are these changes tested?

Tested by GitHub CI.

Are there any user-facing changes?

No.

lyang24 (Contributor, Author) commented Dec 25, 2025

https://github.com/apache/arrow-rs/actions/runs/20496632452/job/58897236702?pr=9043

The problem is that the container's filesystem is already full before the cp -a command that runs after the JS build.

The easiest way out is to spend more money on a larger machine :(

Let me look into some disk optimization opportunities.

=== BEFORE CP: Memory and Disk ===
total used free shared buff/cache available
Mem: 15Gi 1.4Gi 2.4Gi 242Mi 11Gi 13Gi
Swap: 4.0Gi 0.0Ki 4.0Gi
Filesystem Size Used Avail Use% Mounted on
overlay 72G 72G 408M 100% /
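A failure like this (a recursive copy hitting a full filesystem) can be surfaced earlier with a preflight check. The following is a minimal sketch under stated assumptions: the helpers `tree_size` and `has_room_for_copy` are hypothetical, and the 10% safety margin is an arbitrary choice, not something taken from the CI job.

```python
import os
import shutil


def tree_size(root: str) -> int:
    """Sum the sizes (in bytes) of all files under `root`, without following symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return total


def has_room_for_copy(src: str, dest_dir: str, margin: float = 0.10) -> bool:
    """Return True if `dest_dir`'s filesystem has space for a copy of `src`.

    `margin` is an assumed safety factor that allows for filesystem overhead.
    """
    needed = int(tree_size(src) * (1.0 + margin))
    return shutil.disk_usage(dest_dir).free >= needed
```

A build script could call `has_room_for_copy(build_dir, target_dir)` before copying and fail with a clear message instead of stalling on a full disk.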

lyang24 changed the title from "fix: fix integration test With other arrows ci" to "fix: integration test With other arrows ci used 100% of fs writable space" on Dec 25, 2025
Jefffrey (Contributor) commented

I know in DataFusion we run into similar disk space usage issues often and one fix was to manually remove unneeded software:

But given the image for integration tests is custom and from arrow, not sure how effective it might be 🤔

lyang24 force-pushed the debug-integration-memory branch from f991723 to 6c5e377 on December 25, 2025 02:10
lyang24 (Contributor, Author) commented Dec 25, 2025

> I know in DataFusion we run into similar disk space usage issues often and one fix was to manually remove unneeded software:
>
> But given the image for integration tests is custom and from arrow, not sure how effective it might be 🤔

Yes, I think that should be the approach here as well. After investigating further, I found the runner container is already polluted: it starts with 63 GB of 72 GB used before the checkout step.

lyang24 force-pushed the debug-integration-memory branch 9 times, most recently from aaa0b8d to 4d53b77 on December 25, 2025 10:37
save disk: host clean up and git shallow clone.

Signed-off-by: lyang24 <[email protected]>
lyang24 force-pushed the debug-integration-memory branch from 4d53b77 to af1354e on December 25, 2025 11:03
lyang24 (Contributor, Author) commented Dec 25, 2025

> I know in DataFusion we run into similar disk space usage issues often and one fix was to manually remove unneeded software:
>
> But given the image for integration tests is custom and from arrow, not sure how effective it might be 🤔

I was able to free enough space by mimicking DataFusion: adding a cleanup step that deletes unneeded software.

lyang24 changed the title from "fix: integration test With other arrows ci used 100% of fs writable space" to "fix: integration / Archery test With other arrows container ran out of space" on Dec 25, 2025
Jefffrey (Contributor) left a comment

Great work debugging this 🎉

Jefffrey merged commit 8ed2b52 into apache:main on Dec 27, 2025 (39 checks passed)
Jefffrey (Contributor) commented

Thanks @lyang24


Linked issue: integration / Archery test With other arrows (push) CI check failing on main
