@WillAyd
Contributor

@WillAyd WillAyd commented Mar 10, 2025

Per the comment on #2546, this might help fix the flaky CI failures with Meson + ASAN.

@github-actions github-actions bot added this to the ADBC Libraries 18 milestone Mar 10, 2025
Member

@zeroshade zeroshade left a comment

Pending CI success, I like it 😄

@lidavidm
Member

Seems we need to set more (maybe both CXX and CC have to be set?)

@WillAyd
Contributor Author

WillAyd commented Mar 12, 2025

I asked the Meson team about this on their IRC channel, and they mentioned that while gcc uses shared linkage to the ASAN runtime, clang requires static linkage to ASAN. It seems like something is awry with the latter, and it might have to do with how gtest is being used.

I will have to investigate that further, or maybe see if there's a newer version of gcc that can be used instead of clang and still fixes the issue.
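
One way to check which linkage a given test binary actually ended up with is to look at its dynamic dependencies. The following is only a sketch: the function name `asan_linkage` and the binary path in the usage comment are hypothetical, and it assumes `ldd` output is piped in on stdin.

```shell
# Classify how a binary links the ASAN runtime, based on `ldd` output on stdin.
# gcc's shared runtime shows up as libasan.so.N; clang's shared runtime (only
# present when built with -shared-libasan) shows up as libclang_rt.asan*.so.
# No match means the runtime is linked statically or the binary is not
# instrumented at all; this check cannot tell those two cases apart.
asan_linkage() {
  if grep -Eq 'libasan\.so|libclang_rt\.asan[^[:space:]]*\.so'; then
    echo "shared"
  else
    echo "static-or-absent"
  fi
}

# Usage (hypothetical binary path):
#   ldd ./build/some-driver-test | asan_linkage
```

If a clang-built test reports `static-or-absent` while the gcc-built one reports `shared`, that would match what the Meson developers described.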

@WillAyd
Contributor Author

WillAyd commented Apr 23, 2025

After some more research, the patch that was applied to clang had already been in place with gcc since 2013. So I think I erred in assuming a newer version would change this. Need to go back to the drawing board...

@WillAyd WillAyd changed the title fix(ci): Prevent flaky ASAN failures with Dremio fix(ci): Skip flaky ASAN failures in Meson Apr 25, 2025
@WillAyd
Contributor Author

WillAyd commented Apr 25, 2025

For now I've given up on trying to solve this and am instead skipping the flightsql and bigquery tests under ASAN, which is where I think the problem arises.

@WillAyd
Contributor Author

WillAyd commented Apr 25, 2025

I do wonder if this affects snowflake as well, since bigquery, flightsql and snowflake are all built using custom Go targets...

Member

@lidavidm lidavidm left a comment

I did find the sanitizer was flaky here without this config (albeit in a different way)

- name: python-debug
  run: |
    # Need to set this or ASAN inside the container gets stuck
    # printing a loop of DEADLYSIGNAL
    sudo sysctl vm.mmap_rnd_bits=28
    pushd arrow-adbc
    docker compose run -e PYTHON=3.13 --rm python-debug

Could this be related?

@WillAyd
Contributor Author

WillAyd commented Apr 27, 2025

From personal experience, I've found ASAN challenging to use when launched via the Python interpreter. ASAN itself will detect some leaks from most versions of CPython, so the only way I have gotten it to run reliably is to turn off leak detection while also making sure to LD_PRELOAD the ASAN library (this is required unless the Python interpreter itself was built with ASAN support):

ASAN_OPTIONS=detect_leaks=0 LD_PRELOAD="$(gcc -print-file-name=libasan.so)" python ...

Without that, I have seen the issue where you end up with an endless loop of ASAN_ERROR:DeadlySignal (or something to that effect), which is the error you are seeing, right?
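
For reuse in scripts, the one-liner above could be wrapped so that a missing ASAN library is caught before the interpreter starts. This is a minimal sketch, not what the CI does: the function name `asan_env` is hypothetical, and it assumes the compiler supports `-print-file-name` the way gcc does.

```shell
# Print the environment assignments needed to run an uninstrumented
# interpreter against ASAN-built extension modules.
# $1: compiler to query for the runtime path (defaults to gcc).
asan_env() {
  compiler="${1:-gcc}"
  lib="$("$compiler" -print-file-name=libasan.so)"
  # When the library cannot be located, gcc echoes the bare name back
  # instead of a path, so treat that (or a nonexistent path) as an error.
  if [ "$lib" = "libasan.so" ] || [ ! -e "$lib" ]; then
    echo "libasan.so not found via $compiler" >&2
    return 1
  fi
  printf 'ASAN_OPTIONS=detect_leaks=0\nLD_PRELOAD=%s\n' "$lib"
}

# Usage (assumes the library path contains no whitespace):
#   env $(asan_env) python ...
```

Failing early here gives a clear message instead of the loader silently ignoring a bad LD_PRELOAD entry.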

@WillAyd WillAyd merged commit 0c9edfe into apache:main Apr 27, 2025
27 checks passed
@WillAyd WillAyd deleted the fix-meson-ci branch April 27, 2025 14:24
@lidavidm
Member

Yeah, we do turn off leak checking and preload ASAN, but I found that I also needed to change the ASLR randomization bits to get ASAN to work consistently.
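
A guarded version of that sysctl change might look like the following. It is only a sketch: the helper name `needs_lower_rnd_bits` is hypothetical, the default ceiling of 28 is taken from the workflow step quoted earlier in this thread, and it assumes a Linux host where `vm.mmap_rnd_bits` is exposed.

```shell
# Return success (0) when the current ASLR entropy exceeds the ceiling and
# should be lowered before running ASAN-instrumented binaries.
# $1: current vm.mmap_rnd_bits value; $2: ceiling (defaults to 28).
needs_lower_rnd_bits() {
  current="$1"
  limit="${2:-28}"
  [ "$current" -gt "$limit" ]
}

# Usage on a Linux host (changing the setting requires root):
#   current="$(sysctl -n vm.mmap_rnd_bits)"
#   if needs_lower_rnd_bits "$current" 28; then
#     sudo sysctl vm.mmap_rnd_bits=28
#   fi
```

Checking first avoids raising the value on hosts that already use a lower setting.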

colin-rogers-dbt pushed a commit to dbt-labs/arrow-adbc that referenced this pull request Jun 10, 2025
3 participants