Skip to content

Conversation

@m-kuhn
Copy link
Contributor

@m-kuhn m-kuhn commented Feb 23, 2025

Adds two build options to disable vendored dependencies fmt and nanoarrow

  • WITH_VENDORED_FMT
  • WITH_VENDORED_NANOARROW

The default will still use the vendored copies

Adds two build options to disable vendored dependencies fmt and nanoarrow

 - WITH_VENDORED_FMT
 - WITH_VENDORED_NANOARROW
@m-kuhn m-kuhn requested a review from lidavidm as a code owner February 23, 2025 17:56
@github-actions github-actions bot modified the milestone: ADBC Libraries 17 Feb 23, 2025
Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

David will have to speak to the build system since I am not an expert, but at least conceptually I have done this for nanoarrow for a few projects I've worked on. It might be helpful to specify a version (I think >= 0.5.0 would do it but nanoarrow is pre-1.0).

I also wonder if there's a way to use a pre-built driver (e.g., Python, R, a pre-built .so) in the project you are working on? The R and Python packages I believe both provide a way to locate the shared library (which may reduce you build to just the driver manager). Apologies if I'm missing something obvious!

@lidavidm
Copy link
Member

Ah, is this for the vcpkg port? If so thank you for taking a stab at it.

I agree with Dewey's suggestion of prefixing the options, otherwise this makes sense. And we should add a CI config to test it (I can follow up with that)

@m-kuhn
Copy link
Contributor Author

m-kuhn commented Feb 24, 2025

Thanks for all the suggestions. Yes, it is for vcpkg, I am also working on nanoarrow and fmt ports, see microsoft/vcpkg#43721

It might be helpful to specify a version (I think >= 0.5.0 would do it but nanoarrow is pre-1.0).

If 0.5.0 is the correct version, I can add it.

I also wonder if there's a way to use a pre-built driver (e.g., Python, R, a pre-built .so) in the project you are working on?

There's also a python port which could/shoule be used eventually. R is not bundled in the project I'm working on so I'll leave this exercise for someone else.

And we should add a CI config to test it (I can follow up with that)

CI tests would be great, what are your plans, building on top of the vcpkg dependencies or something else?

@m-kuhn m-kuhn changed the title Allow to build with system dependencies (chore) allow to build with system dependencies Feb 24, 2025
@m-kuhn m-kuhn changed the title (chore) allow to build with system dependencies chore(c dependencies) allow to build with system dependencies Feb 24, 2025
@m-kuhn m-kuhn changed the title chore(c dependencies) allow to build with system dependencies chore(c dependencies): allow to build with system dependencies Feb 24, 2025
@lidavidm lidavidm changed the title chore(c dependencies): allow to build with system dependencies build(c): allow to build with system dependencies Feb 24, 2025
@lidavidm
Copy link
Member

CI tests would be great, what are your plans, building on top of the vcpkg dependencies or something else?

I think having one pipeline that enables vcpkg and tries to build with all deps from vcpkg would be the starting point. Let's get this building first though

@m-kuhn
Copy link
Contributor Author

m-kuhn commented Feb 24, 2025

I can't explain the failing tests, in my opinion the libraries built by the pipelines should be completely equal to before (as the default is still to build with vendored dependencies). Does someone have a hint where to start looking?

@paleolimbot
Copy link
Member

Does someone have a hint where to start looking?

Again, the CMake here isn't something I'm familiar with, but you could try flipping the default (i.e., make building against system opt-in) in case there are some builds that for some reason don't see the default options). Something like ADBC_NANOARROW_SOURCE=SYSTEM would be consistent with how Arrow C++ handles this kind of thing.

@lidavidm
Copy link
Member

I kicked all the jobs just to see.

Something seems off in that it appears SQLite tests are failing in odd ways...

@lidavidm
Copy link
Member

Like this - I wonder if something else in the build environment changed?

[ RUN      ] SqliteReaderTest.InferTypedParams
/home/runner/work/arrow-adbc/arrow-adbc/c/driver/sqlite/sqlite_test.cc:446: Failure
Expected equality of these values:
  0
  sqlite3_prepare_v2(db, query.c_str(), query.size(), &stmt, nullptr)
    Which is: 1

/home/runner/work/arrow-adbc/arrow-adbc/c/driver/sqlite/sqlite_test.cc:801: Failure
Expected: Exec(R"(INSERT INTO foo VALUES (0, "foo"), (1, NULL), (2, 4), (3, 1E2))") doesn't generate new fatal failures in the current thread.
  Actual: it does.

@m-kuhn
Copy link
Contributor Author

m-kuhn commented Feb 24, 2025

Last successful build here: https://github.com/apache/arrow-adbc/actions/runs/13381287796/job/37370182406#step:8:53

-- Found SQLite3: /home/runner/miniconda3/envs/test/include (found version "3.48.0")

vs this one
https://github.com/apache/arrow-adbc/actions/runs/13493324155/job/37705363691?pr=2546#step:8:53

-- Found SQLite3: /home/runner/miniconda3/envs/test/include (found version "3.49.1")

@m-kuhn
Copy link
Contributor Author

m-kuhn commented Feb 24, 2025

Note to self: the nanoarrow target is linked to various drivers in an unnamespaced version, this should probably be done once in the common target next to nanoarrow::nanoarrow, this might reduce the overall footprint of this dependency in variou CMakeList.txt files.

I'll continue with this cleanup effort once the tests are restored.

P.S. it might be as easy as renaming the target here to unify this with the namespaced version

add_library(
nanoarrow
STATIC
nanoarrow.c
)
set_target_properties(
nanoarrow PROPERTIES POSITION_INDEPENDENT_CODE ON
)

@lidavidm
Copy link
Member

I see this is breaking elsewhere so let me investigate, thanks

@lidavidm
Copy link
Member

Appears to be #2555

@lidavidm
Copy link
Member

Alright, hopefully if you rebase these problems should be fixed!

@m-kuhn m-kuhn closed this Feb 25, 2025
@m-kuhn m-kuhn reopened this Feb 25, 2025
@lidavidm
Copy link
Member

Filed #2557 to track adding the CI config

@m-kuhn
Copy link
Contributor Author

m-kuhn commented Feb 25, 2025

@lidavidm any idea for the last failing test?
Please don't merge yet, I have some last adjustments to make

@lidavidm
Copy link
Member

The last failing test is known flaky (I need to go in and disable ASAN for Meson or otherwise figure out how to replicate it...)

@m-kuhn
Copy link
Contributor Author

m-kuhn commented Feb 26, 2025

The last failing test is known flaky (I need to go in and disable ASAN for Meson or otherwise figure out how to replicate it...)

For the record, LD_PRELOAD of libSegFault might help to give a stacktrace

@WillAyd
Copy link
Contributor

WillAyd commented Feb 26, 2025

I believe the ASAN issue is an upstream bug, at least according to:

llvm/llvm-project#87324

It is confirmed to be flaky, so not sure how much luck we will have with a reproducer.

Does it just affect Dremio? I wonder if we should have the ASAN test run exclude that driver for now. I'd personally hate to disable the entire ASAN suite if the scope of the issue can be limited to one driver

@lidavidm
Copy link
Member

Looks like this is green again - is this OK to re-review/merge?

@m-kuhn
Copy link
Contributor Author

m-kuhn commented Feb 27, 2025

Yes , good from my side

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paleolimbot seems we should upstream this change?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this CMakeLists.txt lives in ADBC...is the suggestion that nanoarrow should include a CMakeLists.txt with the bundled nanoarrow.c/h? It may also be time to just use CMake install or build-time dependency for nanoarrow (since our existing CMakeLists.txt now supports that and is tested against a number of scenarios).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry, I see then.

@lidavidm lidavidm merged commit 15fb244 into apache:main Mar 3, 2025
66 of 67 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants