Skip to content

Conversation

@fjahr
Copy link
Contributor

@fjahr fjahr commented Nov 4, 2023

Depends on #33878. This is the third in a tripled of PRs that implement the necessary changes for embedding of asmap data into the binary. This last part add the initial asmap data, implements the build changes and adds further documentation.

Currently an asmap file needs to be acquired by there user from some location or the user needs to generate one themselves. Then they need to move the file to the right place in datadir or pass the path to the file as -asmap=PATH in order to use the asmap feature. The change here allows for builds to embed asmap data into the bitcoind binary which makes it possible to use the feature without handling of the asmap file by the user. If the user starts bitcoind with -asmap the embedded data will be used for bucketing of nodes.

The data lives in the repository at src/node/data/ip_asn.dat and can be replaced with a new version at any time. The idea is that the data should be updated with every release. By default the data at that location is embedded into the binary but there is also a build option to prevent this (-DWITH_EMBEDDED_ASMAP=OFF). In this case the original behavior of the -asmap option is maintained.

@DrahtBot
Copy link
Contributor

DrahtBot commented Nov 4, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/28792.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK Sjors, jonatack, laanwj, hodlinator
Stale ACK sipa

If your review is incorrectly listed, please copy-paste <!--meta-tag:bot-skip--> into the comment that the bot should ignore.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #33974 (cmake: Check dependencies after build option interaction by hebasto)
  • #33966 (refactor: disentangle miner startup defaults from runtime options by Sjors)
  • #33878 (refactor, docs: Embedded ASMap [2/3]: Refactor asmap internals and add documentation by fjahr)
  • #17783 (common: Disallow calling IsArgSet() on ALLOW_LIST options by ryanofsky)
  • #17581 (refactor: Remove settings merge reverse precedence code by ryanofsky)
  • #17580 (refactor: Add ALLOW_LIST flags and enforce usage in CheckArgFlags by ryanofsky)
  • #17493 (util: Forbid ambiguous multiple assignments in config file by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

LLM Linter (✨ experimental)

Possible typos and grammar issues:

  • contrib/asmapand -> contrib/asmap and [missing space between backtick and "and", which makes the sentence run together and slightly harder to read]
  • creator of used ASMap file -> creator of the used ASMap file OR creator of the ASMap file used [missing article / awkward phrasing that makes the sentence grammatically off]

2025-11-25

@fjahr fjahr mentioned this pull request Nov 4, 2023
4 tasks
@fjahr fjahr force-pushed the 2023-10-asmap-in-source branch 3 times, most recently from b8b09a7 to f9288bc Compare November 4, 2023 23:48
@sipa
Copy link
Member

sipa commented Nov 5, 2023

Would it make sense to only check in the asmap.dat file into the repository, and then have the build system convert it to a .h at compile time (similar to the approach used in the src/test/data directory)? That would mean there is also a directly-manipulable file in the repository (that can be used with asmap-tool etc).

@maflcko
Copy link
Member

maflcko commented Nov 5, 2023

How much data would this add per year to the repo, assuming updates happen?

@sipa
Copy link
Member

sipa commented Nov 5, 2023

@maflcko The asmap file added here is around 1.6 MiB, so assuming we update for every 6 months, probably a bit over 3 MiB per year (gzip compresses asmap files only ~5%).

@luke-jr
Copy link
Member

luke-jr commented Nov 8, 2023

Why is the option of loading it from a file even in dev builds, not considered?

@fjahr
Copy link
Contributor Author

fjahr commented Nov 9, 2023

Would it make sense to only check in the asmap.dat file into the repository, and then have the build system convert it to a .h at compile time (similar to the approach used in the src/test/data directory)? That would mean there is also a directly-manipulable file in the repository (that can be used with asmap-tool etc).

Yeah, that would work as well. I have drafted that approach here: fjahr@0f9109f Details can change of course (like keeping the raw file in init) but if people prefer that approach, I will pull this into here.

@fjahr
Copy link
Contributor Author

fjahr commented Nov 9, 2023

Why is the option of loading it from a file even in dev builds, not considered?

It is considered. Loading from a file is the only option for dev builds if we go the other route mentioned in the description: "The alternative approach is not having the data in the source code but only adding it during the build phase of a release." But this also comes with the aforementioned downsides.

@willcl-ark
Copy link
Member

A general question about this approach: do most contemporary OSes handle loading large amounts of (potentially) unused data into physical memory well-enough, that they would (likely) not load this extra 1.5-3MB of data in the case that the -asmap is not being used? Perhaps this will be handled by the demand paging system? If not, it seems to be a bit of a shame to increase binary size by 20% in all cases.

It seems to make most sense to me to take the path suggested by @sipa, including the .dat file in the repo, and having it converted to a .h at compile time. It might also be nice to have a configuration flag to disable the embedded header creation for faster builds.

@fjahr fjahr force-pushed the 2023-10-asmap-in-source branch from f9288bc to 3bcc1ca Compare December 14, 2023 23:24
@DrahtBot DrahtBot mentioned this pull request Jun 25, 2024
@fjahr fjahr changed the title Embedding ASMap files as binary dump header file Embed default ASMap as binary dump header file Aug 27, 2024
@fjahr fjahr force-pushed the 2023-10-asmap-in-source branch from 3bcc1ca to be1e04a Compare August 28, 2024 22:25
@fjahr fjahr marked this pull request as ready for review August 28, 2024 22:26
@fjahr fjahr force-pushed the 2023-10-asmap-in-source branch from be1e04a to 5167707 Compare August 28, 2024 23:31
@fjahr
Copy link
Contributor Author

fjahr commented Aug 28, 2024

I am picking this back up since we have come a long way in terms of tooling outside of Bitcoin Core and lately I was just waiting for cmake and v28 feature freeze. I will suggest an early merge in the cycle for v29 so more people are encouraged to use one steady file over a longer period of time.

I am now using the suggested approach of building the header file at compile time and I added the functionality for both autotools and cmake (the autotools one has existed for a while, I can also remove it if people think it's unnecessary). Of course this is rebased to make cmake available. First time I am editing the cmake build system so there are probably things to improve... And there is also a command for asmap-tool.py to generate the header file.

I am adding the latest (1724248800) asmap file from asmap-data, the corresponding creation process can be reviewed in the issue and PR.

I have not added a config option to skip the asmap header file but I will look into that next.

@Sjors
Copy link
Member

Sjors commented Aug 29, 2024

I am now using the suggested approach of building the header file at compile time

I think the suggestion was to not include the header itself in that case.

the autotools one has existed for a while, I can also remove it if people think it's unnecessary

Maybe keep it in until #30664 is merged.

@fjahr fjahr force-pushed the 2023-10-asmap-in-source branch from 5167707 to e5a25fd Compare August 29, 2024 18:33
@fjahr
Copy link
Contributor Author

fjahr commented Nov 19, 2025

Rebased since #33026 was merged, of course still based on #33878.

@fjahr
Copy link
Contributor Author

fjahr commented Nov 20, 2025

Only rebased again

Copy link
Contributor

@hodlinator hodlinator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK 1dcd2e8

I do understand the distaste at adding 1MB+ binary blobs into a reproducible build. From that perspective, it would be cleaner to include the ASMap blob as a separate file included in our Bitcoin Core releases.

But having a separate file opens up the can of worms of where to install it on various operating systems and distributions, how to encourage package maintainers to include and ensure it ends up in the right locations, and making Bitcoin Core search an ever changing list of directories to check for ASMap file existence (or requiring always specifying -noasmap or -asmap=/path/on/distro/x/asmap.dat). That would hamper broad ASMap adoption, especially if we want to make it default-on in the future.

Might be good to add a blurb about the consequences of not embedding to the PR description.

Compared build times between WITH_EMBEDDED_ASMAP=OFF/ON.

Generation times unaffected

OFF

$ ccache -C && rm -rf build && time cmake -B build --preset dev-mode -DENABLE_IPC=OFF -DBUILD_GUI=OFF -GNinja -DWITH_EMBEDDED_ASMAP=OFF

real	0m18.115s
real	0m18.091s
real	0m18.027s

ON

$ ccache -C && rm -rf build && time cmake -B build --preset dev-mode -DENABLE_IPC=OFF -DBUILD_GUI=OFF -GNinja -DWITH_EMBEDDED_ASMAP=ON

real	0m18.050s
real	0m17.949s
real	0m17.995s

Compile times fairly unaffected

OFF

$ ccache -C && rm -rf build && cmake -B build --preset dev-mode -DENABLE_IPC=OFF -DBUILD_GUI=OFF -GNinja -DWITH_EMBEDDED_ASMAP=OFF && \
time cmake --build build -t bitcoind

real	2m28.611s
real	2m27.686s
real	2m26.425s

ON

$ ccache -C && rm -rf build && cmake -B build --preset dev-mode -DENABLE_IPC=OFF -DBUILD_GUI=OFF -GNinja -DWITH_EMBEDDED_ASMAP=ON && \
time cmake --build build -t bitcoind

real	2m29.730s
real	2m29.345s
real	2m30.751s
real	2m28.495s
real	2m26.334s

@@ -13,4 +13,4 @@ export PACKAGES="python3-zmq python3-pip clang-17 llvm-17 libc++abi-17-dev libc+
export PIP_PACKAGES="--break-system-packages pycapnp"
export DEP_OPTS="NO_WALLET=1 CC=clang-17 CXX='clang++-17 -stdlib=libc++'"
export GOAL="install"
export BITCOIN_CONFIG="-DREDUCE_EXPORTS=ON -DBUILD_UTIL_CHAINSTATE=ON -DBUILD_KERNEL_LIB=ON -DBUILD_KERNEL_TEST=ON -DBUILD_SHARED_LIBS=ON"
export BITCOIN_CONFIG="-DREDUCE_EXPORTS=ON -DBUILD_UTIL_CHAINSTATE=ON -DBUILD_KERNEL_LIB=ON -DBUILD_KERNEL_TEST=ON -DBUILD_SHARED_LIBS=ON -DWITH_EMBEDDED_ASMAP=NO"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description nit:

- -DWITH_EMBEDDED_ASMAP=NO
+ -DWITH_EMBEDDED_ASMAP=OFF

fjahr and others added 7 commits November 25, 2025 00:24
Calculate the asmap version only in one place: A dedicated function in util/asmap.

The version was also referred to as asmap checksum in several places. To avoid confusion call it asmap version everywhere.
This prevents holding the asmap data in memory twice.

The version hash changes due to spans being serialized without their size-prefix (unlike vectors).
Also makes minor improvement on the python implementation documentation.
This aligns it more with SanityCheckAsmap and reduces variable scope.

Also unify asmap casing in SanityCheckAsmap function name.

Co-authored-by: Hodlinator <[email protected]>
This can be disabled with -DWITH_EMBEDDED_ASMAP=OFF.
@fjahr fjahr force-pushed the 2023-10-asmap-in-source branch from 1dcd2e8 to 4821e4b Compare November 25, 2025 00:59
@fjahr
Copy link
Contributor Author

fjahr commented Nov 25, 2025

Addressed comments and rebased after #33770 was merged including the latest changes in #33878. This PR now necessarily re-adds the possibility to set a bool option for -asmap to make it possible to use asmap without an external file. I have done this a bit differently now and I think it's better.

Re @hodlinator #28792 (comment): I don't think I understand your comment. I am pretty sure I have addressed your previous comment to use OFF but the file was renamed in the meantime which I pulled in here with my previous rebase. This is probably why I can't answer to your comment inline. Can you clarify if this isn't addressed yet somehow or if I am misunderstanding something? Thanks!

@fjahr fjahr force-pushed the 2023-10-asmap-in-source branch from 4821e4b to 89b87c5 Compare November 25, 2025 01:03
@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed.
Task lint: https://github.com/bitcoin/bitcoin/actions/runs/19654544240/job/56288400840
LLM reason (✨ experimental): CI failed due to a lint error from ruff: an f-string without placeholders in the Python tests (feature_asmap).

Hints

Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:

  • Possibly due to a silent merge conflict (the changes in this pull request being
    incompatible with the current code in the target branch). If so, make sure to rebase on the latest
    commit of the target branch.

  • A sanitizer issue, which can only be found by compiling with the sanitizer and running the
    affected test.

  • An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

@hodlinator
Copy link
Contributor

Re @hodlinator ... Can you clarify if this isn't addressed yet somehow or if I am misunderstanding something? Thanks!

Update to the main thread in case others are waiting for me to respond to the above - I responded in the thread: #28792 (comment) and the issue was resolved.

Other reviewers: please have a look at the PR that should be merged ahead of this one: #33878

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.