@achow101
Member

@achow101 achow101 commented Apr 30, 2024

The DNS seeder that I wrote collects statistics on node reliability in the same way that sipa's seeder does, and also outputs this information in the same file format. Thus it can also be used in our fixed seeds update scripts. My seeder additionally crawls onion v3, i2p, and cjdns, so we will now be able to set those fixed seeds automatically rather than curating manual lists.

In doing this update, I've found that makeseeds.py is missing newer versions from the regex as well as cjdns support; both of these have been updated.

I also noticed that the testnet fixed seeds are all manually curated and sipa's seeder does not appear to publish any testnet data. Since I am also running the seeder for testnet, I've added the commands to generate testnet fixed seeds from my seeder's data too.

Lastly, I've updated all of the fixed seeds. However, since my seeder has not found any cjdns nodes that met the reliability criteria (possibly due to connectivity issues present in those networks), I've left the previous manual seeds for that network.

@DrahtBot
Contributor

DrahtBot commented Apr 30, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

| Type | Reviewers |
| --- | --- |
| ACK | fjahr, virtu |
| Concept ACK | jaonoctus |

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #30695 ([WIP] seeds: Add additional seed source and bump uptime requirements for Onion and I2P nodes by virtu)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@laanwj laanwj added the P2P label Apr 30, 2024
Member

It looks like something odd happened to the manual onion and i2p seeds. Only a small range of first letters were present, and seeds run by colleagues and the bitcoin community were no longer present.

Member Author

@achow101 achow101 Apr 30, 2024

They were updated in #29561, and the addresses were pulled from my node's addrman. Some sorting happened somewhere, and because makeseeds.py doesn't shuffle (this PR adds a commit that does that), when it applied the max node count, it ended up with the tail end of that list.

There seem to be sufficient i2p and onion nodes now that there is no need to specifically include nodes run by known people. We don't do this for IPv4 or IPv6.

Member

Yes, I see that your script removed my onion and i2p nodes then and also here.

@achow101 achow101 force-pushed the my-seeder-fixed-seeds branch 2 times, most recently from f6bc14a to 94e99a9 Compare May 14, 2024 03:44
@achow101
Member Author

achow101 commented May 14, 2024

My seeder has now found several i2p nodes, so I've gone ahead and removed the manually curated ones for both mainnet and testnet. These are now filled in by the script. The only manual seeds remaining are for cjdns. My seeder has also found cjdns nodes and they have been added, but there are only a couple of them.

Update the user agent regex to match all 3 digits of the version number,
not just the first 2 digits.

Also updates it to include 24.2, 25.2, 26.1, 27.0, 27.1, 27.99, 28.0 and
28.99.
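
The difference between checking two and three version components can be sketched as follows. Both patterns are hypothetical and written only for illustration; they are not the expressions used in makeseeds.py.

```python
import re

# Hypothetical patterns, for illustration only (not the makeseeds.py regexes).
# Loose: checks "major.minor" as a prefix and ignores whatever follows.
LOOSE = re.compile(r"^/Satoshi:27\.(0|1)")
# Strict: requires all three components and the closing slash.
STRICT = re.compile(r"^/Satoshi:27\.(0|1)\.\d+/$")


def accepts(pattern, user_agent):
    """Return True if the pattern matches the user agent string."""
    return pattern.match(user_agent) is not None


# "27.10.0" is not one of the listed versions, but the loose prefix match
# accepts it because "27.1" is a prefix of "27.10".
print(accepts(LOOSE, "/Satoshi:27.10.0/"))
print(accepts(STRICT, "/Satoshi:27.10.0/"))
print(accepts(STRICT, "/Satoshi:27.1.0/"))
```

Anchoring the pattern at both ends and spelling out every component is what prevents the prefix match from letting unintended versions through.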

The crawlers are not guaranteed to output nodes in a random order, so
shuffle the ips list after parsing to break any biasing that may be
caused by the output order.
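
A minimal sketch of the idea in this commit message, assuming the parsed entries are held in a plain list. The function name `select_seeds` and its parameters are made up for the example and do not come from makeseeds.py.

```python
import random


def select_seeds(ips, max_count, rng=random):
    """Shuffle a copy of the parsed list, then cap it at max_count entries.

    Hypothetical helper: without the shuffle, truncating a sorted input
    would always keep the same end of the ordering.
    """
    shuffled = list(ips)      # leave the caller's list untouched
    rng.shuffle(shuffled)
    return shuffled[:max_count]


# Usage with placeholder names standing in for a sorted crawler output.
sorted_input = [f"node{i:03d}" for i in range(100)]
picked = select_seeds(sorted_input, 10, random.Random(1))
print(len(picked))
```

Passing a seeded `random.Random` instance makes the selection reproducible, which can help when comparing runs.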
@achow101 achow101 force-pushed the my-seeder-fixed-seeds branch from 6f915b9 to 7f55140 Compare August 14, 2024 17:22
@achow101
Member Author

achow101 commented Aug 14, 2024

Updated for testnet4. I don't think there are any seeders publishing testnet4 data yet, so I've just used the fixed seeds in chainparamsseeds.h and turned them into a nodes_testnet4.txt, in addition to adding instructions.

Also refreshed the seeds for pre-28.0.

@achow101 achow101 added this to the 28.0 milestone Aug 14, 2024
Contributor

@fjahr fjahr left a comment

tACK 7f55140007186cda876ad0a5da812e391cddbcc4

I reviewed the code and the changes look correct to me. I tested the updated instructions in the README and they work as expected (though see my comment on the testnet4 line). The resulting files showed some reasonable differences from the files included here, which is expected. I also confirmed that generating chainparamsseeds.h from the txt files included here yields the same result.

The seeders now produce onion and i2p seeds, so there is no need to keep these
in the manual list.

Although cjdns seeds should also be produced, there are not enough
good ones detected by the seeder, so we keep the manual seeds for them.

Update the fixed seeds for both mainnet and testnet
@achow101 achow101 force-pushed the my-seeder-fixed-seeds branch from 7f55140 to d8fd1e0 Compare August 16, 2024 15:29
@fjahr
Contributor

fjahr commented Aug 17, 2024

@virtu
Contributor

virtu commented Aug 19, 2024

ACK 41ad84a

Reviewed the code; changes look fine. Also in favor of using the regularly-updated asmap from collaborative runs.

I noticed the seeds.txt.gz file I used (~2024-08-19T08:30Z) did not contain any good I2P nodes.

Also, since I've been working on exporting my crawler's results as well, I noticed the Onion node numbers seem rather low. Here are some statistics that I generated by skipping the final stage of the makeseeds.py script (so as to not apply the per-AS and per-network limit, thus retaining all viable nodes), applying the script to all input sources individually, and comparing the resulting addresses.

| Network | Addresses (before limits) | Overlap |
| --- | --- | --- |
| IPv4 | sipa=1714, ava=2886, virtu=4554 | sipa-ava=704, sipa-virtu=1684, ava-virtu=2355, sipa-ava-virtu=687 |
| IPv6 | sipa=469, ava=583, virtu=1203 | sipa-ava=178, sipa-virtu=460, ava-virtu=523, all=176 |
| Onion | ava=443, virtu=11384 | ava-virtu=388 |
| CJDNS | ava=2, virtu=9 | ava-virtu=2 |
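
Overlap figures of this kind can be computed with set intersections. The sketch below is not the procedure used for the numbers above; the function name, the source labels and the addresses (taken from documentation ranges) are placeholders.

```python
from itertools import combinations


def overlap_counts(sources):
    """Map each combination of two or more source names to the number of
    addresses that every source in the combination reports."""
    counts = {}
    names = sorted(sources)
    for size in range(2, len(names) + 1):
        for combo in combinations(names, size):
            shared = set.intersection(*(set(sources[n]) for n in combo))
            counts["-".join(combo)] = len(shared)
    return counts


# Placeholder data, not real crawler output.
example = {
    "a": ["192.0.2.1:8333", "192.0.2.2:8333", "192.0.2.3:8333"],
    "b": ["192.0.2.2:8333", "192.0.2.3:8333"],
    "c": ["192.0.2.3:8333", "192.0.2.4:8333"],
}
print(overlap_counts(example))
```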

@achow101
Member Author

> I noticed the seeds.txt.gz file I used (~2024-08-19T08:30Z) did not contain any good I2P nodes.

I think it got disconnected from I2P and started marking every I2P node as down, when in fact it was the crawler's own I2P connection that was down. That's something I'll need to fix.

@jaonoctus jaonoctus left a comment

utACK

@achow101 achow101 merged commit 37cdb5f into bitcoin:master Aug 26, 2024
@virtu
Contributor

virtu commented Aug 27, 2024

just rebased #30695 since #30008 got merged and noticed it accidentally removes all hardcoded onion and i2p seeds in src/chainparamsseeds.h (and seeds_main.txt).

[edited for clarity]

@fjahr
Contributor

fjahr commented Aug 27, 2024

> it accidentally removes all hardcoded onion and i2p seeds

Wasn't that the plan? From the description:

> My seeder additionally crawls onion v3, i2p, and cjdns, so will now be able to set those fixed seeds automatically rather than curating manual lists.

@virtu
Contributor

virtu commented Aug 27, 2024

> > it accidentally removes all hardcoded onion and i2p seeds
>
> Wasn't that the plan? From the description:
>
> > My seeder additionally crawls onion v3, i2p, and cjdns, so will now be able to set those fixed seeds automatically rather than curating manual lists.

"hardcoded" was a poor choice, I guess. Updated my OP to avoid further confusion.

I wasn't referring to the removal of the manually curated addresses in nodes_main_manual.txt but to the fact that there won't be any Onion or I2P seeds hardcoded into the binary at all (via src/chainparamsseeds.h), which I believe is an accident.

@achow101
Member Author

> I wasn't referring to the removal of the manually curated addresses in nodes_main_manual.txt but to the fact that there won't be any Onion or I2P seeds hardcoded into the binary at all (via src/chainparamsseeds.h), which I believe is an accident.

Yes, just noticed that too.

I've added #30695 to the milestone to get those added back in.

achow101 added a commit that referenced this pull request Aug 27, 2024
…rements for Onion and I2P nodes

b061b35 seeds: Regenerate mainnet seeds (virtu)
02dc45c seeds: Pull nodes from Luke's seeder (virtu)
7a2068a seeds: Pull nodes from virtu's crawler (virtu)

Pull request description:

  This builds on #30008 and adds data [exported](https://github.com/virtu/seed-exporter) by [my crawler](https://github.com/virtu/p2p-crawler) as an additional source for seed nodes. Data covers all supported network types.

  [edit: Added Luke's seeder as input as well.]

  ### Motivation
  - Further decentralizes the seed node selection process (in the long term potentially enabling an _n_-source threshold for nodes to prevent a single source from entering malicious nodes)
  - No longer need to manually curate seed node list for any network type: See last paragraph of OP in #30008. My crawler has been [discovering the handful of available cjdns nodes](https://21.ninja/reachable-nodes/nodes-by-net-type/) for around two months, all but one of which meet the reliability criteria.
  - Alignment of the uptime requirement for Onion and I2P nodes with that of clearnet nodes (50%): If I'm reading the code correctly, seeders appear to optimize for up-to-dateness by using [lower connection timeouts](https://github.com/achow101/dnsseedrs/blob/3c1a63c6723819871d76fe0fbd2155fe5a5bb171/src/crawl.rs#L349) than [Bitcoin Core](https://github.com/bitcoin/bitcoin/blob/bc87ad98543299e1990ee1994d0653df3ac70093/src/netbase.cpp#L40C27-L40C48) to maximize throughput. Since my crawler does not have the same timeliness requirements, it opts for accuracy by using generous timeouts. As a result, its data contains additional eligible Onion (and other darknet) nodes, as shown in the histogram below. Around 4500 Onion nodes are discovered so far (blue); my data adds ~6400 more (orange); ~1500 nodes take longer than the default 20-second Bitcoin Core timeout and won't qualify as "good".

  ![Connection time histogram for Onion nodes](https://github.com/user-attachments/assets/c3513604-aa48-4c75-b51d-13421eaed9eb)
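
The _n_-source threshold mentioned under Motivation is described only as a long-term possibility. A minimal sketch of what such a filter could look like, assuming each source is reduced to a list of address strings; the function name and the sample data are hypothetical and nothing like this is part of the PR:

```python
from collections import Counter


def nodes_meeting_threshold(sources, n):
    """Return, sorted, the addresses reported by at least n distinct sources.

    Hypothetical helper illustrating an n-source threshold.
    """
    seen = Counter()
    for addresses in sources.values():
        seen.update(set(addresses))   # count each source at most once
    return sorted(addr for addr, count in seen.items() if count >= n)


# Placeholder data: only the address seen by two sources passes n=2.
sample = {
    "source_a": ["192.0.2.1:8333", "192.0.2.2:8333"],
    "source_b": ["192.0.2.2:8333", "192.0.2.2:8333"],
    "source_c": ["192.0.2.3:8333"],
}
print(nodes_meeting_threshold(sample, 2))
```

With such a filter, a single source could not introduce an address on its own once `n` is at least 2.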

  Here are the current results with 512 nodes for all networks except cjdns:
  <details>
  <summary>Using the extra data</summary>

  ```
  IPv4   IPv6  Onion  I2P    CJDNS Pass
  10335   2531  11545   1589     10 Initial
  10335   2531  11545   1589     10 Skip entries with invalid address
  5639   1431  11163   1589      8 After removing duplicates
  5606   1417  11163   1589      8 Enforce minimal number of blocks
  5606   1417  11163   1589      8 Require service bit 1
  4873   1228  11163   1589      8 Require minimum uptime
  4846   1225  11161   1588      8 Require a known and recent user agent
  4846   1225  11161   1588      8 Filter out hosts with multiple bitcoin ports
  512    512    512    512      8 Look up ASNs and limit results per ASN and per net
  ```
  </details>
  <details>
  <summary>Before</summary>

  ```
  IPv4   IPv6  Onion  I2P    CJDNS Pass
  5772   1323    443      0      2 Initial
  5772   1323    443      0      2 Skip entries with invalid address
  4758   1110    443      0      2 After removing duplicates
  4723   1094    443      0      2 Enforce minimal number of blocks
  4723   1094    443      0      2 Require service bit 1
  3732    867    443      0      2 Require minimum uptime
  3718    864    443      0      2 Require a known and recent user agent
  3718    864    443      0      2 Filter out hosts with multiple bitcoin ports
   512    409    443      0      2 Look up ASNs and limit results per ASN and per net
  ```
  </details>

  ### To dos
  - [x] Remove manual nodes and update README
  - [x] Mark nodes with connection times exceeding Bitcoin Core's default as bad in [exporter](https://github.com/virtu/seed-exporter): [done](virtu/seed-exporter#12)
  - [x] Regenerate mainnet seeds
  - [x] Rebase, then remove WIP label once #30008 gets merged

ACKs for top commit:
  achow101:
    ACK b061b35
  fjahr:
    utACK b061b35

Tree-SHA512: 63e86220787251c7e8d2d5957bad69352e19ae17d7b9b2d27d8acddfec5bdafe588edb68d77d19c57f25f149de723e2eeadded0c8cf13eaca22dc33bd8cf92a0
glozow added a commit that referenced this pull request Mar 11, 2025
…blocks, fixed seeds

f0b6597 seeds: update .gitignore with signet and testnet4 (Jon Atack)
48f07ac chainparams: remove hardcoded signet seeds (Jon Atack)
d4ab115 chainparams: add signet fixed seeds if default network (Jon Atack)
49f155e seeds: update fixed dns seeds (Jon Atack)
2366870 makeseeds: regex improvements (Lőrinc)
98f84d6 generate-seeds: update and add signet (Jon Atack)
c4ed23e seeds: add testnet4 seeds (Jon Atack)
60f17dd seeds: add signet seeds (Jon Atack)
2bcccaa makeseeds: align I2P column header (Jon Atack)
94e21aa makeseeds: update MIN_BLOCKS, add reminder to README (Jon Atack)
6ae7a3b makeseeds: update user agent regex (Jon Atack)
9b0d2e5 makeseeds: fix incorrect regex (laanwj)

Pull request description:

  In `makeseeds.py`:
  - fix the user agent regex (by laanwj)
  - fix the I2P column header spacing
  - update the regex (it was also not updated for the previous release)
  - update `MIN_BLOCKS` (4320 blocks/month * ~6.5 months) and add README documentation to remember to update it
  - further robustness/standardness/consistency improvements to the regexes (by l0rinc)
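
The `MIN_BLOCKS` arithmetic quoted above can be checked with a short calculation. The sketch assumes the usual target of one block per ten minutes and a 30-day month, which is what gives 4320 blocks per month; the helper name is made up, and the resulting `MIN_BLOCKS` value itself is not stated here, so none is given.

```python
BLOCKS_PER_DAY = 6 * 24                  # one block per ~10 minutes
BLOCKS_PER_MONTH = BLOCKS_PER_DAY * 30   # 4320, the figure in the description


def blocks_in(months):
    """Approximate number of blocks produced over the given number of months."""
    return round(BLOCKS_PER_MONTH * months)


# The ~6.5 months quoted in the description correspond to this many blocks.
print(blocks_in(6.5))
```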

  Add signet and testnet4 seeds to the README and to `generate-seeds.py`

  Update the fixed seeds in `src/chainparamsseeds.h`

  In `kernel/chainparams.cpp`:
  - add signet fixed seeds if default network
  - remove hardcoded signet seeds

  Update `contrib/seeds/.gitignore` with signet and testnet4

  The previous 2 seeds updates were #30008 and #30695.

  mainnet:
  ```
  $ contrib/seeds$ python3 makeseeds.py -a asmap-filled.dat -s seeds_main.txt > nodes_main.txt

  Loading asmap database "asmap-filled.dat"…Done.
  Loading and parsing DNS seeds…Done.
    IPv4   IPv6  Onion    I2P  CJDNS Pass
   17252   3630  21079   3095     12 Initial
   17252   3630  21079   3095     12 Skip entries with invalid address
    8444   1742  14607   2330     10 After removing duplicates
    8194   1691  14321   2102     10 Enforce minimal number of blocks
    7838   1578  14321   2102     10 Require service bit 1
    6802   1326  14321   2102     10 Require minimum uptime
    6762   1321  14320   2102     10 Require a known and recent user agent
    6762   1321  14320   2102     10 Filter out hosts with multiple bitcoin ports
     512    485    512    512     10 Look up ASNs and limit results per ASN and per net
  ```
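
The last pass in the table, limiting results per ASN and per network, can be sketched as below. This is an illustration under assumptions, not the makeseeds.py implementation: the tuple layout, the function name and the way the two limits interact are all hypothetical, and the addresses and AS numbers come from documentation ranges.

```python
from collections import defaultdict


def limit_per_asn_and_net(nodes, max_per_asn, max_per_net):
    """Keep at most max_per_asn entries per (net, asn) pair and at most
    max_per_net entries per network, preserving input order.

    nodes: iterable of (address, net, asn) tuples that passed earlier filters.
    """
    per_net = defaultdict(int)
    per_asn = defaultdict(int)
    kept = []
    for address, net, asn in nodes:
        if per_net[net] >= max_per_net or per_asn[(net, asn)] >= max_per_asn:
            continue
        per_net[net] += 1
        per_asn[(net, asn)] += 1
        kept.append((address, net, asn))
    return kept


# Placeholder input: the third IPv4 entry exceeds the per-ASN limit of 2.
candidates = [
    ("192.0.2.1", "ipv4", 64500),
    ("192.0.2.2", "ipv4", 64500),
    ("192.0.2.3", "ipv4", 64500),
    ("198.51.100.1", "ipv4", 64501),
    ("2001:db8::1", "ipv6", 64500),
]
print(limit_per_asn_and_net(candidates, max_per_asn=2, max_per_net=3))
```

Because entries are taken in input order, the result depends on how the list was ordered beforehand.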

  signet:
  ```
  $ contrib/seeds$ python3 makeseeds.py -a asmap-filled.dat -s seeds_signet.txt -m 237800 > nodes_signet.txt

  Loading asmap database "asmap-filled.dat"…Done.
  Loading and parsing DNS seeds…Done.
    IPv4   IPv6  Onion    I2P  CJDNS Pass
     110     47     63      9      4 Initial
     110     47     63      9      4 Skip entries with invalid address
     110     47     63      9      4 After removing duplicates
      83     31     58      9      4 Enforce minimal number of blocks
      83     31     58      9      4 Require service bit 1
      83     31     57      9      4 Require minimum uptime
      83     31     57      9      4 Require a known and recent user agent
      83     31     57      7      4 Filter out hosts with multiple bitcoin ports
      42     30     57      7      4 Look up ASNs and limit results per ASN and per net
  ```

  testnet:
  ```
  $ contrib/seeds$ python3 makeseeds.py -a asmap-filled.dat -s seeds_test.txt > nodes_test.txt

  Loading asmap database "asmap-filled.dat"…Done.
  Loading and parsing DNS seeds…Done.
    IPv4   IPv6  Onion    I2P  CJDNS Pass
     204     73     96     11      5 Initial
     204     73     96     11      5 Skip entries with invalid address
     204     73     96     11      5 After removing duplicates
     204     73     96     11      5 Enforce minimal number of blocks
     204     73     96     11      5 Require service bit 1
     195     69     87      9      5 Require minimum uptime
     193     69     87      9      5 Require a known and recent user agent
     193     69     87      9      5 Filter out hosts with multiple bitcoin ports
      79     39     87      9      5 Look up ASNs and limit results per ASN and per net
  ```

  testnet4:
  ```
  $ contrib/seeds$ python3 makeseeds.py -a asmap-filled.dat -s seeds_testnet4.txt -m 72600 > nodes_testnet4.txt

  Loading asmap database "asmap-filled.dat"…Done.
  Loading and parsing DNS seeds…Done.
    IPv4   IPv6  Onion    I2P  CJDNS Pass
     149    115     69     11      4 Initial
     149    115     69     11      4 Skip entries with invalid address
     149    115     69     11      4 After removing duplicates
     104     75     52      7      4 Enforce minimal number of blocks
     104     75     52      7      4 Require service bit 1
     100     73     51      7      4 Require minimum uptime
     100     73     51      7      4 Require a known and recent user agent
     100     73     51      7      4 Filter out hosts with multiple bitcoin ports
      43     46     51      7      4 Look up ASNs and limit results per ASN and per net
  ```

ACKs for top commit:
  l0rinc:
    I have mostly reviewed the regexes, for the rest it's only a very lightweight ACK f0b6597
  achow101:
    ACK f0b6597
  laanwj:
    re-ACK f0b6597

Tree-SHA512: 86f4ea247469dbb3f131f2de884e470fbf93f399744d4854fcc26511afafcec231d7eaed37f8564244bc64d917d130b314d948aa97b13020613f8e186c70e368
@bitcoin bitcoin locked and limited conversation to collaborators Aug 27, 2025