seeds: Pull additional nodes from my seeder and update fixed seeds #30008
contrib/seeds/nodes_main_manual.txt
It looks like something odd happened to the manual onion and i2p seeds. Only a small range of first letters were present, and seeds run by colleagues and the bitcoin community were no longer present.
They were updated in #29561, and the addresses were pulled from my node's addrman. Some sorting happened somewhere, and because makeseeds.py doesn't shuffle (this PR adds a commit that does that), when it applied the max node count, it ended up with the tail end of that list.
There seem to be sufficient i2p and onion nodes now that there is no need to specifically include nodes run by known people. We don't do this for IPv4 or IPv6.
Yes, I see that your script removed my onion and i2p nodes then and also here.
f6bc14a to 94e99a9
My seeder has now found several i2p nodes, so I've gone ahead and removed the manually curated ones for both mainnet and testnet. These are now filled in by the script. The only manual ones remaining are cjdns. My seeder has also found a couple of cjdns nodes, and those have been added as well.
94e99a9 to 6f915b9
Update the user agent regex to match all 3 digits of the version number, not just the first 2 digits. Also updates it to include 24.2, 25.2, 26.1, 27.0, 27.1, 27.99, 28.0 and 28.99.
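As a rough illustration of the kind of pattern this commit message describes, here is a minimal sketch. It is not the regex from `makeseeds.py`: the name `PATTERN_AGENT`, the helper `is_recent_agent`, and the restriction to only the versions listed above are assumptions for illustration, and it assumes user agents of the form `/Satoshi:MAJOR.MINOR.PATCH/`.

```python
import re

# Hypothetical pattern, limited to the versions named in the commit message.
# Dots are escaped so that "." only matches a literal dot, and a third
# numeric component is required after the major.minor pair.
PATTERN_AGENT = re.compile(
    r"^/Satoshi:("
    r"24\.2|"
    r"25\.2|"
    r"26\.1|"
    r"27\.(0|1|99)|"
    r"28\.(0|99)"
    r")\.\d+"
)


def is_recent_agent(agent: str) -> bool:
    """Return True if the user agent matches one of the accepted versions."""
    return PATTERN_AGENT.match(agent) is not None
```

With this sketch, `/Satoshi:27.1.0/` is accepted, while `/Satoshi:27.10.0/` and `/Satoshi:23.0.0/` are rejected.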
The crawlers are not guaranteed to output nodes in a random order, so shuffle the ips list after parsing to break any biasing that may be caused by the output order.
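To show why the shuffle matters, here is a small sketch of shuffling before applying a maximum node count. The function name `cap_nodes` and the example hostnames are hypothetical; the actual script applies more filters before any limit.

```python
import random


def cap_nodes(ips, max_count, rng=random):
    """Return at most max_count entries, chosen without order bias."""
    shuffled = list(ips)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)  # break any ordering inherited from the crawler
    return shuffled[:max_count]


# Sorted input, as a crawler or an intermediate step might produce it.
nodes = [f"node{i:03d}.example" for i in range(100)]

# Without the shuffle, nodes[:10] would always be the first ten sorted names.
picked = cap_nodes(nodes, 10, random.Random(1))
```

Passing a seeded `random.Random` instance makes the selection reproducible for testing; the default uses the module-level generator.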
6f915b9 to 7f55140
Updated for testnet4. I don't think there are any seeders publishing testnet4 data yet, so I've just used the fixed seeds in chainparamsseeds.h and turned them into a nodes_testnet4.txt, in addition to adding instructions. Also refreshed the seeds for pre-28.0.
tACK 7f55140007186cda876ad0a5da812e391cddbcc4
I reviewed the code and the changes look correct to me. I tested that the updated instructions in the README work as expected and they did (though see my comment on the testnet4 line). The resulting files showed some reasonable differences from the files included here, which is expected. I also confirmed that generating chainparamsseeds.h from the txt files included here yields the same result.
The seeders now produce onion and i2p seeds, so there is no need to keep these in the manual list. Although cjdns seeds should also be produced, there are not enough good ones detected by the seeder, so we keep the manual seeds for them.
Update the fixed seeds for both mainnet and testnet
7f55140 to d8fd1e0
re-ACK 41ad84a Only changes were addressing above (minor) comments: https://github.com/bitcoin/bitcoin/compare/7f55140007186cda876ad0a5da812e391cddbcc4..41ad84a00c20f54b520aab7f6f975231da0ee2d0
ACK 41ad84a Reviewed the code; changes look fine. Also in favor of using the regularly-updated asmap from collaborative runs. I noticed the

Also, since I've been working on exporting my crawler's results as well, I noticed the Onion node numbers seem rather low. Here are some statistics that I generated by skipping the final stage of the
I think it got disconnected from I2P and started marking every I2P node as down, when it was actually the crawler's own I2P connection that was down. That's something I'll need to fix.
jaonoctus
left a comment
utACK
Wasn't that the plan? From the description:
"hardcoded" was a poor choice, I guess. Updated my OP to avoid further confusion. I wasn't referring to the removal of the manually curated addresses in
Yes, just noticed that too. I've added #30695 to the milestone to get those added back in.
…rements for Onion and I2P nodes

b061b35 seeds: Regenerate mainnet seeds (virtu)
02dc45c seeds: Pull nodes from Luke's seeder (virtu)
7a2068a seeds: Pull nodes from virtu's crawler (virtu)

Pull request description:

This builds on #30008 and adds data [exported](https://github.com/virtu/seed-exporter) by [my crawler](https://github.com/virtu/p2p-crawler) as an additional source for seed nodes. Data covers all supported network types.

[edit: Added Luke's seeder as input as well.]

### Motivation

- Further decentralizes the seed node selection process (in the long term potentially enabling an _n_-source threshold for nodes to prevent a single source from entering malicious nodes)
- No longer need to manually curate seed node list for any network type: See last paragraph of OP in #30008. My crawler has been [discovering the handful of available cjdns nodes](https://21.ninja/reachable-nodes/nodes-by-net-type/) for around two months, all but one of which meet the reliability criteria.
- Alignment of uptime requirements for Onion and I2P nodes with those of clearnet nodes to 50%: If I'm reading the code correctly, seeders appear to optimize for up-to-dateness by using [lower connection timeouts](https://github.com/achow101/dnsseedrs/blob/3c1a63c6723819871d76fe0fbd2155fe5a5bb171/src/crawl.rs#L349) than [Bitcoin Core](https://github.com/bitcoin/bitcoin/blob/bc87ad98543299e1990ee1994d0653df3ac70093/src/netbase.cpp#L40C27-L40C48) to maximize throughput. Since my crawler does not have the same timeliness requirements, it opts for accuracy by using generous timeouts. As a result, its data contains additional eligible Onion (and other darknet) nodes, as is shown in the histogram below. Around 4500 Onion nodes are discovered so far (blue); my data adds ~6400 more (orange); ~1500 nodes take longer than the default 20-second Bitcoin Core timeout and won't qualify as "good".
Here's the current results with 512 nodes for all networks except cjdns:

<details>
<summary>Using the extra data</summary>

```
  IPv4   IPv6  Onion    I2P  CJDNS Pass
 10335   2531  11545   1589     10 Initial
 10335   2531  11545   1589     10 Skip entries with invalid address
  5639   1431  11163   1589      8 After removing duplicates
  5606   1417  11163   1589      8 Enforce minimal number of blocks
  5606   1417  11163   1589      8 Require service bit 1
  4873   1228  11163   1589      8 Require minimum uptime
  4846   1225  11161   1588      8 Require a known and recent user agent
  4846   1225  11161   1588      8 Filter out hosts with multiple bitcoin ports
   512    512    512    512      8 Look up ASNs and limit results per ASN and per net
```

</details>

<details>
<summary>Before</summary>

```
  IPv4   IPv6  Onion    I2P  CJDNS Pass
  5772   1323    443      0      2 Initial
  5772   1323    443      0      2 Skip entries with invalid address
  4758   1110    443      0      2 After removing duplicates
  4723   1094    443      0      2 Enforce minimal number of blocks
  4723   1094    443      0      2 Require service bit 1
  3732    867    443      0      2 Require minimum uptime
  3718    864    443      0      2 Require a known and recent user agent
  3718    864    443      0      2 Filter out hosts with multiple bitcoin ports
   512    409    443      0      2 Look up ASNs and limit results per ASN and per net
```

</details>

### To dos

- [x] Remove manual nodes and update README
- [x] Mark nodes with connection times exceeding Bitcoin Core's default as bad in [exporter](https://github.com/virtu/seed-exporter): [done](virtu/seed-exporter#12)
- [x] Regenerate mainnet seeds
- [x] Rebase, then remove WIP label once #30008 gets merged

ACKs for top commit:
  achow101: ACK b061b35
  fjahr: utACK b061b35

Tree-SHA512: 63e86220787251c7e8d2d5957bad69352e19ae17d7b9b2d27d8acddfec5bdafe588edb68d77d19c57f25f149de723e2eeadded0c8cf13eaca22dc33bd8cf92a0
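The final pass in the tables above, "Look up ASNs and limit results per ASN and per net", can be sketched roughly as follows. This is an illustration only, not the code in `makeseeds.py`: the function name, the dict keys `net` and `asn`, and the choice to skip the per-ASN cap when no ASN is known are all assumptions.

```python
from collections import defaultdict


def limit_per_asn_and_net(nodes, max_per_asn, max_per_net):
    """Keep nodes in input order while capping counts per network and per ASN.

    nodes: iterable of dicts with hypothetical keys "net" and "asn".
    An "asn" of None means no ASN is known (for example, an overlay network),
    in which case only the per-network cap applies (an assumption).
    """
    per_asn = defaultdict(int)
    per_net = defaultdict(int)
    result = []
    for node in nodes:
        net, asn = node["net"], node["asn"]
        if per_net[net] >= max_per_net:
            continue
        if asn is not None and per_asn[(net, asn)] >= max_per_asn:
            continue
        per_net[net] += 1
        if asn is not None:
            per_asn[(net, asn)] += 1
        result.append(node)
    return result


sample = (
    [{"net": "ipv4", "asn": 1} for _ in range(3)]
    + [{"net": "ipv4", "asn": 2}]
    + [{"net": "onion", "asn": None} for _ in range(4)]
)
kept = limit_per_asn_and_net(sample, max_per_asn=2, max_per_net=3)
```

Because the caps are applied in input order, an unbiased input order (for instance after a shuffle) matters for which nodes survive this step.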
…blocks, fixed seeds

f0b6597 seeds: update .gitignore with signet and testnet4 (Jon Atack)
48f07ac chainparams: remove hardcoded signet seeds (Jon Atack)
d4ab115 chainparams: add signet fixed seeds if default network (Jon Atack)
49f155e seeds: update fixed dns seeds (Jon Atack)
2366870 makeseeds: regex improvements (Lőrinc)
98f84d6 generate-seeds: update and add signet (Jon Atack)
c4ed23e seeds: add testnet4 seeds (Jon Atack)
60f17dd seeds: add signet seeds (Jon Atack)
2bcccaa makeseeds: align I2P column header (Jon Atack)
94e21aa makeseeds: update MIN_BLOCKS, add reminder to README (Jon Atack)
6ae7a3b makeseeds: update user agent regex (Jon Atack)
9b0d2e5 makeseeds: fix incorrect regex (laanwj)

Pull request description:

In `makeseeds.py`:
- fix the user agent regex (by laanwj)
- fix the I2P column header spacing
- update the regex (it was also not updated for the previous release)
- update `MIN_BLOCKS` (4320 blocks/month * ~6.5 months) and add README documentation to remember to update it
- further robustness/standardness/consistency improvements to the regexes (by l0rinc)

Add signet and testnet4 seeds to the README and to `generate-seeds.py`

Update the fixed seeds in `src/chainparamsseeds.h`

In `kernel/chainparams.cpp`:
- add signet fixed seeds if default network
- remove hardcoded signet seeds

Update `contrib/seeds/.gitignore` with signet and testnet4

The previous 2 seeds updates were #30008 and #30695.

mainnet:

```
$ contrib/seeds$ python3 makeseeds.py -a asmap-filled.dat -s seeds_main.txt > nodes_main.txt
Loading asmap database "asmap-filled.dat"…Done.
Loading and parsing DNS seeds…Done.
  IPv4   IPv6  Onion    I2P  CJDNS Pass
 17252   3630  21079   3095     12 Initial
 17252   3630  21079   3095     12 Skip entries with invalid address
  8444   1742  14607   2330     10 After removing duplicates
  8194   1691  14321   2102     10 Enforce minimal number of blocks
  7838   1578  14321   2102     10 Require service bit 1
  6802   1326  14321   2102     10 Require minimum uptime
  6762   1321  14320   2102     10 Require a known and recent user agent
  6762   1321  14320   2102     10 Filter out hosts with multiple bitcoin ports
   512    485    512    512     10 Look up ASNs and limit results per ASN and per net
```

signet:

```
$ contrib/seeds$ python3 makeseeds.py -a asmap-filled.dat -s seeds_signet.txt -m 237800 > nodes_signet.txt
Loading asmap database "asmap-filled.dat"…Done.
Loading and parsing DNS seeds…Done.
  IPv4   IPv6  Onion    I2P  CJDNS Pass
   110     47     63      9      4 Initial
   110     47     63      9      4 Skip entries with invalid address
   110     47     63      9      4 After removing duplicates
    83     31     58      9      4 Enforce minimal number of blocks
    83     31     58      9      4 Require service bit 1
    83     31     57      9      4 Require minimum uptime
    83     31     57      9      4 Require a known and recent user agent
    83     31     57      7      4 Filter out hosts with multiple bitcoin ports
    42     30     57      7      4 Look up ASNs and limit results per ASN and per net
```

testnet:

```
$ contrib/seeds$ python3 makeseeds.py -a asmap-filled.dat -s seeds_test.txt > nodes_test.txt
Loading asmap database "asmap-filled.dat"…Done.
Loading and parsing DNS seeds…Done.
  IPv4   IPv6  Onion    I2P  CJDNS Pass
   204     73     96     11      5 Initial
   204     73     96     11      5 Skip entries with invalid address
   204     73     96     11      5 After removing duplicates
   204     73     96     11      5 Enforce minimal number of blocks
   204     73     96     11      5 Require service bit 1
   195     69     87      9      5 Require minimum uptime
   193     69     87      9      5 Require a known and recent user agent
   193     69     87      9      5 Filter out hosts with multiple bitcoin ports
    79     39     87      9      5 Look up ASNs and limit results per ASN and per net
```

testnet4:

```
$ contrib/seeds$ python3 makeseeds.py -a asmap-filled.dat -s seeds_testnet4.txt -m 72600 > nodes_testnet4.txt
Loading asmap database "asmap-filled.dat"…Done.
Loading and parsing DNS seeds…Done.
  IPv4   IPv6  Onion    I2P  CJDNS Pass
   149    115     69     11      4 Initial
   149    115     69     11      4 Skip entries with invalid address
   149    115     69     11      4 After removing duplicates
   104     75     52      7      4 Enforce minimal number of blocks
   104     75     52      7      4 Require service bit 1
   100     73     51      7      4 Require minimum uptime
   100     73     51      7      4 Require a known and recent user agent
   100     73     51      7      4 Filter out hosts with multiple bitcoin ports
    43     46     51      7      4 Look up ASNs and limit results per ASN and per net
```

ACKs for top commit:
  l0rinc: I have mostly reviewed the regexes, for the rest it's only a very lightweight ACK f0b6597
  achow101: ACK f0b6597
  laanwj: re-ACK f0b6597

Tree-SHA512: 86f4ea247469dbb3f131f2de884e470fbf93f399744d4854fcc26511afafcec231d7eaed37f8564244bc64d917d130b314d948aa97b13020613f8e186c70e368
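The `MIN_BLOCKS` arithmetic quoted in the description above (4320 blocks/month * ~6.5 months) can be checked with a few lines. This sketch assumes the product is the number of blocks elapsed since the previous update; the names below are hypothetical and the resulting `MIN_BLOCKS` value itself is not stated in the description.

```python
# One block roughly every 10 minutes on average.
BLOCKS_PER_HOUR = 6
BLOCKS_PER_MONTH = BLOCKS_PER_HOUR * 24 * 30  # 4320, as in the description


def blocks_elapsed(months: float) -> int:
    """Approximate number of blocks mined over the given number of months."""
    return round(BLOCKS_PER_MONTH * months)


# Approximate increment for ~6.5 months between seed updates.
increment = blocks_elapsed(6.5)
print(increment)  # 28080
```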
The DNS seeder that I wrote collects statistics on node reliability in the same way that sipa's seeder does, and also outputs this information in the same file format. Thus it can also be used in our fixed seeds update scripts. My seeder additionally crawls onion v3, i2p, and cjdns, so we will now be able to set those fixed seeds automatically rather than curating manual lists.
In doing this update, I've found that `makeseeds.py` is missing newer versions from the regex as well as cjdns support; both of these have been updated.

I also noticed that the testnet fixed seeds are all manually curated and sipa's seeder does not appear to publish any testnet data. Since I am also running the seeder for testnet, I've added the commands to generate testnet fixed seeds from my seeder's data too.
Lastly, I've updated all of the fixed seeds. However, since my seeder has not found any cjdns nodes that met the reliability criteria (possibly due to connectivity issues present in those networks), I've left the previous manual seeds for that network.
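For readers unfamiliar with the network types mentioned in this description, here is a minimal sketch of telling them apart by address shape. It is not how `makeseeds.py` does it; the function name `classify` is hypothetical. It assumes onion v3 names are 56 base32 characters plus `.onion`, I2P names are 52 base32 characters plus `.b32.i2p`, and that any address in `fc00::/8` is cjdns (that range overlaps with IPv6 unique local addresses, so this is a heuristic).

```python
import ipaddress
import re

ONION_V3 = re.compile(r"^[a-z2-7]{56}\.onion$")
I2P_B32 = re.compile(r"^[a-z2-7]{52}\.b32\.i2p$")
# Assumption: treat the whole fc00::/8 range as cjdns.
CJDNS_NET = ipaddress.ip_network("fc00::/8")


def classify(host: str) -> str:
    """Return a network label for a host string, or "unknown"."""
    if ONION_V3.match(host):
        return "onion"
    if I2P_B32.match(host):
        return "i2p"
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return "unknown"
    if addr.version == 4:
        return "ipv4"
    if addr in CJDNS_NET:
        return "cjdns"
    return "ipv6"
```

The host strings here are bare addresses without a port; a real input line would need the port and any IPv6 brackets stripped first.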