p2p: Remove fGetAddr flag #19794

mzumsande · 2020-08-24T21:27:50Z

fGetAddr is an old flag that indicates an ongoing GETADDR/ADDR interaction. Introduced in dd51920, it prevents further relay of received ADDR answers to other peers via the RelayAddress() gossip mechanism while set - probably to prevent ADDR traffic from being dominated by large GETADDR answers.

It also used to be possible to receive a large GETADDR answer spread over multiple ADDR messages (each bounded to max 1000 addresses), so the condition if (vAddr.size() < 1000) pfrom.fGetAddr = false; was meant to ensure that fGetAddr would only get cleared once the last ADDR of the package was received (typically with < 1000 addresses).

However, fGetAddr currently does not work as originally intended and is no longer useful:

typically, the first ADDR we receive from a peer after sending the GETADDR is not the GETADDR response but the self-announcement of our peer. If this is the case, fGetAddr acts on it (preventing its relay) and is already inactivated once the GETADDR response is received.
5cbf753 introduced the condition vAddr.size() <= 10, so messages with more than 10 elements won't be relayed further regardless.
since Limit the number of addressses for broadcast to accumulate #5419, we do not get more than one ADDR message in response to GETADDR because vAddrToSend cannot have more than 1000 elements.

Removing the flag will cause initial self-announcement from outbound peers to be gossip-relayed.

Thanks to jnewbery for help with reconstructing the history of fGetAddr!

[Outdated, but left in for context]
Removing the flag will change behaviour in two special cases:
1.) If the GETADDR response is exactly of size 1000, fGetAddr will not not be cleared on current master, so that the next "regular" ADDR message by this peer won't be relayed - this does not seem intentional, changing it arguably fixes a bug.
2.) If the response to GETADDR is small (<=10 addresses), it will be relayed to peers just as any other ADDR message of that size.

jnewbery · 2020-08-24T21:54:57Z

concept ACK

DrahtBot · 2020-08-25T07:20:56Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

p2p: Unify Send and Receive protocol versions #17785 (p2p: Unify Send and Receive protocol versions by hebasto)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

jnewbery

Looks good. Just two minor style comments inline.

You should update the commit log to include your excellent description of why it's ok to remove this.

src/net_processing.cpp

mzumsande · 2020-08-25T20:32:23Z

Thanks for the review! I pushed an update, implementing both style suggestions and copying the PR description into the commit message.

naumenkogs · 2020-08-26T11:56:58Z

ACK ef9855580098654cdfc5844f8922518e6602878f.

It's good to get rid of the useless code. Even though it's only a couple lines, the burden of this variable is massive: I spent 10 minutes figuring what does it do, even after reading the PR description.

I tried to make sure we don't miss any corner cases, and I think this PR works correctly.

Regarding your commit message:

2.) If the response to GETADDR is small (<=10 addresses), it will be relayed to peers just as any other ADDR message of that size.

I'm not sure it's a bug in master. I always thought of <=10 check as something to limit the forwarding of unsolicited ADDRs. Now it may also forward a solicited ADDR. May it leak privacy somehow? I guess not, because we don't send them out immediately anyway.

As a side note, is this possible?

Victim node V starts and connects to their peers
An attacker spams V with a lot of unsolicited ADDR messages (of size <10 but from different Sybils) to make sure all ADDRs V sends to their peers is of size 11 at least.
V's self-announcements (and other unsolicited ADDRs from V) are never forwarded by their peers, so only immediate peers ever learn about V.

This PR doesn't address this issue as a whole, but it removes one out of two mechanisms making this possible I believe :)
I think we should look further and see if we can fix it in full, perhaps in another PR.

mzumsande · 2020-08-26T22:22:38Z

I'm not sure it's a bug in master. I always thought of <=10 check as something to limit the forwarding of unsolicited ADDRs. Now it may also forward a solicited ADDR. May it leak privacy somehow? I guess not, because we don't send them out immediately anyway.

I just wanted to note that it is a small change in behaviour, not sure if it is a bug. In any case, I don't think that this will occur often, because with MAX_PCT_ADDR_TO_SEND = 23 it would mean that an honest peer would need to have a really small addrman with less than 50 addresses to send such a small response - usually, GETADDR responses should be considerably larger.

Regarding your scenario:
I think this is a very interesting scenario and probably worth addressing (V's address could still be part of a GETADDR response though, even if it not forwarded).
I don't see though how this PR helps with this even a little. On current master, fGetAddr is set in the VERSION processing with an outbound peer, and immediately unset as soon as we get a single ADDR with size < 1000 from the peer (and can never be set again for the duration of the connection) - while your scenario entails repeated messages.

laanwj · 2020-08-27T11:36:06Z

Though I had an idea what it was for, I don't think I've ever thoroughly understood this flag. Thanks for the explanation and ACK on removing it.
ACK ef9855580098654cdfc5844f8922518e6602878f

jnewbery · 2020-09-01T08:47:51Z

ACK the code changes in ef9855580098654cdfc5844f8922518e6602878f

If you touch this again, could you line-wrap the body of your commit log at 80 chars? (also no need to credit me in the commit log - this is your work!)

amitiuttarwar · 2020-09-03T15:21:17Z

strong concept ACK! thank you! that variable is a distraction.

mzumsande · 2020-09-03T18:51:25Z

Rebased and reformatted the commit message as suggested by @jnewbery .

naumenkogs · 2020-09-04T06:55:15Z

ACK d3eaaf4

Another behavior change, or rather an elaboration over "If the response to GETADDR is small (<=10 addresses), it will be relayed to peers just as any other ADDR message of that size."

We just connected to the peer and send them a GETADDR
Before getting GETADDR, they schedule their self-announcement for us by adding their addr to vAddrToSend
Their ADDR sending is triggered (see pto->m_next_addr_send < current_time)
We receive their self-announcement containing only 1 addr and ~~keep fGetAddr=true~~ set fGetAddr=false
They finally receive our GETADDR and schedule their response (can be flipped with (4))

I think this order of events is unlikely (need much luck for (3) to happen before (5)) but possible.

In that case,

before this PR, we would not forward their self-announcement, because fGetAddr=true at the time we receive it (we update it right at the end of processing it)
after this PR, we would forward their self-announcement

I think this is actually an improvement. I don't see any reason to withhold an initial announcement of the peer instead of forwarding it to other nodes.

Even if this had any privacy benefits as a side-effect (e.g., somehow less revealing of new connections), it had a very low chance of working. Maybe this should be a real feature tho.

jnewbery · 2020-09-04T07:39:08Z

ACK d3eaaf4

amitiuttarwar · 2020-09-04T23:23:56Z

ACK d3eaaf4

CI failures look unrelated.

sdaftuar · 2020-09-04T23:49:19Z

since #5419, we do not get more than one ADDR message in response to GETADDR because vAddrToSend cannot have more than 1000 elements.

If I'm understanding you correctly, you're saying that we can't get more than one ADDR message in response because our own software wouldn't do it. But what about other software on the network?

At any rate, I think the reason we have this variable is to track whether an ADDR message is likely in response to a getaddr message we send, in which case we don't want to relay the learned addresses to other peers, or if an addr message is an unsolicited relay from our peer, in which case we also want to try to relay. (The reason to not relay messages with more than 10 entries is to rate limit the addr relay, so that a spammer can't use up all the network's bandwidth by blasting out addr messages.)

I think one downside to relaying getaddr responses is a privacy leak - if you started up with a small addrman, and learned only a small set of nodes to connect to from one peer (and relayed that whole thing to another peer), that other peer might have a pretty good idea of who you're going to be connecting to.

Anyway I'd be hesitant to remove this bool without talking through how we view this scenario, as it seems odd to me to relay addresses that are in response to a getaddr message, since we expect those to be of a very different nature than unsolicited addresses being relayed to us, which you expect to be originating with some peer announcing itself. (I wish the response to a getaddr was a different message type than an unsolicited address being relayed to us!)

amitiuttarwar · 2020-09-05T00:28:47Z

@sdaftuar hmm interesting, I'm trying to evaluate this concern, do I have the scenario right:

Node A starts up, connects to Node B
When Node A receives the VERSION message, it sends a GETADDR message to Node B
Node B has an addrman with < 43 addresses, fulfills the request, sends back 9 addresses randomly selected from its addrman (based on 23 percent calculation)
Node A receives an ADDR message with 9 addresses, passes misc checks in ProcessMessage, invokes RelayAddress, which then sends it out to 1 or 2 nodes? (if I'm understanding the role of nRelayNodes correctly), including Node C

and then your concern is at this point, that Node C is able to deduce private information? the connection between Node A and Node B? or more generally what Node A's options for connections are?

mzumsande · 2020-09-05T07:02:22Z

I just realized that fGetAddr seems to be completely broken/ineffective after setting up a node with extra logging for fGetAddr on mainnet with current master:

The problem is that the first ADDR we receive in the lifetime of a typical connection is the initial self-announcement of our peer! After receiving this ADDR message of size 1, fGetAddr will be cleared - it wasn't active a single time by the time I received the actual GETADDR response from a peer. This is exactly what @naumenkogs described in his earlier comment (#19794 (comment)), however it is not rare because m_next_addr_send is 0 initially.

Therefore, what this flag currently seems to do is prevent initial self-announcements of outbound peers from being relayed to other peers.

I'll update the PR description with above info later this weekend and will also address @sdaftuar's comment - marking the PR as Draft until then.

jnewbery · 2020-09-05T09:38:37Z

I just realized that fGetAddr seems to be completely broken/ineffective after setting up a node with extra logging for fGetAddr on mainnet with current master...

Wow, good observation. So, this flag doesn't even do what it was introduced for. Strong ACK on improving the comments/logic here, but I also still think removing it entirely is fine. Our "don't relay getaddr responses" logic is enforced by the vAddr.size() <= 10 clause.

mzumsande · 2020-09-06T22:02:11Z

To elaborate on my last comment: The typical behavior seems to be that in the first SendMessages() loop after being successfully connected, our peer will self-advertise (because m_next_local_addr_send == 0 initially) and send us an ADDR without further delay (because m_next_addr_send == 0 initially), sending the GETADDR response later. It probably depends on the timing of when the VERACK vs. GETADDR messages are received by our peer, but from my node logs, this is the order of events in almost all cases.

So I think that there are now two separate issues:
1.) Considering that this flag mainly prevents initial self-announcements from being gossip-relayed and seems to have little to no interaction with GETADDR: Should we embrace this unintended change in behavior and rename the flag to fDontRelayInitialSelfAnnouncements or similar? As @naumenkogs above, I don't really see a good reason to do this.

2.) Even though fGetAddr doesn't seem to work as intended, @sdaftuar's privacy concerns still poses the question if there should be some mechanism suppressing relay of GETADDR answers beyond the vAddr.size() <= 10 limit. I think that it should be an extremely rare event for our outbound peer to have an addrman with < 43 addresses. Even starting with an empty addrman and querying some DNS seeds should get us over that limit, which should be no issue after having completed the first GETADDR interaction with a typical outbound peer. So I am not sure how realistic the scenario is.

I think it might be more problematic for privacy sending GETADDR responses < 1000, because we communicate the size of our addrman (via 23% factor) and signal that we might be susceptive to an eclipse attack if our addrman is small.

I'll update the commit message and the PR description (will leave the originally assumed changes in behavior in for now, because they were quoted in this discussion).

`fGetAddr` is an old flag that indicates an ongoing GETADDR/ADDR interaction. Introduced in dd51920, it prevents further relay of received ADDR answers to other peers via the `RelayAddress()` gossip mechanism while set - probably to prevent ADDR traffic from being dominated by large GETADDR answers. It also used to be possible to receive a large GETADDR answer spread over multiple ADDR messages (each bounded to max 1000 addresses), so the condition `if (vAddr.size() < 1000) pfrom.fGetAddr = false;` was meant to ensure that `fGetAddr` would only get cleared once the last ADDR of the package was received (typically with < 1000 addresses). However, `fGetAddr` currently does not work as originally intended and is no longer useful: - typically, the first ADDR we receive from a peer after sending the GETADDR is not the GETADDR response but the self-announcement of our peer. If this is the case, `fGetAddr` acts on it (preventing its relay) and is already inactivated once the GETADDR response is received. - 5cbf753 introduced the condition `vAddr.size() <= 10`, so messages with more than 10 elements won't be relayed further regardless. - since bitcoin#5419, we do not get more than one ADDR message in response to GETADDR because `vAddrToSend` cannot have more than 1000 elements. Removing the flag will cause initial self-announcements from outbound peers to be gossip-relayed.

jnewbery · 2020-09-07T09:07:29Z

ACK 628812a

naumenkogs · 2020-09-08T07:53:01Z

So there are 2 issues pointed out by sdaftuar, I'd consider them separately:

Privacy of a Bitcoin Core node with few AddrMan records making a new connection to a Bitcoin Core node.
I agree with the concern, but according to mzumsande experiments above, the fGetAddr flag is ineffective here, so we already have this problem. The problem probably should be addressed in a different PR. If we can't come up with something better, we always can expand ADDR message to say it's unsolicited (not GETADDR response), I don't see any (privacy) issues with it.
Some other software on the network. The right thing here would be to consider all more or less reasonable scenarios of the communication between our Bitcoin Core node and other software. For example, if that software don't self-advertise at all, it will actually get the privacy benefit described in (1), because fGetAddr flag won't be cleared on initial self-ad. Now, maybe that software relied on this privacy benefit? (I doubt though).

I'm leaning towards reACK, but before that I'd rather us make some efforts to make sure (2) is not a concern (maybe an ML post is sufficient).

naumenkogs · 2020-09-08T09:53:00Z

Actually, fGetAddr as currently working (set to false after initial peer's self-ad) are protecting the initial self-ad from the privacy leak I've pointed out here. But I'd repeat that this feature should be made explicit in a follow-up rather than a random side-effect.

jnewbery · 2020-09-08T10:37:44Z

@naumenkogs , I don't understand this:

We receive their self-announcement containing only 1 addr and keep fGetAddr=true

Why would fGetAddr stay as true? I think it's flipped to false in this scenario.

naumenkogs · 2020-09-08T10:50:48Z

@jnewbery this was a mistake (see I updated it), but the example remains valid.

sdaftuar · 2020-09-10T23:20:07Z

Rather than remove the fGetAddr flag, what do you think about improving our behavior instead on the other end, eg by my suggestion here to delay the initial addr message we send out to inbound peers in order to mitigate the ambiguity between the response to an initial getaddr and the advertising of the peer's own address?

naumenkogs · 2020-09-11T13:57:24Z

@sdaftuar

Rather than remove the fGetAddr flag, what do you think about improving our behavior instead on the other end, eg by my suggestion here to delay the initial addr message we send out to inbound peers in order to mitigate the ambiguity between the response to an initial getaddr and the advertising of the peer's own address?

I'm not 100% persuaded that fGetAddr better stays (the only valid justification I see at the moment is potentially breaking other software?), but I think this idea would make the flag work as intended.

sdaftuar · 2020-09-11T16:10:01Z

I'm not 100% persuaded that fGetAddr better stays (the only valid justification I see at the moment is potentially breaking other software?), but I think this idea would make the flag work as intended.

It seems like it should be a reasonably good guess that if you send someone a getaddr, the next addr you get is probably in response. However, since we send out a getaddr to outbound peers on connection, and because we send out an addr for ourselves to inbound peers, also usually on connection, we've created a natural race condition between nodes running our same software.

Fixing that race condition (usually) for our own software is pretty easy; and that then reduces the question posed by this PR to "is it important to keep track of when an addr message is in response to a getaddr, or not?" And I tend to think that the answer is probably yes; although addr relay is not something specified in any BIPs, it makes sense to me that responses to a getaddr are of a different quality than general addr-relay, so we might as well try to detect those situations when reasonably possible and relay differently.

As for what other software might exist: in addition to other full-node implementations out there, I think the dns seednodes can be used as addr-fetch peers by Bitcoin Core, in some situations? I haven't tried to exercise it carefully, but my understanding is that the seednodes are custom software, which potentially differ from each other and from Bitcoin Core.

mzumsande · 2020-09-11T17:32:29Z

Rather than remove the fGetAddr flag, what do you think about improving our behavior instead on the other end.

In this case, I think that we should change the if (vAddr.size() < 1000) pfrom.fGetAddr = false; condition which, in the typical case of receiving a single GETADDR answer of size 1000, prevents us from relaying the next ADDR message of our peer, even though this one would not be related to the GETADDR interaction at all.

I think of fGetAddr as an heuristic machanism meant to cover the typical case (obviously it does not help at all with adversarial situations), so I think it is justified to adapt it, if the typical case has changed a long time ago - especially if we have a second mechanism (size limit <=10) preventing GETADDR answer relay, which, even though not direct, seems less fragile in many respects compared to a timing-based one.

sdaftuar · 2020-09-11T17:47:51Z

In this case, I think that we should change the if (vAddr.size() < 1000) pfrom.fGetAddr = false; condition which, in the typical case of receiving a single GETADDR answer of size 1000, prevents us from relaying the next ADDR message of our peer, even though this one would not be related to the GETADDR interaction at all.

Yes, I agree that makes sense and better matches what we typically are seeing on the network.

mzumsande · 2020-09-15T19:02:51Z

Ok - I will open a PR in the next days for that. Will close this one and link it from the new one, because it would become a bit confusing with another change in the PR description and discussion referring to old versions.

sipa · 2020-09-15T19:04:35Z

There is some more discussion here, BTW: bitcoin/bips#907

DrahtBot added the P2P label Aug 24, 2020

jnewbery reviewed Aug 25, 2020

View reviewed changes

src/net_processing.cpp Outdated Show resolved Hide resolved

src/net_processing.cpp Outdated Show resolved Hide resolved

DrahtBot mentioned this pull request Aug 25, 2020

p2p: Unify Send and Receive protocol versions #17785

Merged

mzumsande force-pushed the 202008_rm_fgetaddr branch from 32bd4cb to ef98555 Compare August 25, 2020 20:28

DrahtBot mentioned this pull request Aug 28, 2020

[net] Cleanup connection types- followups #19724

Merged

This was referenced Sep 1, 2020

Periodically make block-relay connections and sync headers #19858

Merged

p2p: Improve diversification of new connections #19860

Closed

DrahtBot added the Needs rebase label Sep 3, 2020

mzumsande force-pushed the 202008_rm_fgetaddr branch from ef98555 to d3eaaf4 Compare September 3, 2020 18:48

DrahtBot removed the Needs rebase label Sep 3, 2020

mzumsande marked this pull request as draft September 5, 2020 07:02

mzumsande force-pushed the 202008_rm_fgetaddr branch from d3eaaf4 to 628812a Compare September 6, 2020 22:15

mzumsande marked this pull request as ready for review September 6, 2020 22:16

mzumsande closed this Sep 15, 2020

naumenkogs mentioned this pull request Sep 16, 2020

BIP155: include changes from followup discussions bitcoin/bips#907

Merged

mzumsande mentioned this pull request Apr 21, 2021

test: Extend functional tests for addr relay #21707

Merged

mzumsande mentioned this pull request Jul 9, 2021

p2p: AddrFetch - don't disconnect on self-announcements #22096

Merged

bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022

mzumsande deleted the 202008_rm_fgetaddr branch October 13, 2023 15:19

p2p: Remove fGetAddr flag #19794

p2p: Remove fGetAddr flag #19794

Conversation

mzumsande commented Aug 24, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jnewbery commented Aug 24, 2020

Uh oh!

DrahtBot commented Aug 25, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Conflicts

Uh oh!

jnewbery left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

mzumsande commented Aug 25, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

naumenkogs commented Aug 26, 2020

Uh oh!

mzumsande commented Aug 26, 2020

Uh oh!

laanwj commented Aug 27, 2020

Uh oh!

jnewbery commented Sep 1, 2020

Uh oh!

amitiuttarwar commented Sep 3, 2020

Uh oh!

mzumsande commented Sep 3, 2020

Uh oh!

naumenkogs commented Sep 4, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jnewbery commented Sep 4, 2020

Uh oh!

amitiuttarwar commented Sep 4, 2020

Uh oh!

sdaftuar commented Sep 4, 2020

Uh oh!

amitiuttarwar commented Sep 5, 2020

Uh oh!

mzumsande commented Sep 5, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jnewbery commented Sep 5, 2020

Uh oh!

mzumsande commented Sep 6, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jnewbery commented Sep 7, 2020

Uh oh!

naumenkogs commented Sep 8, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

naumenkogs commented Sep 8, 2020

Uh oh!

jnewbery commented Sep 8, 2020

Uh oh!

naumenkogs commented Sep 8, 2020

Uh oh!

sdaftuar commented Sep 10, 2020

Uh oh!

naumenkogs commented Sep 11, 2020

Uh oh!

sdaftuar commented Sep 11, 2020

Uh oh!

mzumsande commented Sep 11, 2020

Uh oh!

sdaftuar commented Sep 11, 2020

Uh oh!

mzumsande commented Sep 15, 2020

Uh oh!

sipa commented Sep 15, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

mzumsande commented Aug 24, 2020 •

edited

Loading

DrahtBot commented Aug 25, 2020 •

edited

Loading

mzumsande commented Aug 25, 2020 •

edited

Loading

naumenkogs commented Sep 4, 2020 •

edited

Loading

mzumsande commented Sep 5, 2020 •

edited

Loading

mzumsande commented Sep 6, 2020 •

edited

Loading

naumenkogs commented Sep 8, 2020 •

edited

Loading