Skip to content

Conversation

@mzumsande
Copy link
Contributor

@mzumsande mzumsande commented Aug 24, 2020

fGetAddr is an old flag that indicates an ongoing GETADDR/ADDR interaction. Introduced in dd51920, it prevents further relay of received ADDR answers to other peers via the RelayAddress() gossip mechanism while set - probably to prevent ADDR traffic from being dominated by large GETADDR answers. 

It also used to be possible to receive a large GETADDR answer spread over multiple ADDR messages (each bounded to max 1000 addresses), so the condition if (vAddr.size() < 1000) pfrom.fGetAddr = false; was meant to ensure that fGetAddr would only get cleared once the last ADDR of the package was received (typically with < 1000 addresses).

However, fGetAddr currently does not work as originally intended and is no longer useful:

  • typically, the first ADDR we receive from a peer after sending the GETADDR is not the GETADDR response but the self-announcement of our peer. If this is the case, fGetAddr acts on it (preventing its relay) and is already inactivated once the GETADDR response is received.
  •  5cbf753 introduced the condition vAddr.size() <= 10, so messages with more than 10 elements won't be relayed further regardless.
  • since Limit the number of addressses for broadcast to accumulate #5419, we do not get more than one ADDR message in response to GETADDR because vAddrToSend cannot have more than 1000 elements.

Removing the flag will cause initial self-announcement from outbound peers to be gossip-relayed.

Thanks to jnewbery for help with reconstructing the history of fGetAddr!

[Outdated, but left in for context]
Removing the flag will change behaviour in two special cases:
1.) If the GETADDR response is exactly of size 1000, fGetAddr will not not be cleared on current master, so that the next "regular" ADDR message by this peer won't be relayed - this does not seem intentional, changing it arguably fixes a bug.
2.) If the response to GETADDR is small (<=10 addresses), it will be relayed to peers just as any other ADDR message of that size.

@jnewbery
Copy link
Contributor

concept ACK

@DrahtBot DrahtBot added the P2P label Aug 24, 2020
@DrahtBot
Copy link
Contributor

DrahtBot commented Aug 25, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link
Contributor

@jnewbery jnewbery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Just two minor style comments inline.

You should update the commit log to include your excellent description of why it's ok to remove this.

@mzumsande
Copy link
Contributor Author

mzumsande commented Aug 25, 2020

Thanks for the review! I pushed an update, implementing both style suggestions and copying the PR description into the commit message.

@naumenkogs
Copy link
Contributor

ACK ef9855580098654cdfc5844f8922518e6602878f.

It's good to get rid of the useless code. Even though it's only a couple lines, the burden of this variable is massive: I spent 10 minutes figuring what does it do, even after reading the PR description.

I tried to make sure we don't miss any corner cases, and I think this PR works correctly.

Regarding your commit message:

2.) If the response to GETADDR is small (<=10 addresses), it will be relayed to peers just as any other ADDR message of that size.

I'm not sure it's a bug in master. I always thought of <=10 check as something to limit the forwarding of unsolicited ADDRs. Now it may also forward a solicited ADDR. May it leak privacy somehow? I guess not, because we don't send them out immediately anyway.


As a side note, is this possible?

  1. Victim node V starts and connects to their peers
  2. An attacker spams V with a lot of unsolicited ADDR messages (of size <10 but from different Sybils) to make sure all ADDRs V sends to their peers is of size 11 at least.
  3. V's self-announcements (and other unsolicited ADDRs from V) are never forwarded by their peers, so only immediate peers ever learn about V.

This PR doesn't address this issue as a whole, but it removes one out of two mechanisms making this possible I believe :)
I think we should look further and see if we can fix it in full, perhaps in another PR.

@mzumsande
Copy link
Contributor Author

I'm not sure it's a bug in master. I always thought of <=10 check as something to limit the forwarding of unsolicited ADDRs. Now it may also forward a solicited ADDR. May it leak privacy somehow? I guess not, because we don't send them out immediately anyway.

I just wanted to note that it is a small change in behaviour, not sure if it is a bug. In any case, I don't think that this will occur often, because with MAX_PCT_ADDR_TO_SEND = 23 it would mean that an honest peer would need to have a really small addrman with less than 50 addresses to send such a small response - usually, GETADDR responses should be considerably larger.

Regarding your scenario:
I think this is a very interesting scenario and probably worth addressing (V's address could still be part of a GETADDR response though, even if it not forwarded).
I don't see though how this PR helps with this even a little. On current master, fGetAddr is set in the VERSION processing with an outbound peer, and immediately unset as soon as we get a single ADDR with size < 1000 from the peer (and can never be set again for the duration of the connection) - while your scenario entails repeated messages.

@laanwj
Copy link
Member

laanwj commented Aug 27, 2020

Though I had an idea what it was for, I don't think I've ever thoroughly understood this flag. Thanks for the explanation and ACK on removing it.
ACK ef9855580098654cdfc5844f8922518e6602878f

@jnewbery
Copy link
Contributor

jnewbery commented Sep 1, 2020

ACK the code changes in ef9855580098654cdfc5844f8922518e6602878f

If you touch this again, could you line-wrap the body of your commit log at 80 chars? (also no need to credit me in the commit log - this is your work!)

@amitiuttarwar
Copy link
Contributor

strong concept ACK! thank you! that variable is a distraction.

@mzumsande
Copy link
Contributor Author

Rebased and reformatted the commit message as suggested by @jnewbery .

@naumenkogs
Copy link
Contributor

naumenkogs commented Sep 4, 2020

ACK d3eaaf4


Another behavior change, or rather an elaboration over "If the response to GETADDR is small (<=10 addresses), it will be relayed to peers just as any other ADDR message of that size."

  1. We just connected to the peer and send them a GETADDR
  2. Before getting GETADDR, they schedule their self-announcement for us by adding their addr to vAddrToSend
  3. Their ADDR sending is triggered (see pto->m_next_addr_send < current_time)
  4. We receive their self-announcement containing only 1 addr and keep fGetAddr=true set fGetAddr=false
  5. They finally receive our GETADDR and schedule their response (can be flipped with (4))

I think this order of events is unlikely (need much luck for (3) to happen before (5)) but possible.

In that case,

  • before this PR, we would not forward their self-announcement, because fGetAddr=true at the time we receive it (we update it right at the end of processing it)
  • after this PR, we would forward their self-announcement

I think this is actually an improvement. I don't see any reason to withhold an initial announcement of the peer instead of forwarding it to other nodes.

Even if this had any privacy benefits as a side-effect (e.g., somehow less revealing of new connections), it had a very low chance of working. Maybe this should be a real feature tho.

@jnewbery
Copy link
Contributor

jnewbery commented Sep 4, 2020

ACK d3eaaf4

@amitiuttarwar
Copy link
Contributor

ACK d3eaaf4

CI failures look unrelated.

@sdaftuar
Copy link
Member

sdaftuar commented Sep 4, 2020

since #5419, we do not get more than one ADDR message in response to GETADDR because vAddrToSend cannot have more than 1000 elements.

If I'm understanding you correctly, you're saying that we can't get more than one ADDR message in response because our own software wouldn't do it. But what about other software on the network?

At any rate, I think the reason we have this variable is to track whether an ADDR message is likely in response to a getaddr message we send, in which case we don't want to relay the learned addresses to other peers, or if an addr message is an unsolicited relay from our peer, in which case we also want to try to relay. (The reason to not relay messages with more than 10 entries is to rate limit the addr relay, so that a spammer can't use up all the network's bandwidth by blasting out addr messages.)

I think one downside to relaying getaddr responses is a privacy leak - if you started up with a small addrman, and learned only a small set of nodes to connect to from one peer (and relayed that whole thing to another peer), that other peer might have a pretty good idea of who you're going to be connecting to.

Anyway I'd be hesitant to remove this bool without talking through how we view this scenario, as it seems odd to me to relay addresses that are in response to a getaddr message, since we expect those to be of a very different nature than unsolicited addresses being relayed to us, which you expect to be originating with some peer announcing itself. (I wish the response to a getaddr was a different message type than an unsolicited address being relayed to us!)

@amitiuttarwar
Copy link
Contributor

@sdaftuar hmm interesting, I'm trying to evaluate this concern, do I have the scenario right:

  • Node A starts up, connects to Node B
  • When Node A receives the VERSION message, it sends a GETADDR message to Node B
  • Node B has an addrman with < 43 addresses, fulfills the request, sends back 9 addresses randomly selected from its addrman (based on 23 percent calculation)
  • Node A receives an ADDR message with 9 addresses, passes misc checks in ProcessMessage, invokes RelayAddress, which then sends it out to 1 or 2 nodes? (if I'm understanding the role of nRelayNodes correctly), including Node C

and then your concern is at this point, that Node C is able to deduce private information? the connection between Node A and Node B? or more generally what Node A's options for connections are?

@mzumsande
Copy link
Contributor Author

mzumsande commented Sep 5, 2020

I just realized that fGetAddr seems to be completely broken/ineffective after setting up a node with extra logging for fGetAddr on mainnet with current master:

The problem is that the first ADDR we receive in the lifetime of a typical connection is the initial self-announcement of our peer! After receiving this ADDR message of size 1, fGetAddr will be cleared - it wasn't active a single time by the time I received the actual GETADDR response from a peer. This is exactly what @naumenkogs described in his earlier comment (#19794 (comment)), however it is not rare because m_next_addr_send is 0 initially.

Therefore, what this flag currently seems to do is prevent initial self-announcements of outbound peers from being relayed to other peers.

I'll update the PR description with above info later this weekend and will also address @sdaftuar's comment - marking the PR as Draft until then.

@mzumsande mzumsande marked this pull request as draft September 5, 2020 07:02
@jnewbery
Copy link
Contributor

jnewbery commented Sep 5, 2020

I just realized that fGetAddr seems to be completely broken/ineffective after setting up a node with extra logging for fGetAddr on mainnet with current master...

Wow, good observation. So, this flag doesn't even do what it was introduced for. Strong ACK on improving the comments/logic here, but I also still think removing it entirely is fine. Our "don't relay getaddr responses" logic is enforced by the vAddr.size() <= 10 clause.

@mzumsande
Copy link
Contributor Author

mzumsande commented Sep 6, 2020

To elaborate on my last comment: The typical behavior seems to be that in the first SendMessages() loop after being successfully connected, our peer will self-advertise (because m_next_local_addr_send == 0 initially) and send us an ADDR without further delay  (because m_next_addr_send == 0 initially), sending the GETADDR response later. It probably depends on the timing of when the VERACK vs. GETADDR messages are received by our peer, but from my node logs, this is the order of events in almost all cases.

So I think that there are now two separate issues:
1.) Considering that this flag mainly prevents initial self-announcements from being gossip-relayed and seems to have little to no interaction with GETADDR: Should we embrace this unintended change in behavior and rename the flag to fDontRelayInitialSelfAnnouncements or similar? As @naumenkogs above, I don't really see a good reason to do this.

2.) Even though fGetAddr doesn't seem to work as intended, @sdaftuar's privacy concerns still poses the question if there should be some mechanism suppressing relay of GETADDR answers beyond the vAddr.size() <= 10 limit. I think that it should be an extremely rare event for our outbound peer to have an addrman with < 43 addresses. Even starting with an empty addrman and querying some DNS seeds should get us over that limit, which should be no issue after having completed the first GETADDR interaction with a typical outbound peer. So I am not sure how realistic the scenario is.

I think it might be more problematic for privacy sending GETADDR responses < 1000, because we communicate the size of our addrman (via 23% factor) and signal that we might be susceptive to an eclipse attack if our addrman is small.

I'll update the commit message and the PR description (will leave the originally assumed changes in behavior in for now, because they were quoted in this discussion).

`fGetAddr` is an old flag that indicates an ongoing GETADDR/ADDR interaction.
Introduced in dd51920, it prevents further
relay of received ADDR answers to other peers via the `RelayAddress()` gossip
mechanism while set - probably to prevent ADDR traffic from being dominated by
large GETADDR answers. 

It also used to be possible to receive a large GETADDR answer spread over
multiple ADDR messages (each bounded to max 1000 addresses), so the condition
`if (vAddr.size() < 1000) pfrom.fGetAddr = false;` was meant to ensure that
`fGetAddr` would only get cleared once the last ADDR of the package was
received (typically with < 1000 addresses).

However, `fGetAddr` currently does not work as originally intended and is no
longer useful:
- typically, the first ADDR we receive from a peer after sending the GETADDR
is not the GETADDR response but the self-announcement of our peer. If this is
the case, `fGetAddr` acts on it (preventing its relay) and is already
inactivated once the GETADDR response is received.
- 5cbf753 introduced the condition
`vAddr.size() <= 10`, so messages with more than 10 elements won't be relayed
further regardless.
- since bitcoin#5419, we do not get more than one ADDR message in response to GETADDR
because `vAddrToSend` cannot have more than 1000 elements.

Removing the flag will cause initial self-announcements from outbound peers to
be gossip-relayed.
@mzumsande mzumsande marked this pull request as ready for review September 6, 2020 22:16
@jnewbery
Copy link
Contributor

jnewbery commented Sep 7, 2020

ACK 628812a

@naumenkogs
Copy link
Contributor

naumenkogs commented Sep 8, 2020

So there are 2 issues pointed out by sdaftuar, I'd consider them separately:

  1. Privacy of a Bitcoin Core node with few AddrMan records making a new connection to a Bitcoin Core node.
    I agree with the concern, but according to mzumsande experiments above, the fGetAddr flag is ineffective here, so we already have this problem. The problem probably should be addressed in a different PR. If we can't come up with something better, we always can expand ADDR message to say it's unsolicited (not GETADDR response), I don't see any (privacy) issues with it.

  2. Some other software on the network. The right thing here would be to consider all more or less reasonable scenarios of the communication between our Bitcoin Core node and other software. For example, if that software don't self-advertise at all, it will actually get the privacy benefit described in (1), because fGetAddr flag won't be cleared on initial self-ad. Now, maybe that software relied on this privacy benefit? (I doubt though).

I'm leaning towards reACK, but before that I'd rather us make some efforts to make sure (2) is not a concern (maybe an ML post is sufficient).

@naumenkogs
Copy link
Contributor

Actually, fGetAddr as currently working (set to false after initial peer's self-ad) are protecting the initial self-ad from the privacy leak I've pointed out here. But I'd repeat that this feature should be made explicit in a follow-up rather than a random side-effect.

@jnewbery
Copy link
Contributor

jnewbery commented Sep 8, 2020

@naumenkogs , I don't understand this:

  1. We receive their self-announcement containing only 1 addr and keep fGetAddr=true

Why would fGetAddr stay as true? I think it's flipped to false in this scenario.

@naumenkogs
Copy link
Contributor

@jnewbery this was a mistake (see I updated it), but the example remains valid.

@sdaftuar
Copy link
Member

Rather than remove the fGetAddr flag, what do you think about improving our behavior instead on the other end, eg by my suggestion here to delay the initial addr message we send out to inbound peers in order to mitigate the ambiguity between the response to an initial getaddr and the advertising of the peer's own address?

@naumenkogs
Copy link
Contributor

@sdaftuar

Rather than remove the fGetAddr flag, what do you think about improving our behavior instead on the other end, eg by my suggestion here to delay the initial addr message we send out to inbound peers in order to mitigate the ambiguity between the response to an initial getaddr and the advertising of the peer's own address?

I'm not 100% persuaded that fGetAddr better stays (the only valid justification I see at the moment is potentially breaking other software?), but I think this idea would make the flag work as intended.

@sdaftuar
Copy link
Member

I'm not 100% persuaded that fGetAddr better stays (the only valid justification I see at the moment is potentially breaking other software?), but I think this idea would make the flag work as intended.

It seems like it should be a reasonably good guess that if you send someone a getaddr, the next addr you get is probably in response. However, since we send out a getaddr to outbound peers on connection, and because we send out an addr for ourselves to inbound peers, also usually on connection, we've created a natural race condition between nodes running our same software.

Fixing that race condition (usually) for our own software is pretty easy; and that then reduces the question posed by this PR to "is it important to keep track of when an addr message is in response to a getaddr, or not?" And I tend to think that the answer is probably yes; although addr relay is not something specified in any BIPs, it makes sense to me that responses to a getaddr are of a different quality than general addr-relay, so we might as well try to detect those situations when reasonably possible and relay differently.

As for what other software might exist: in addition to other full-node implementations out there, I think the dns seednodes can be used as addr-fetch peers by Bitcoin Core, in some situations? I haven't tried to exercise it carefully, but my understanding is that the seednodes are custom software, which potentially differ from each other and from Bitcoin Core.

@mzumsande
Copy link
Contributor Author

Rather than remove the fGetAddr flag, what do you think about improving our behavior instead on the other end.

In this case, I think that we should change the if (vAddr.size() < 1000) pfrom.fGetAddr = false; condition which, in the typical case of receiving a single GETADDR answer of size 1000, prevents us from relaying the next ADDR message of our peer, even though this one would not be related to the GETADDR interaction at all.

I think of fGetAddr as an heuristic machanism meant to cover the typical case (obviously it does not help at all with adversarial situations), so I think it is justified to adapt it, if the typical case has changed a long time ago - especially if we have a second mechanism (size limit <=10) preventing GETADDR answer relay, which, even though not direct, seems less fragile in many respects compared to a timing-based one.

@sdaftuar
Copy link
Member

In this case, I think that we should change the if (vAddr.size() < 1000) pfrom.fGetAddr = false; condition which, in the typical case of receiving a single GETADDR answer of size 1000, prevents us from relaying the next ADDR message of our peer, even though this one would not be related to the GETADDR interaction at all.

Yes, I agree that makes sense and better matches what we typically are seeing on the network.

@mzumsande
Copy link
Contributor Author

Ok - I will open a PR in the next days for that. Will close this one and link it from the new one, because it would become a bit confusing with another change in the PR description and discussion referring to old versions.

@mzumsande mzumsande closed this Sep 15, 2020
@sipa
Copy link
Member

sipa commented Sep 15, 2020

There is some more discussion here, BTW: bitcoin/bips#907

@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
@mzumsande mzumsande deleted the 202008_rm_fgetaddr branch October 13, 2023 15:19
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants