Skip to content

ETCM-446: Connection limit ranges#833

Merged
aakoshh merged 21 commits intodevelopfrom
ETCM-446-connection-limit-ranges
Dec 15, 2020
Merged

ETCM-446: Connection limit ranges#833
aakoshh merged 21 commits intodevelopfrom
ETCM-446-connection-limit-ranges

Conversation

@aakoshh
Copy link
Copy Markdown
Contributor

@aakoshh aakoshh commented Dec 4, 2020

Description

Currently the node will try to connect to nodes until their max-outgoing-peers is reached and limit incoming connection to max-incoming-peers. The problem is once incoming peers are saturated over the network it's difficult to find nodes to connect to, and this situation is likely to arise if outgoing >= incoming is configured.

The PR changes the connection strategy to be based on acceptable ranges:

  • min-outgoing-peers: The node is not seeking new connections if the outgoing handshaked connection count is above this number. Otherwise it tries to open up to max-outgoing-peers connections.
  • prune-incoming-peers: A number of incoming peers to try to prune if the incoming handshaked peer count hit the maximum and we are rejecting connections with TooManyPeers.
  • min-prune-age: A duration that a peer is minimally allowed to be connected before being selected for random pruning, and also a minimum amount of time that needs to pass between pruning attempts, to avoid hostile takeovers.

Proposed Solution

There are two ideas:

  1. We can try to aggressively connect to a lot of nodes in the beginning, expecting that many of them won't accept our connection, but once we have enough peers, we can stop at a lower level, only opening more if we fall below that.
  2. We can close down incoming connections if the overall number goes beyond a limit, so that new nodes can be accepted into the network instead. This also compensates to the aggressive over-connecting in the beginning.

The new default settings target at a 40/60 ratio for min-outgoing / max-incoming, to protect against eclipse attacks but also maintain that total(max-incoming) >= total(min-outgoing) on the the network level. max-outgoing is high to allow quickly trying to open many connections after startup, anticipating that most nodes will not be compatible or receptive.

Currently the pruning only kicks in when a new incoming connection would bring the number above the maximum. We could also prune periodically whenever the number equals the max. That would allow new joiners to be accepted immediately rather than when they try to re-connect after a short blacklisting.

To decide which connections to prune we take the average number of received responses over the lifetime of the peers, with higher being better. The statistics are collected for the last 12 hours, which is the same time as the bonding duration in discovery.

@aakoshh aakoshh force-pushed the ETCM-446-connection-limit-ranges branch 5 times, most recently from cf34f4a to 5ed1481 Compare December 4, 2020 15:31
@aakoshh aakoshh force-pushed the ETCM-446-connection-limit-ranges branch from 989f989 to e8e0ee6 Compare December 4, 2020 17:13
@aakoshh aakoshh marked this pull request as ready for review December 7, 2020 15:40
Comment thread src/main/resources/application.conf Outdated
Comment thread src/main/scala/io/iohk/ethereum/network/ConnectedPeers.scala Outdated
Comment thread src/main/scala/io/iohk/ethereum/network/ConnectedPeers.scala Outdated
}

def prunePeers(
incoming: Boolean,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this flag is necessary ? Maybe i am missing some thing, but in call sites we are passing here true

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory we could prune outgoing connections as well, the communication is bi-directional on them, maybe some of them would be better replaced with other nodes, perhaps if they are slow. But in this PR it's just about the incoming ones, to free incoming slots.

private val outgoingPendingPeers: Map[PeerId, Peer],
private val handshakedPeers: Map[PeerId, Peer]
private val handshakedPeers: Map[PeerId, Peer],
private val pruningPeers: Map[PeerId, Peer],
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to make sure, this is necessary to not prune peer twice while waiting for its proper disconnect ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, not to prune the same peer twice and also to remember the overall number of peers we are trying to prune, so that if the routine is ran again, it will know that it doesn't have to prune yet another 10 random nodes.

@ntallar ntallar removed their request for review December 14, 2020 18:25
Comment thread src/main/scala/io/iohk/ethereum/network/PeerManagerActor.scala Outdated
Comment thread src/main/scala/io/iohk/ethereum/network/TimeSlotStats.scala
@aakoshh aakoshh requested a review from enriquerodbe December 15, 2020 12:24
@aakoshh aakoshh assigned KonradStaniec and unassigned aakoshh Dec 15, 2020
@aakoshh aakoshh merged commit b368808 into develop Dec 15, 2020
@aakoshh aakoshh deleted the ETCM-446-connection-limit-ranges branch December 15, 2020 17:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants